Showing 1–50 of 57 results for author: Kim, G

Searching in archive eess.
  1. arXiv:2407.13564  [pdf, other]

    math.OC eess.SY

    Convergence result for the gradient-push algorithm and its application to boost up the Push-DIging algorithm

    Authors: Hyogi Choi, Woocheol Choi, Gwangil Kim

    Abstract: The gradient-push algorithm is a fundamental algorithm for the distributed optimization problem \begin{equation} \min_{x \in \mathbb{R}^d} f(x) = \sum_{j=1}^n f_j (x), \end{equation} where each local cost $f_i$ is only known to agent $a_i$ for $1 \leq i \leq n$ and the agents are connected by a directed graph. In this paper, we obtain convergence results for the gradient-push algorithm with consta…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.11365  [pdf, other]

    eess.AS

    Team HYU ASML ROBOVOX SP Cup 2024 System Description

    Authors: Jeong-Hwan Choi, Gaeun Kim, Hee-Jae Lee, Seyun Ahn, Hyun-Soo Kim, Joon-Hyuk Chang

    Abstract: This report describes the submission of HYU ASML team to the IEEE Signal Processing Cup 2024 (SP Cup 2024). This challenge, titled "ROBOVOX: Far-Field Speaker Recognition by a Mobile Robot," focuses on speaker recognition using a mobile robot in noisy and reverberant conditions. Our solution combines the result of deep residual neural networks and time-delay neural network-based speaker embedding…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Technical report for IEEE Signal Processing Cup 2024, 9 pages

  3. arXiv:2406.16994  [pdf, other]

    eess.SP cs.AI

    Quantum Multi-Agent Reinforcement Learning for Cooperative Mobile Access in Space-Air-Ground Integrated Networks

    Authors: Gyu Seon Kim, Yeryeong Cho, Jaehyun Chung, Soohyun Park, Soyi Jung, Zhu Han, Joongheon Kim

    Abstract: Achieving global space-air-ground integrated network (SAGIN) access only with CubeSats presents significant challenges such as the access sustainability limitations in specific regions (e.g., polar regions) and the energy efficiency limitations in CubeSats. To tackle these problems, high-altitude long-endurance unmanned aerial vehicles (HALE-UAVs) can complement these CubeSat shortcomings for prov…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 17 pages, 22 figures

  4. arXiv:2406.05270  [pdf]

    physics.med-ph cs.CV cs.LG eess.IV

    fastMRI Breast: A publicly available radial k-space dataset of breast dynamic contrast-enhanced MRI

    Authors: Eddy Solomon, Patricia M. Johnson, Zhengguo Tan, Radhika Tibrewala, Yvonne W. Lui, Florian Knoll, Linda Moy, Sungheon Gene Kim, Laura Heacock

    Abstract: This data curation work introduces the first large-scale dataset of radial k-space and DICOM data for breast DCE-MRI acquired in diagnostic breast MRI exams. Our dataset includes case-level labels indicating patient age, menopause status, lesion status (negative, benign, and malignant), and lesion type for each case. The public availability of this dataset and accompanying reconstruction code will…

    Submitted 7 June, 2024; originally announced June 2024.

  5. arXiv:2406.02562  [pdf, other]

    eess.AS cs.AI cs.CL

    Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices

    Authors: Gwantae Kim, Bokyeung Lee, Donghyeon Kim, Hanseok Ko

    Abstract: In recent times, there has been a growing interest in utilizing personalized large models on low-spec devices, such as mobile and CPU-only devices. However, utilizing a personalized large model in the on-device is inefficient, and sometimes limited due to computational cost. To tackle the problem, this paper presents the weights separation method to minimize on-device model weights using parameter…

    Submitted 23 April, 2024; originally announced June 2024.

    Comments: Table 2 is revised

    Journal ref: ICASSP 2024 Workshop(HSCMA 2024) paper

  6. arXiv:2405.19380  [pdf, other]

    stat.ML cs.LG eess.SY

    Approximate Thompson Sampling for Learning Linear Quadratic Regulators with $O(\sqrt{T})$ Regret

    Authors: Yeoneung Kim, Gihun Kim, Insoon Yang

    Abstract: We propose an approximate Thompson sampling algorithm that learns linear quadratic regulators (LQR) with an improved Bayesian regret bound of $O(\sqrt{T})$. Our method leverages Langevin dynamics with a meticulously designed preconditioner as well as a simple excitation mechanism. We show that the excitation signal induces the minimum eigenvalue of the preconditioner to grow over time, thereby acc…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 61 pages, 6 figures

  7. arXiv:2405.13762  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

    Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the a…

    Submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2405.11807  [pdf, other]

    cs.HC cs.RO eess.SY

    Dual-sided Peltier Elements for Rapid Thermal Feedback in Wearables

    Authors: Seongjun Kang, Gwangbin Kim, Seokhyun Hwang, Jeongju Park, Ahmed Elsharkawy, SeungJun Kim

    Abstract: This paper introduces a motor-driven Peltier device designed to deliver immediate thermal sensations within extended reality (XR) environments. The system incorporates eight motor-driven Peltier elements, facilitating swift transitions between warm and cool sensations by rotating preheated or cooled elements to opposite sides. A multi-layer structure, comprising aluminum and silicone layers, ensur…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 3 pages, 4 figures, ICRA Wearable Workshop 2024 - 1st Workshop on Advancing Wearable Devices and Applications through Novel Design, Sensing, Actuation, and AI

  9. arXiv:2401.08962  [pdf, other]

    cs.HC cs.LG cs.SD eess.AS

    DOO-RE: A dataset of ambient sensors in a meeting room for activity recognition

    Authors: Hyunju Kim, Geon Kim, Taehoon Lee, Kisoo Kim, Dongman Lee

    Abstract: With the advancement of IoT technology, recognizing user activities with machine learning methods is a promising way to provide various smart services to users. High-quality data with privacy protection is essential for deploying such services in the real world. Data streams from surrounding ambient sensors are well suited to the requirement. Existing ambient sensor datasets only support constrain…

    Submitted 16 January, 2024; originally announced January 2024.

  10. arXiv:2312.13313  [pdf, other]

    eess.IV cs.CV

    ParamISP: Learned Forward and Inverse ISPs using Camera Parameters

    Authors: Woohyeok Kim, Geonu Kim, Junyong Lee, Seungyong Lee, Seung-Hwan Baek, Sunghyun Cho

    Abstract: RAW images are rarely shared mainly due to its excessive data size compared to their sRGB counterparts obtained by camera ISPs. Learning the forward and inverse processes of camera ISPs has been recently demonstrated, enabling physically-meaningful RAW-level image processing on input sRGB images. However, existing learning-based ISP methods fail to handle the large variations in the ISP processes…

    Submitted 14 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  11. arXiv:2312.05465  [pdf, other]

    cs.LG eess.SY

    On Task-Relevant Loss Functions in Meta-Reinforcement Learning and Online LQR

    Authors: Jaeuk Shin, Giho Kim, Howon Lee, Joonho Han, Insoon Yang

    Abstract: Designing a competent meta-reinforcement learning (meta-RL) algorithm in terms of data usage remains a central challenge to be tackled for its successful real-world applications. In this paper, we propose a sample-efficient meta-RL algorithm that learns a model of the system or environment at hand in a task-directed manner. As opposed to the standard model-based approaches to meta-RL, our method e…

    Submitted 8 December, 2023; originally announced December 2023.

  12. arXiv:2310.12574  [pdf]

    eess.IV cs.CV

    A reproducible 3D convolutional neural network with dual attention module (3D-DAM) for Alzheimer's disease classification

    Authors: Gia Minh Hoang, Youngjoo Lee, Jae Gwan Kim

    Abstract: Alzheimer's disease is one of the most common types of neurodegenerative disease, characterized by the accumulation of amyloid-beta plaque and tau tangles. Recently, deep learning approaches have shown promise in Alzheimer's disease diagnosis. In this study, we propose a reproducible model that utilizes a 3D convolutional neural network with a dual attention module for Alzheimer's disease classifi…

    Submitted 2 July, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  13. arXiv:2307.07409  [pdf, other]

    cs.CL cs.AI eess.IV

    KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization

    Authors: Gangwoo Kim, Hajung Kim, Lei Ji, Seongsu Bae, Chanhwi Kim, Mujeen Sung, Hyunjae Kim, Kun Yan, Eric Chang, Jaewoo Kang

    Abstract: In this paper, we introduce CheXOFA, a new pre-trained vision-language model (VLM) for the chest X-ray domain. Our model is initially pre-trained on various multimodal datasets within the general domain before being transferred to the chest X-ray domain. Following a prominent VLM, we unify various domain-specific tasks into a simple sequence-to-sequence schema. It enables the model to effectively…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Published at BioNLP workshop @ ACL 2023

  14. arXiv:2306.13361  [pdf, other]

    physics.optics cs.CV eess.IV

    Neural 360$^\circ$ Structured Light with Learned Metasurfaces

    Authors: Eunsue Choi, Gyeongtae Kim, Jooyeong Yun, Yujin Jeon, Junsuk Rho, Seung-Hwan Baek

    Abstract: Structured light has proven instrumental in 3D imaging, LiDAR, and holographic light projection. Metasurfaces, comprised of sub-wavelength-sized nanostructures, facilitate 180$^\circ$ field-of-view (FoV) structured light, circumventing the restricted FoV inherent in traditional optics like diffractive optical elements. However, extant metasurface-facilitated structured light exhibits sub-optimal p…

    Submitted 27 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  15. arXiv:2306.04137  [pdf, other]

    cs.MA eess.SY

    Multi-Agent Reinforcement Learning for Cooperative Air Transportation Services in City-Wide Autonomous Urban Air Mobility

    Authors: Chanyoung Park, Gyu Seon Kim, Soohyun Park, Soyi Jung, Joongheon Kim

    Abstract: The development of urban-air-mobility (UAM) is rapidly progressing with spurs, and the demand for efficient transportation management systems is a rising need due to the multifaceted environmental uncertainties. Thus, this paper proposes a novel air transportation service management algorithm based on multi-agent deep reinforcement learning (MADRL) to address the challenges of multi-UAM cooperatio…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 15 pages, 14 figures

  16. arXiv:2306.00680  [pdf, other]

    cs.SD cs.AI eess.AS

    Encoder-decoder multimodal speaker change detection

    Authors: Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

    Abstract: The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance. Recently, multimodal SCD (MMSCD) models, which utilise text modality in addition to audio, have shown improved performance. In this study, the proposed model are bui…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 5 pages, accepted for presentation at INTERSPEECH 2023

  17. arXiv:2210.08997  [pdf, other]

    cs.CV cs.LG eess.IV

    AIM 2022 Challenge on Instagram Filter Removal: Methods and Results

    Authors: Furkan Kınlı, Sami Menteş, Barış Özcan, Furkan Kıraç, Radu Timofte, Yi Zuo, Zitao Wang, Xiaowen Zhang, Yu Zhu, Chenghua Li, Cong Leng, Jian Cheng, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Tianzhi Ma, Zihan Gao, Wenxin He, Woon-Ha Yeo, Wang-Taek Oh, Young-Il Kim, Han-Cheol Ryu, Gang He, et al. (8 additional authors not shown)

    Abstract: This paper introduces the methods and the results of AIM 2022 challenge on Instagram Filter Removal. Social media filters transform the images by consecutive non-linear operations, and the feature maps of the original content may be interpolated into a different domain. This reduces the overall performance of the recent deep learning strategies. The main goal of this challenge is to produce realis…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: 14 pages, 9 figures, Challenge report of AIM 2022 Instagram Filter Removal Challenge in conjunction with ECCV 2022

  18. arXiv:2207.01520  [pdf, other]

    eess.IV cs.CV

    Adaptive GLCM sampling for transformer-based COVID-19 detection on CT

    Authors: Okchul Jung, Dong Un Kang, Gwanghyun Kim, Se Young Chun

    Abstract: The world has suffered from COVID-19 (SARS-CoV-2) for the last two years, causing much damage and change in people's daily lives. Thus, automated detection of COVID-19 utilizing deep learning on chest computed tomography (CT) scans became promising, which helps correct diagnosis efficiently. Recently, transformer-based COVID-19 detection method on CT is proposed to utilize 3D information in CT vol…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: 6 pages

  19. arXiv:2205.12633  [pdf, other]

    cs.CV eess.IV

    NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

    Authors: Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, et al. (68 additional authors not shown)

    Abstract: This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR)…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: CVPR Workshops 2022. 15 pages, 21 figures, 2 tables

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022

  20. arXiv:2205.01304  [pdf, other]

    eess.AS cs.SD

    Efficient dynamic filter for robust and low computational feature extraction

    Authors: Donghyeon Kim, Gwantae Kim, Bokyeung Lee, Jeong-gi Kwak, David K. Han, Hanseok Ko

    Abstract: Unseen noise signal which is not considered in a model training process is difficult to anticipate and would lead to performance degradation. Various methods have been investigated to mitigate unseen noise. In our previous work, an Instance-level Dynamic Filter (IDF) and a Pixel Dynamic Filter (PDF) were proposed to extract noise-robust features. However, the performance of the dynamic filter migh…

    Submitted 20 October, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to SLT2022

  21. Fetal Brain Tissue Annotation and Segmentation Challenge Results

    Authors: Kelly Payette, Hongwei Li, Priscille de Dumast, Roxane Licandro, Hui Ji, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Hao Liu, Yuchen Pei, Lisheng Wang, Ying Peng, Juanying Xie, Huiquan Zhang, Guiming Dong, Hao Fu, Guotai Wang, ZunHyan Rieu, Donghyeon Kim, Hyun Gi Kim, Davood Karimi, Ali Gholipour, Helena R. Torres, Bruno Oliveira, João L. Vilaça, et al. (33 additional authors not shown)

    Abstract: In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variabili…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: Results from FeTA Challenge 2021, held at MICCAI; Manuscript submitted

  22. arXiv:2202.06431  [pdf, other]

    eess.IV cs.CV cs.LG

    AI can evolve without labels: self-evolving vision transformer for chest X-ray diagnosis through knowledge distillation

    Authors: Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Chang Min Park, Jong Chul Ye

    Abstract: Although deep learning-based computer-aided diagnosis systems have recently achieved expert-level performance, developing a robust deep learning model requires large, high-quality data with manual annotation, which is expensive to obtain. This situation poses the problem that the chest x-rays collected annually in hospitals cannot be used due to the lack of manual labeling by experts, especially i…

    Submitted 13 February, 2022; originally announced February 2022.

    Comments: 24 pages

  23. arXiv:2201.06735  [pdf]

    eess.SP

    AI Augmented Digital Metal Component

    Authors: Eunhyeok Seo, Hyokyung Sung, Hayeol Kim, Taekyeong Kim, Sangeun Park, Min Sik Lee, Seung Ki Moon, Jung Gi Kim, Hayoung Chung, Seong-Kyum Choi, Ji-hun Yu, Kyung Tae Kim, Seong Jin Park, Namhun Kim, Im Doo Jung

    Abstract: The aim of this work is to propose a new paradigm that imparts intelligence to metal parts with the fusion of metal additive manufacturing and artificial intelligence (AI). Our digital metal part classifies the status with real time data processing with convolutional neural network (CNN). The training data for the CNN is collected from a strain gauge embedded in metal parts by laser powder bed fus…

    Submitted 17 January, 2022; originally announced January 2022.

    Comments: 46 pages

  24. CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation

    Authors: Reuben Dorent, Aaron Kujawa, Marina Ivory, Spyridon Bakas, Nicola Rieke, Samuel Joutard, Ben Glocker, Jorge Cardoso, Marc Modat, Kayhan Batmanghelich, Arseniy Belkov, Maria Baldeon Calisto, Jae Won Choi, Benoit M. Dawant, Hexin Dong, Sergio Escalera, Yubo Fan, Lasse Hansen, Mattias P. Heinrich, Smriti Joshi, Victoriya Kashtanova, Hyeon Gyu Kim, Satoshi Kondo, Christian N. Kruse, Susana K. Lai-Yuen, et al. (15 additional authors not shown)

    Abstract: Domain Adaptation (DA) has recently raised strong interests in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality…

    Submitted 14 December, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

    Comments: In Medical Image Analysis

  25. arXiv:2111.04028  [pdf, other]

    cs.CV eess.IV

    Style Transfer with Target Feature Palette and Attention Coloring

    Authors: Suhyeon Ha, Guisik Kim, Junseok Kwon

    Abstract: Style transfer has attracted a lot of attentions, as it can change a given image into one with splendid artistic styles while preserving the image structure. However, conventional approaches easily lose image details and tend to produce unpleasant artifacts during style transfer. In this paper, to solve these problems, a novel artistic stylization method with target feature palettes is proposed, w…

    Submitted 7 November, 2021; originally announced November 2021.

  26. arXiv:2111.01338  [pdf, other]

    eess.IV cs.AI cs.CV

    Federated Split Vision Transformer for COVID-19 CXR Diagnosis using Task-Agnostic Training

    Authors: Sangjoon Park, Gwanghyun Kim, Jeongsol Kim, Boah Kim, Jong Chul Ye

    Abstract: Federated learning, which shares the weights of the neural network across clients, is gaining attention in the healthcare sector as it enables training on a large corpus of decentralized data while maintaining data privacy. For example, this enables neural network training for COVID-19 diagnosis on chest X-ray (CXR) images without collecting patient CXR data across multiple hospitals. Unfortunatel…

    Submitted 3 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted for NeurIPS 2021

  27. arXiv:2110.03326  [pdf, other]

    cs.CL cs.SD eess.AS

    Back from the future: bidirectional CTC decoding using future information in speech recognition

    Authors: Namkyu Jung, Geonmin Kim, Han-Gyu Kim

    Abstract: In this paper, we propose a simple but effective method to decode the output of Connectionist Temporal Classifier (CTC) model using a bi-directional neural language model. The bidirectional language model uses the future as well as the past information in order to predict the next output in the sequence. The proposed method based on bi-directional beam search takes advantage of the CTC greedy deco…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: submitted to ICASSP 2022

  28. arXiv:2110.02791  [pdf, other]

    cs.SD cs.CL eess.AS

    Spell my name: keyword boosted speech recognition

    Authors: Namkyu Jung, Geonmin Kim, Joon Son Chung

    Abstract: Recognition of uncommon words such as names and technical terminology is important to understanding conversations in context. However, the ability to recognise such words remains a challenge in modern automatic speech recognition (ASR) systems. In this paper, we propose a simple but powerful ASR decoding method that can better recognise these uncommon keywords, which in turn enables better reada…

    Submitted 6 October, 2021; originally announced October 2021.

  29. arXiv:2109.09041  [pdf, other]

    cs.RO eess.SY

    Online Distributed Trajectory Planning for Quadrotor Swarm with Feasibility Guarantee using Linear Safe Corridor

    Authors: Jungwon Park, Dabin Kim, Gyeong Chan Kim, Dahyun Oh, H. Jin Kim

    Abstract: This paper presents a new online multi-agent trajectory planning algorithm that guarantees to generate safe, dynamically feasible trajectories in a cluttered environment. The proposed algorithm utilizes a linear safe corridor (LSC) to formulate the distributed trajectory optimization problem with only feasible constraints, so it does not resort to slack variables or soft constraints to avoid optim…

    Submitted 3 January, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: 8 pages, RA-L 2022 under review

  30. Infusing model predictive control into meta-reinforcement learning for mobile robots in dynamic environments

    Authors: Jaeuk Shin, Astghik Hakobyan, Mingyu Park, Yeoneung Kim, Gihun Kim, Insoon Yang

    Abstract: The successful operation of mobile robots requires them to adapt rapidly to environmental changes. To develop an adaptive decision-making tool for mobile robots, we propose a novel algorithm that combines meta-reinforcement learning (meta-RL) with model predictive control (MPC). Our method employs an off-policy meta-RL algorithm as a baseline to train a policy using transition samples generated by…

    Submitted 7 July, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

    Comments: Accepted for publication in the IEEE Robotics and Automation Letters

    Journal ref: IEEE Robotics and Automation Letters, 2022

  31. arXiv:2104.07235  [pdf, other]

    eess.IV cs.CV cs.LG

    Vision Transformer using Low-level Chest X-ray Feature Corpus for COVID-19 Diagnosis and Severity Quantification

    Authors: Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye

    Abstract: Developing a robust algorithm to diagnose and quantify the severity of COVID-19 using Chest X-ray (CXR) requires a large number of well-curated COVID-19 datasets, which is difficult to collect under the global COVID-19 pandemic. On the other hand, CXR data with other findings are abundant. This situation is ideally suited for the Vision Transformer (ViT) architecture, where a lot of unlabeled data…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: 13 pages

  32. arXiv:2104.06782  [pdf, other]

    cs.CV eess.IV

    Visual Comfort Aware-Reinforcement Learning for Depth Adjustment of Stereoscopic 3D Images

    Authors: Hak Gu Kim, Minho Park, Sangmin Lee, Seongyeop Kim, Yong Man Ro

    Abstract: Depth adjustment aims to enhance the visual experience of stereoscopic 3D (S3D) images, which accompanied with improving visual comfort and depth perception. For a human expert, the depth adjustment procedure is a sequence of iterative decision making. The human expert iteratively adjusts the depth until he is satisfied with the both levels of visual comfort and the perceived depth. In this work,…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: AAAI 2021

  33. arXiv:2104.06780  [pdf, other]

    cs.CV eess.IV

    Towards a Better Understanding of VR Sickness: Physical Symptom Prediction for VR Contents

    Authors: Hak Gu Kim, Sangmin Lee, Seongyeop Kim, Heoun-taek Lim, Yong Man Ro

    Abstract: We address the black-box issue of VR sickness assessment (VRSA) by evaluating the level of physical symptoms of VR sickness. For the VR contents inducing the similar VR sickness level, the physical symptoms can vary depending on the characteristics of the contents. Most of existing VRSA methods focused on assessing the overall VR sickness score. To make better understanding of VR sickness, it is r…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: AAAI 2021

  34. arXiv:2103.09022  [pdf, other]

    eess.IV cs.CV cs.LG

    Missing Cone Artifacts Removal in ODT using Unsupervised Deep Learning in Projection Domain

    Authors: Hyungjin Chung, Jaeyoung Huh, Geon Kim, Yong Keun Park, Jong Chul Ye

    Abstract: Optical diffraction tomography (ODT) produces three dimensional distribution of refractive index (RI) by measuring scattering fields at various angles. Although the distribution of RI index is highly informative, due to the missing cone problem stemming from the limited-angle acquisition of holograms, reconstructions have very poor resolution along axial direction compared to the horizontal imagin…

    Submitted 18 July, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: This will appear in IEEE Trans. on Computational Imaging

  35. arXiv:2103.07062  [pdf, other]

    eess.IV cs.CV cs.LG

    Severity Quantification and Lesion Localization of COVID-19 on CXR using Vision Transformer

    Authors: Gwanghyun Kim, Sangjoon Park, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye

    Abstract: Under the global pandemic of COVID-19, building an automated framework that quantifies the severity of COVID-19 and localizes the relevant lesion on chest X-ray images has become increasingly important. Although pixel-level lesion severity labels, e.g. lesion segmentation, can be the most excellent target to build a robust model, collecting enough data with such labels is difficult due to time and…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: 8 pages

  36. arXiv:2103.07055  [pdf, other]

    eess.IV cs.CV cs.LG

    Vision Transformer for COVID-19 CXR Diagnosis using Chest X-ray Feature Corpus

    Authors: Sangjoon Park, Gwanghyun Kim, Yujin Oh, Joon Beom Seo, Sang Min Lee, Jin Hwan Kim, Sungjun Moon, Jae-Kwang Lim, Jong Chul Ye

    Abstract: Under the global COVID-19 crisis, developing robust diagnosis algorithm for COVID-19 using CXR is hampered by the lack of the well-curated COVID-19 data set, although CXR data with other disease are abundant. This situation is suitable for vision transformer architecture that can exploit the abundant unlabeled data using pre-training. However, the direct use of existing vision transformer that use…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: 10 pages

  37. arXiv:2102.08567  [pdf]

    cs.CV cs.AI eess.IV

    Ensemble Transfer Learning of Elastography and B-mode Breast Ultrasound Images

    Authors: Sampa Misra, Seungwan Jeon, Ravi Managuli, Seiyon Lee, Gyuwon Kim, Seungchul Lee, Richard G Barr, Chulhong Kim

    Abstract: Computer-aided detection (CAD) of benign and malignant breast lesions becomes increasingly essential in breast ultrasound (US) imaging. The CAD systems rely on imaging features identified by the medical experts for their performance, whereas deep learning (DL) methods automatically extract features from the data. The challenge of the DL is the insufficiency of breast US images available to train t…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: 17 pages, 10 figures, 6 Tables

  38. arXiv:2010.13105  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding

    Authors: Seongbin Kim, Gyuwan Kim, Seongjin Shin, Sangmin Lee

    Abstract: End-to-end approaches open a new way for more accurate and efficient spoken language understanding (SLU) systems by alleviating the drawbacks of traditional pipeline systems. Previous works exploit textual information for an SLU model via pre-training with automatic speech recognition or fine-tuning with knowledge distillation. To utilize textual information more effectively, this work proposes a…

    Submitted 10 June, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

    Comments: ICASSP 2021; 5 pages, 1 figure

  39. arXiv:2010.01721  [pdf]

    eess.IV

    Motion Correction of 3D Dynamic Contrast-Enhanced Ultrasound Imaging without Anatomical Bmode Images

    Authors: Jia-Shu Chen, Maged Goubran Ph. D., Gaeun Kim, Jurgen K. Willmann M. D., Michael Zeineh M. D., Ph. D., Dimitre Hristov Ph. D., Ahmed El Kaffas Ph. D

    Abstract: In conventional 2D DCE-US, motion correction algorithms take advantage of accompanying side-by-side anatomical Bmode images that contain time-stable features. However, current commercial models of 3D DCE-US do not provide side-by-side Bmode images, which makes motion correction challenging. This work introduces a novel motion correction (MC) algorithm for 3D DCE-US and assesses its efficacy when h…

    Submitted 4 October, 2020; originally announced October 2020.

  40. arXiv:2009.13777  [pdf]

    eess.IV physics.bio-ph physics.optics

    DeepRegularizer: Rapid Resolution Enhancement of Tomographic Imaging using Deep Learning

    Authors: DongHun Ryu, Dongmin Ryu, YoonSeok Baek, Hyungjoo Cho, Geon Kim, Young Seo Kim, Yongki Lee, Yoosik Kim, Jong Chul Ye, Hyun-Seok Min, YongKeun Park

    Abstract: Optical diffraction tomography measures the three-dimensional refractive index map of a specimen and visualizes biochemical phenomena at the nanoscale in a non-destructive manner. One major drawback of optical diffraction tomography is poor axial resolution due to limited access to the three-dimensional optical transfer function. This missing cone problem has been addressed through regularization…

    Submitted 29 September, 2020; originally announced September 2020.

  41. arXiv:2009.09282  [pdf, other]

    eess.IV cs.CV cs.LG

    Reducing false-positive biopsies with deep neural networks that utilize local and global information in screening mammograms

    Authors: Nan Wu, Zhe Huang, Yiqiu Shen, Jungkyu Park, Jason Phang, Taro Makino, S. Gene Kim, Kyunghyun Cho, Laura Heacock, Linda Moy, Krzysztof J. Geras

    Abstract: Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost. It is crucial to reduce the rate of biopsies that turn out to be benign tissue. In this study, we build deep neural networks (DNNs) to classify biopsied lesions as being either malignant or benign, with the goal of using these networks as second reader…

    Submitted 19 September, 2020; originally announced September 2020.

  42. arXiv:2008.12493  [pdf, other]

    eess.IV cs.CV

    DALE : Dark Region-Aware Low-light Image Enhancement

    Authors: Dokyeong Kwon, Guisik Kim, Junseok Kwon

    Abstract: In this paper, we present a novel low-light image enhancement method called dark region-aware low-light image enhancement (DALE), where dark regions are accurately recognized by the proposed visual attention module and their brightness is intensively enhanced. Our method can estimate the visual attention in an efficient manner using super-pixels without any complicated process. Thus, the method c…

    Submitted 28 August, 2020; originally announced August 2020.

    Comments: 12 pages, 7 figures, The 31st British Machine Vision Conference

  43. arXiv:2008.01950  [pdf]

    eess.SP cs.NE

    Area-wide traffic signal control based on a deep graph Q-Network (DGQN) trained in an asynchronous manner

    Authors: Gyeongjun Kim, Keemin Sohn

    Abstract: Reinforcement learning (RL) algorithms have been widely applied in traffic signal studies. There are, however, several problems in jointly controlling traffic lights for a large transportation network. First, the action space exponentially explodes as the number of intersections to be jointly controlled increases. Although a multi-agent RL algorithm has been used to solve the curse of dimensionali…

    Submitted 5 August, 2020; originally announced August 2020.

    Comments: 34 pages, 10 figures, and 4 tables

    MSC Class: 68T05 (Primary) ACM Class: I.2.6

  44. arXiv:2005.01996  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Real-World Image Super-Resolution: Methods and Results

    Authors: Andreas Lugmayr, Martin Danelljan, Radu Timofte, Namhyuk Ahn, Dongwoon Bai, Jie Cai, Yun Cao, Junyang Chen, Kaihua Cheng, SeYoung Chun, Wei Deng, Mostafa El-Khamy, Chiu Man Ho, Xiaozhong Ji, Amin Kheradmand, Gwantae Kim, Hanseok Ko, Kanghyu Lee, Jungwon Lee, Hao Li, Ziluan Liu, Zhi-Song Liu, Shuai Liu, Yunhua Lu, Zibo Meng, et al. (21 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on real world super-resolution. It focuses on the participating methods and final results. The challenge addresses the real world setting, where paired true high and low-resolution images are unavailable. For training, only one set of source input images is therefore provided along with a set of unpaired high-quality target images. In Track 1: Image Proc…

    Submitted 5 May, 2020; originally announced May 2020.

  45. arXiv:2005.01056  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

    Authors: Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor of 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable of producing high resolution results with the best percept…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: CVPRW 2020

  46. arXiv:2004.03842  [pdf, other]

    cs.CV cs.LG cs.RO eess.SP

    Multi-Head Attention based Probabilistic Vehicle Trajectory Prediction

    Authors: Hayoung Kim, Dongchan Kim, Gihoon Kim, Jeongmin Cho, Kunsoo Huh

    Abstract: This paper presents an online-capable deep learning model for probabilistic vehicle trajectory prediction. We propose a simple encoder-decoder architecture based on multi-head attention. The proposed model generates the distribution of the predicted trajectories for multiple vehicles in parallel. Our approach to model the interactions can learn to attend to a few influential vehicles in an unsupervis…

    Submitted 4 July, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

    Comments: 6 pages, 5 figures, 2020 IEEE Intelligent Vehicles Symposium (IV)

  47. arXiv:2002.07613  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

    Authors: Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, Kangning Liu, Sudarshini Tyagi, Laura Heacock, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we extend the globally-aware multiple instance classifier, a framework we proposed to address these unique properties of medical im…

    Submitted 13 February, 2020; originally announced February 2020.

  48. arXiv:1912.11027  [pdf, other]

    eess.IV cs.CV cs.LG

    Robust breast cancer detection in mammography and digital breast tomosynthesis using annotation-efficient deep learning approach

    Authors: William Lotter, Abdul Rahman Diab, Bryan Haslam, Jiye G. Kim, Giorgia Grisot, Eric Wu, Kevin Wu, Jorge Onieva Onieva, Jerrold L. Boxerman, Meiyun Wang, Mack Bandler, Gopal Vijayaraghavan, A. Gregory Sorensen

    Abstract: Breast cancer remains a global challenge, causing over 1 million deaths globally in 2018. To achieve earlier breast cancer detection, screening x-ray mammography is recommended by health organizations worldwide and has been estimated to decrease breast cancer mortality by 20-40%. Nevertheless, significant false positive and false negative rates, as well as high interpretation costs, leave opportun…

    Submitted 27 December, 2019; v1 submitted 23 December, 2019; originally announced December 2019.

  49. arXiv:1912.05470  [pdf, other]

    physics.med-ph cs.CV eess.IV

    Human Gist Processing Augments Deep Learning Breast Cancer Risk Assessment

    Authors: Skylar W. Wurster, Arkadiusz Sitek, Jian Chen, Karla Evans, Gaeun Kim, Jeremy M. Wolfe

    Abstract: Radiologists can classify a mammogram as normal or abnormal at better than chance levels after less than a second's exposure to the images. In this work, we combine these radiologists' gist inputs into pre-trained machine learning models to validate that integrating gist with a CNN model can achieve an AUC (area under the curve) statistically significantly higher than either the gist perception of…

    Submitted 27 November, 2019; originally announced December 2019.

  50. arXiv:1908.00615  [pdf, other]

    eess.IV cs.CV stat.ML

    Improving localization-based approaches for breast cancer screening exam classification

    Authors: Thibault Févry, Jason Phang, Nan Wu, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: We trained and evaluated a localization-based deep CNN for breast cancer screening exam classification on over 200,000 exams (over 1,000,000 images). Our model achieves an AUC of 0.919 in predicting malignancy in patients undergoing breast cancer screening, reducing the error rate of the baseline (Wu et al., 2019a) by 23%. In addition, the model generates bounding boxes for benign and malignant f…

    Submitted 1 August, 2019; originally announced August 2019.

    Comments: MIDL 2019 [arXiv:1907.08612]

    Report number: MIDL/2019/ExtendedAbstract/HyxoAR_AK4