Showing 1–50 of 67 results for author: Ko, B

Searching in archive cs.
  1. arXiv:2407.03958  [pdf, other]

    cs.CL cs.CV

    Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge

    Authors: Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Byungsoo Ko, Jonghwan Hyeon, Ho-Jin Choi

    Abstract: Humans share a wide variety of images related to their personal experiences within conversations via instant messaging tools. However, existing works focus on (1) image-sharing behavior in singular sessions, leading to limited long-term social interaction, and (2) a lack of personalized image-sharing behavior. In this work, we introduce Stark, a large-scale long-term multi-modal conversation datas…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Project website: https://stark-dataset.github.io

  2. arXiv:2406.09998  [pdf, other]

    eess.AS cs.AI cs.LG cs.MM cs.SD

    Understanding Pedestrian Movement Using Urban Sensing Technologies: The Promise of Audio-based Sensors

    Authors: Chaeyeon Han, Pavan Seshadri, Yiwei Ding, Noah Posner, Bon Woo Koo, Animesh Agrawal, Alexander Lerch, Subhrajit Guhathakurta

    Abstract: While various sensors have been deployed to monitor vehicular flows, sensing pedestrian movement is still nascent. Yet walking is a significant mode of travel in many cities, especially those in Europe, Africa, and Asia. Understanding pedestrian volumes and flows is essential for designing safer and more attractive pedestrian infrastructure and for controlling periodic overcrowding. This study dis…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: submitted to Urban Informatics

  3. arXiv:2406.09442  [pdf]

    physics.med-ph cs.LG physics.app-ph physics.bio-ph

    An insertable glucose sensor using a compact and cost-effective phosphorescence lifetime imager and machine learning

    Authors: Artem Goncharov, Zoltan Gorocs, Ridhi Pradhan, Brian Ko, Ajmal Ajmal, Andres Rodriguez, David Baum, Marcell Veszpremi, Xilin Yang, Maxime Pindrys, Tianle Zheng, Oliver Wang, Jessica C. Ramella-Roman, Michael J. McShane, Aydogan Ozcan

    Abstract: Optical continuous glucose monitoring (CGM) systems are emerging for personalized glucose management owing to their lower cost and prolonged durability compared to conventional electrochemical CGMs. Here, we report a computational CGM system, which integrates a biocompatible phosphorescence-based insertable biosensor and a custom-designed phosphorescence lifetime imager (PLI). This compact and cos…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 24 Pages, 4 Figures

  4. arXiv:2405.16685  [pdf]

    cs.DC

    EdgeSphere: A Three-Tier Architecture for Cognitive Edge Computing

    Authors: Christian Makaya, Keith Grueneberg, Bongjun Ko, David Wood, Nirmit Desai, Xiping Wang

    Abstract: Computing at the edge is increasingly important as Internet of Things (IoT) devices at the edge generate massive amounts of data and pose challenges in transporting all that data to the Cloud where they can be analyzed. On the other hand, harnessing the edge data is essential for offering cognitive applications, if the challenges, such as device capabilities, connectivity, and heterogeneity can be…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2405.12648  [pdf, other]

    cs.CV cs.AI

    Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency

    Authors: Hyeongjin Kim, Sangwon Kim, Dasom Ahn, Jong Taek Lee, Byoung Chul Ko

    Abstract: Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively. Previous SGG studies used message-passing neural networks (MPNNs) to update features, which can effectively reflect information about surrounding o…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML2024

  6. arXiv:2404.05256  [pdf, other]

    cs.CV cs.AI

    StyleForge: Enhancing Text-to-Image Synthesis for Any Artistic Styles with Dual Binding

    Authors: Junseo Park, Beomseok Ko, Hyeryung Jang

    Abstract: Recent advancements in text-to-image models, such as Stable Diffusion, have showcased their ability to create visual images from natural language prompts. However, existing methods like DreamBooth struggle with capturing arbitrary art styles due to the abstract and multifaceted nature of stylistic attributes. We introduce Single-StyleForge, a novel approach for personalized text-to-image synthesis…

    Submitted 17 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 20 pages, 12 figures

  7. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  8. arXiv:2312.05814  [pdf, other]

    cs.AI cs.SD eess.AS

    Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

    Authors: Seo-Hyun Lee, Young-Eun Lee, Soowon Kim, Byung-Kwan Ko, Jun-Young Kim, Seong-Whan Lee

    Abstract: Brain-to-speech technology represents a fusion of interdisciplinary applications encompassing fields of artificial intelligence, brain-computer interfaces, and speech synthesis. Neural representation learning based intention decoding and speech synthesis directly connects the neural activity to the means of human linguistic communication, which may greatly enhance the naturalness of communication.…

    Submitted 26 February, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: 4 pages

  9. arXiv:2311.01192  [pdf, other]

    cs.CV

    Semantic Scene Graph Generation Based on an Edge Dual Scene Graph and Message Passing Neural Network

    Authors: Hyeongjin Kim, Sangwon Kim, Jong Taek Lee, Byoung Chul Ko

    Abstract: Along with generative AI, interest in scene graph generation (SGG), which comprehensively captures the relationships and interactions between objects in an image and creates a structured graph-based representation, has significantly increased in recent years. However, relying on object-centric and dichotomous relationships, existing SGG methods have a limited ability to accurately predict detailed…

    Submitted 2 November, 2023; originally announced November 2023.

  10. arXiv:2309.10825  [pdf, other]

    eess.IV cs.LG q-bio.QM

    Latent Disentanglement in Mesh Variational Autoencoders Improves the Diagnosis of Craniofacial Syndromes and Aids Surgical Planning

    Authors: Simone Foti, Alexander J. Rickart, Bongjin Koo, Eimear O' Sullivan, Lara S. van de Lande, Athanasios Papaioannou, Roman Khonsari, Danail Stoyanov, N. u. Owase Jeelani, Silvia Schievano, David J. Dunaway, Matthew J. Clarkson

    Abstract: The use of deep learning to undertake shape analysis of the complexities of the human head holds great promise. However, there have traditionally been a number of barriers to accurate modelling, especially when operating on both a global and local level. In this work, we will discuss the application of the Swap Disentangled Variational Autoencoder (SD-VAE) with relevance to Crouzon, Apert and Muen…

    Submitted 5 September, 2023; originally announced September 2023.

  11. arXiv:2309.06531  [pdf, other]

    eess.AS cs.SD

    ASPED: An Audio Dataset for Detecting Pedestrians

    Authors: Pavan Seshadri, Chaeyeon Han, Bon-Woo Koo, Noah Posner, Subhrajit Guhathakurta, Alexander Lerch

    Abstract: We introduce the new audio analysis task of pedestrian detection and present a new large-scale dataset for this task. While the preliminary results prove the viability of using audio approaches for pedestrian detection, they also show that this challenging task cannot be easily solved with standard approaches.

    Submitted 16 January, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 4+1 pages, ICASSP 2024

  12. arXiv:2306.07274  [pdf, other]

    cs.CV q-bio.BM

    CryoChains: Heterogeneous Reconstruction of Molecular Assembly of Semi-flexible Chains from Cryo-EM Images

    Authors: Bongjin Koo, Julien Martel, Ariana Peck, Axel Levy, Frédéric Poitevin, Nina Miolane

    Abstract: Cryogenic electron microscopy (cryo-EM) has transformed structural biology by enabling the reconstruction of 3D biomolecular structures up to near-atomic resolution. However, the 3D reconstruction process remains challenging, as the 3D structures may exhibit substantial shape variations, while the 2D image acquisition suffers from a low signal-to-noise ratio, requiring the acquisition of very large datasets that…

    Submitted 15 July, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

  13. arXiv:2303.08906  [pdf, other]

    cs.CV

    VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression

    Authors: Won Jo, Geuntaek Lim, Gwangjin Lee, Hyunwoo Kim, Byungsoo Ko, Yukyung Choi

    Abstract: In content-based video retrieval (CBVR), dealing with large-scale collections, efficiency is as important as accuracy; thus, several video-level feature-based studies have actively been conducted. Nevertheless, owing to the severe difficulty of embedding a lengthy and untrimmed video into a single feature, these studies have been insufficient for accurate retrieval compared to frame-level feature-…

    Submitted 19 December, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: AAAI-24

  14. arXiv:2302.12798  [pdf, other]

    cs.CV cs.GR cs.LG

    3D Generative Model Latent Disentanglement via Local Eigenprojection

    Authors: Simone Foti, Bongjin Koo, Danail Stoyanov, Matthew J. Clarkson

    Abstract: Designing realistic digital humans is extremely complex. Most data-driven generative models used to simplify the creation of their underlying geometric shape do not offer control over the generation of local shape attributes. In this paper, we overcome this limitation by introducing a novel loss function grounded in spectral geometry and applicable to different neural-network-based generative mode…

    Submitted 4 April, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Computer Graphics Forum 2023

  15. arXiv:2212.10926  [pdf, other]

    cs.ET eess.SP

    The Internet of Bio-Nano Things in Blood Vessels: System Design and Prototypes

    Authors: Changmin Lee, Bon-Hong Koo, Chan-Byoung Chae, Robert Schober

    Abstract: In this paper, we investigate the Internet of Bio-Nano Things (IoBNT) which relates to networks formed by molecular communications. By providing a means of communication through the ubiquitously connected blood vessels (arteries, veins, and capillaries), molecular communication-based IoBNT enables a host of new eHealth applications. For example, an organ monitoring sensor can transfer internal bod…

    Submitted 21 December, 2022; originally announced December 2022.

  16. arXiv:2212.05638  [pdf, other]

    cs.CV

    Cross-Modal Learning with 3D Deformable Attention for Action Recognition

    Authors: Sangwon Kim, Dasom Ahn, Byoung Chul Ko

    Abstract: An important challenge in vision-based action recognition is the embedding of spatiotemporal features with two or more heterogeneous modalities into a single feature. In this study, we propose a new 3D deformable transformer for action recognition with adaptive spatiotemporal receptive fields and a cross-modal learning scheme. The 3D deformable transformer consists of three attention modules: 3D d…

    Submitted 17 August, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

    Comments: Accepted by ICCV2023

  17. arXiv:2212.04119  [pdf, other]

    cs.CV cs.CL

    DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset

    Authors: Young-Jun Lee, Byungsoo Ko, Han-Gyu Kim, Jonghwan Hyeon, Ho-Jin Choi

    Abstract: As sharing images in an instant message is a crucial factor, there has been active research on learning image-text multi-modal dialogue models. However, training a well-generalized multi-modal dialogue model remains challenging due to the low quality and limited diversity of images per dialogue in existing multi-modal dialogue datasets. In this paper, we propose an automated pipeline to constru…

    Submitted 29 March, 2024; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: NAACL 2024

  18. arXiv:2212.04114  [pdf, other]

    cs.CV

    Group Generalized Mean Pooling for Vision Transformer

    Authors: Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, Wonjae Kim

    Abstract: Vision Transformer (ViT) extracts the final representation from either class token or an average of all patch tokens, following the architecture of Transformer in Natural Language Processing (NLP) or Convolutional Neural Networks (CNNs) in computer vision. However, studies for the best way of aggregating the patch tokens are still limited to average pooling, while widely-used pooling strategies, s…

    Submitted 8 December, 2022; originally announced December 2022.

  19. arXiv:2212.02047  [pdf, other]

    cs.HC

    Towards Neural Decoding of Imagined Speech based on Spoken Speech

    Authors: Seo-Hyun Lee, Young-Eun Lee, Soowon Kim, Byung-Kwan Ko, Seong-Whan Lee

    Abstract: Decoding imagined speech from human brain signals is a challenging and important issue that may enable human communication via brain signals. While imagined speech can be the paradigm for silent communication via brain signals, it is always hard to collect enough stable data to train the decoding model. Meanwhile, spoken speech data is relatively easy to obtain, implying the significance of ut…

    Submitted 14 February, 2023; v1 submitted 5 December, 2022; originally announced December 2022.

    Comments: 4 pages, 2 figures

  20. arXiv:2210.07503  [pdf, other]

    cs.CV cs.AI

    STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition

    Authors: Dasom Ahn, Sangwon Kim, Hyunsu Hong, Byoung Chul Ko

    Abstract: In action recognition, although the combination of spatio-temporal videos and skeleton features can improve the recognition performance, a separate model and balancing feature representation for cross-modal data are required. To solve these problems, we propose Spatio-TemporAl cRoss (STAR)-transformer, which can effectively represent two cross-modal features as a recognizable vector. First, from t…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted by WACV 2023

    MSC Class: 68T07

  21. arXiv:2210.02254  [pdf, other]

    cs.CV

    Granularity-aware Adaptation for Image Retrieval over Multiple Tasks

    Authors: Jon Almazán, Byungsoo Ko, Geonmo Gu, Diane Larlus, Yannis Kalantidis

    Abstract: Strong image search models can be learned for a specific domain, i.e., a set of labels, provided that some labeled images of that domain are available. A practical visual search model, however, should be versatile enough to solve multiple retrieval tasks simultaneously, even if those cover very different specialized domains. Additionally, it should be able to benefit from even unlabeled images from th…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  22. arXiv:2209.15121  [pdf, other]

    q-bio.BM cs.CV eess.IV physics.chem-ph

    Heterogeneous reconstruction of deformable atomic models in Cryo-EM

    Authors: Youssef Nashed, Ariana Peck, Julien Martel, Axel Levy, Bongjin Koo, Gordon Wetzstein, Nina Miolane, Daniel Ratner, Frédéric Poitevin

    Abstract: Cryogenic electron microscopy (cryo-EM) provides a unique opportunity to study the structural heterogeneity of biomolecules. Being able to explain this heterogeneity with atomic models would help our understanding of their functional mechanisms, but the size and ruggedness of the structural space (the space of atomic 3D Cartesian coordinates) present an immense challenge. Here, we describe a heter…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: 8 pages, 1 figure

  23. arXiv:2208.04545  [pdf, ps, other]

    cs.IT eess.SP

    Massive MIMO Channel Prediction Using Machine Learning: Power of Domain Transformation

    Authors: Beomsoo Ko, Hwanjin Kim, Junil Choi

    Abstract: To compensate for the loss from outdated channel state information in wideband massive multiple-input multiple-output (MIMO) systems, channel prediction can be performed by leveraging the temporal correlation of wireless channels. Machine learning (ML)-based channel predictors for massive MIMO systems were designed recently; however, the time overhead to collect a large amount of training data directly…

    Submitted 9 August, 2022; originally announced August 2022.

  24. arXiv:2208.04541  [pdf, ps, other]

    cs.IT

    Coverage Increase at THz Frequencies: A Cooperative Rate-Splitting Approach

    Authors: Hyesang Cho, Beomsoo Ko, Bruno Clerckx, Junil Choi

    Abstract: Numerous studies claim that terahertz (THz) communication will be an essential piece of sixth-generation wireless communication systems. Its promising potential also comes with major challenges, in particular the reduced coverage due to harsh propagation loss, hardware constraints, and blockage vulnerability. To increase the coverage of THz communication, we revisit cooperative communication. We p…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: 13 pages, 8 figures, submitted to IEEE Transactions on Wireless Communications (TWC)

  25. arXiv:2206.12059  [pdf]

    eess.AS cs.SD

    Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes

    Authors: Byeong-Yun Ko, Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Seung-Deok Choi, Yong-Hwa Park

    Abstract: Performance of sound event localization and detection (SELD) in real scenes is limited by the small size of SELD datasets, due to the difficulty of obtaining a sufficient amount of realistic multi-channel audio data recordings with accurate labels. We used two main strategies to solve problems arising from the small real SELD dataset. First, we applied various data augmentation methods on all data dimensions:…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: Technical Report submitted for DCASE2022 Challenge Task3

  26. arXiv:2206.00518  [pdf, other]

    cs.LG cs.AI

    Efficient Scheduling of Data Augmentation for Deep Reinforcement Learning

    Authors: Byungchan Ko, Jungseul Ok

    Abstract: In deep reinforcement learning (RL), data augmentation is widely considered as a tool to induce a set of useful priors about semantic consistency and improve sample efficiency and generalization performance. However, even when the prior is useful for generalization, distilling it to the RL agent often interferes with RL training and degenerates sample efficiency. Meanwhile, the agent is forgetful of t…

    Submitted 1 March, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2102.08581

    Journal ref: NeurIPS 2022

  27. arXiv:2203.14463  [pdf, other]

    cs.CV cs.CL

    Large-scale Bilingual Language-Image Contrastive Learning

    Authors: Byungsoo Ko, Geonmo Gu

    Abstract: This paper is a technical report to share our experience and findings building a Korean and English bilingual multimodal model. While many of the multimodal datasets focus on English and multilingual multimodal research uses machine-translated texts, employing such machine-translated texts is limited in describing unique expressions, cultural information, and proper nouns in languages other than En…

    Submitted 14 April, 2022; v1 submitted 27 March, 2022; originally announced March 2022.

    Comments: Accepted by ICLRW2022

  28. arXiv:2203.03166  [pdf]

    eess.AS cs.SD eess.SP

    HRTF measurement for accurate sound localization cues

    Authors: Gyeong-Tae Lee, Sang-Min Choi, Byeong-Yun Ko, Yong-Hwa Park

    Abstract: A new database of head-related transfer functions (HRTFs) for accurate sound source localization is presented through precise measurement and post-processing in terms of improved frequency bandwidth and causality of head-related impulse responses (HRIRs) for accurate spectral cue (SC) and interaural time difference (ITD), respectively. The improvement effects of the proposed methods on binaural so…

    Submitted 5 April, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: 39 pages, 27 figures, and 1 table

  29. arXiv:2112.08816  [pdf, other]

    cs.CV cs.IR

    Deep Hash Distillation for Image Retrieval

    Authors: Young Kyun Jang, Geonmo Gu, Byungsoo Ko, Isaac Kang, Nam Ik Cho

    Abstract: In hash-based image retrieval systems, degraded or transformed inputs usually generate different codes from the original, deteriorating the retrieval accuracy. To mitigate this issue, data augmentation can be applied during training. However, even if augmented samples of an image are similar in real feature space, the quantization can scatter them far away in Hamming space. This results in represe…

    Submitted 13 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: ECCV2022

  30. arXiv:2111.12448  [pdf, other]

    cs.CV cs.GR cs.LG

    3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces

    Authors: Simone Foti, Bongjin Koo, Danail Stoyanov, Matthew J. Clarkson

    Abstract: Learning a disentangled, interpretable, and structured latent representation in 3D generative models of faces and bodies is still an open problem. The problem is particularly acute when control over identity features is required. In this paper, we propose an intuitive yet effective self-supervised approach to train a 3D shape variational autoencoder (VAE) which encourages a disentangled latent rep…

    Submitted 23 March, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted for publication at CVPR2022

  31. arXiv:2111.09963  [pdf, other]

    cs.IR cs.AI cs.LG

    Beyond NDCG: behavioral testing of recommender systems with RecList

    Authors: Patrick John Chia, Jacopo Tagliabue, Federico Bianchi, Chloe He, Brian Ko

    Abstract: As with most Machine Learning systems, recommender systems are typically evaluated through performance metrics computed over held-out data points. However, real-world behavior is undoubtedly nuanced: ad hoc error analysis and deployment-specific tests must be employed to ensure the desired quality in actual deployments. In this paper, we propose RecList, a behavioral-based testing methodology. Rec…

    Submitted 27 March, 2022; v1 submitted 18 November, 2021; originally announced November 2021.

    Comments: Paper accepted to the WebConf 2022

  32. arXiv:2107.03649  [pdf]

    eess.AS cs.SD

    Heavily Augmented Sound Event Detection utilizing Weak Predictions

    Authors: Hyeonuk Nam, Byeong-Yun Ko, Gyeong-Tae Lee, Seong-Hu Kim, Won-Ho Jung, Sang-Min Choi, Yong-Hwa Park

    Abstract: The performance of Sound Event Detection (SED) systems is greatly limited by the difficulty in generating a large strongly labeled dataset. In this work, we used two main approaches to overcome the lack of strongly labeled data. First, we applied heavy data augmentation on input features. Data augmentation methods used include not only conventional methods used in speech/audio domains but also our…

    Submitted 14 September, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: Won 3rd place on IEEE DCASE 2021 Task 4

  33. arXiv:2107.01793  [pdf, other]

    cs.IT cs.ET

    MIMO Operations in Molecular Communications: Theory, Prototypes, and Open Challenges

    Authors: Bon-Hong Koo, Changmin Lee, Ali E. Pusane, Tuna Tugcu, Chan-Byoung Chae

    Abstract: The Internet of Bio-nano Things is a significant development for next generation communication technologies. Because conventional wireless communication technologies face challenges in realizing new applications (e.g., in-body area networks for health monitoring) and necessitate the substitution of information carriers, researchers have shifted their interest to molecular communications (MC). Alth…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: 7 pages, 4 figures, accepted in Communications Magazine

  34. arXiv:2107.00414  [pdf, other]

    cs.CL

    MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting

    Authors: Anne Lauscher, Brandon Ko, Bailey Kuehl, Sophie Johnson, David Jurgens, Arman Cohan, Kyle Lo

    Abstract: Citation context analysis (CCA) is an important task in natural language processing that studies how and why scholars discuss each other's work. Despite decades of study, traditional frameworks for CCA have largely relied on overly-simplistic assumptions of how authors cite, which ignore several important phenomena. For instance, scholarly papers often contain rich discussions of cited work that s…

    Submitted 31 July, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

  35. arXiv:2106.00186  [pdf, other]

    cs.CV cs.LG

    Towards Light-weight and Real-time Line Segment Detection

    Authors: Geonmo Gu, Byungsoo Ko, SeoungHyun Go, Sung-Hyun Lee, Jingeun Lee, Minchul Shin

    Abstract: Previous deep learning-based line segment detection (LSD) suffers from the immense model size and high computational cost for line prediction. This constrains them from real-time inference on computationally restricted environments. In this paper, we propose a real-time and light-weight line segment detector for resource-constrained environments named Mobile LSD (M-LSD). We design an extremely eff…

    Submitted 26 April, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

    Comments: Accepted by AAAI2022

  36. arXiv:2104.03015  [pdf, other]

    cs.CV

    RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network

    Authors: Minchul Shin, Yoonjae Cho, Byungsoo Ko, Geonmo Gu

    Abstract: In this paper, we study the compositional learning of images and texts for image retrieval. The query is given in the form of an image and text that describes the desired modifications to the image; the goal is to retrieve the target image that satisfies the given modifications and resembles the query by composing information in both the text and image modalities. To remedy this, we propose a nove…

    Submitted 25 October, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

  37. arXiv:2103.16940  [pdf, other]

    cs.CV cs.IR cs.LG

    Learning with Memory-based Virtual Classes for Deep Metric Learning

    Authors: Byungsoo Ko, Geonmo Gu, Han-Gyu Kim

    Abstract: The core of deep metric learning (DML) involves learning visual similarities in high-dimensional embedding space. One of the main challenges is to generalize from seen classes of training data to unseen classes of test data. Recent works have focused on exploiting past embeddings to increase the number of instances for the seen classes. Such methods achieve performance improvement via augmentation…

    Submitted 8 October, 2021; v1 submitted 31 March, 2021; originally announced March 2021.

    Comments: Accepted by ICCV2021

  38. arXiv:2103.15454  [pdf, other]

    cs.CV cs.IR cs.LG

    Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning

    Authors: Geonmo Gu, Byungsoo Ko, Han-Gyu Kim

    Abstract: One of the main purposes of deep metric learning is to construct an embedding space that has well-generalized embeddings on both seen (training) classes and unseen (test) classes. Most existing works have tried to achieve this using different types of metric objectives and hard sample mining strategies with given training data. However, learning with only the training data can be overfitted to the…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted by AAAI2021

  39. arXiv:2102.08581  [pdf, other]

    cs.LG cs.AI

    Efficient Scheduling of Data Augmentation for Deep Reinforcement Learning

    Authors: Byungchan Ko, Jungseul Ok

    Abstract: In deep reinforcement learning (RL), data augmentation is widely considered as a tool to induce a set of useful priors about semantic consistency and improve sample efficiency and generalization performance. However, even when the prior is useful for generalization, distilling it to the RL agent often interferes with RL training and degenerates sample efficiency. Meanwhile, the agent is forgetful of t…

    Submitted 18 October, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

    Journal ref: NeurIPS 2022

  40. arXiv:2011.12713  [pdf]

    cs.CR cs.LG eess.SP

    A Secure Deep Probabilistic Dynamic Thermal Line Rating Prediction

    Authors: N. Safari, S. M. Mazhari, C. Y. Chung, S. B. Ko

    Abstract: Accurate short-term prediction of overhead line (OHL) transmission ampacity can directly affect the efficiency of power system operation and planning. Any overestimation of the dynamic thermal line rating (DTLR) can lead to lifetime degradation and failure of OHLs, safety hazards, etc. This paper presents a secure yet sharp probabilistic prediction model for the hour-ahead forecasting of the DTLR.…

    Submitted 21 November, 2020; originally announced November 2020.

    Comments: The work is accepted for publication in Journal of Modern Power Systems and Clean Energy

  41. arXiv:2011.04512  [pdf, other]

    cs.CL cs.LG

    Auxiliary Sequence Labeling Tasks for Disfluency Detection

    Authors: Dongyub Lee, Byeongil Ko, Myeong Cheol Shin, Taesun Whang, Daniel Lee, Eun Hwa Kim, EungGyun Kim, Jaechoon Jo

    Abstract: Detecting disfluencies in spontaneous speech is an important preprocessing step in natural language processing and speech recognition applications. Existing works for disfluency detection have focused on designing a single objective only for disfluency detection, while auxiliary objectives utilizing linguistic information of a word such as named entity or part-of-speech information can be effectiv…

    Submitted 5 April, 2021; v1 submitted 23 October, 2020; originally announced November 2020.

    Comments: Submitted to INTERSPEECH 2021

  42. Intraoperative Liver Surface Completion with Graph Convolutional VAE

    Authors: Simone Foti, Bongjin Koo, Thomas Dowrick, Joao Ramalhinho, Moustafa Allam, Brian Davidson, Danail Stoyanov, Matthew J. Clarkson

    Abstract: In this work we propose a method based on geometric deep learning to predict the complete surface of the liver, given a partial point cloud of the organ obtained during the surgical laparoscopic procedure. We introduce a new data augmentation technique that randomly perturbs shapes in their frequency domain to compensate the limited size of our dataset. The core of our method is a variational auto…

    Submitted 12 July, 2021; v1 submitted 8 September, 2020; originally announced September 2020.

  43. arXiv:2005.12739  [pdf, other]

    cs.CV cs.IR cs.LG

    An Effective Pipeline for a Real-world Clothes Retrieval System

    Authors: Yang-Ho Ji, HeeJae Jun, Insik Kim, Jongtack Kim, Youngjoon Kim, Byungsoo Ko, Hyong-Keun Kook, Jingeun Lee, Sangwon Lee, Sanghyuk Park

    Abstract: In this paper, we propose an effective pipeline for clothes retrieval system which has sturdiness on large-scale real-world fashion data. Our proposed method consists of three components: detection, retrieval, and post-processing. We firstly conduct a detection task for precise retrieval on target clothes, then retrieve the corresponding items with the metric learning-based model. To improve the r…

    Submitted 26 May, 2020; originally announced May 2020.

    Comments: 2nd place solution on DeepFashion2 clothes retrieval challenge in CVPR2020 workshop (CVFAD)

  44. arXiv:2005.03510  [pdf, other]

    cs.CL cs.LG stat.ML

    Reference and Document Aware Semantic Evaluation Methods for Korean Language Summarization

    Authors: Dongyub Lee, Myeongcheol Shin, Taesun Whang, Seungwoo Cho, Byeongil Ko, Daniel Lee, Eunggyun Kim, Jaechoon Jo

    Abstract: Text summarization refers to the process that generates a shorter form of text from the source document preserving salient information. Many existing works for text summarization are generally evaluated by using recall-oriented understudy for gisting evaluation (ROUGE) scores. However, as ROUGE scores are computed based on n-gram overlap, they do not reflect semantic meaning correspondences betwee…

    Submitted 1 November, 2020; v1 submitted 29 April, 2020; originally announced May 2020.

    Comments: COLING 2020

  45. arXiv:2003.02546  [pdf, other]

    cs.CV cs.IR cs.LG

    Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning

    Authors: Byungsoo Ko, Geonmo Gu

    Abstract: Learning the distance metric between pairs of samples has been studied for image retrieval and clustering. With the remarkable success of pair-based metric learning losses, recent works have proposed the use of generated synthetic points on metric learning losses for augmentation and generalization. However, these methods require additional generative networks along with the main network, which ca…

    Submitted 23 April, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR 2020

  46. arXiv:2002.06328  [pdf]

    cs.SD cs.LG eess.AS

    Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks

    Authors: Shindong Lee, BongGu Ko, Keonnyeong Lee, In-Chul Yoo, Dongsuk Yook

    Abstract: Voice conversion (VC) refers to transforming the speaker characteristics of an utterance without altering its linguistic contents. Many works on voice conversion require to have parallel training data that is highly expensive to acquire. Recently, the cycle-consistent adversarial network (CycleGAN), which does not require parallel training data, has been applied to voice conversion, showing the st…

    Submitted 15 February, 2020; originally announced February 2020.

  47. arXiv:2001.11658  [pdf, other]

    cs.CV cs.IR

    Symmetrical Synthesis for Deep Metric Learning

    Authors: Geonmo Gu, Byungsoo Ko

    Abstract: Deep metric learning aims to learn embeddings that contain semantic similarity information among data points. To learn better embeddings, methods to generate synthetic hard samples have been proposed. Existing methods of synthetic hard sample generation are adopting autoencoders or generative adversarial networks, but this leads to more hyper-parameters, harder optimization, and slower training sp…

    Submitted 23 April, 2020; v1 submitted 30 January, 2020; originally announced January 2020.

    Comments: Accepted by AAAI 2020

  48. arXiv:2001.08300  [pdf, other]

    cs.LG cs.DC stat.ML

    Overcoming Noisy and Irrelevant Data in Federated Learning

    Authors: Tiffany Tuor, Shiqiang Wang, Bong Jun Ko, Changchang Liu, Kin K. Leung

    Abstract: Many image and vision applications require a large amount of data for model training. Collecting all such data at a central location can be challenging due to data privacy and communication bandwidth restrictions. Federated learning is an effective way of training a machine learning model in a distributed manner from local data collected by client devices, which does not require exchanging the raw…

    Submitted 22 June, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: Accepted version in the 25th International Conference on Pattern Recognition (ICPR)

  49. arXiv:2001.04721

    cs.AI cs.LG

    Interpretation and Simplification of Deep Forest

    Authors: Sangwon Kim, Mira Jeong, Byoung Chul Ko

    Abstract: This paper proposes a new method for interpreting and simplifying a black box model of a deep random forest (RF) using a proposed rule elimination. In deep RF, a large number of decision trees are connected to multiple layers, thereby making an analysis difficult. It has a high performance similar to that of a deep neural network (DNN), but achieves a better generalizability. Therefore, in this st…

    Submitted 11 December, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Major issues on the experiments

  50. arXiv:1909.12326  [pdf, other]

    cs.LG cs.DC stat.ML

    Model Pruning Enables Efficient Federated Learning on Edge Devices

    Authors: Yuang Jiang, Shiqiang Wang, Victor Valls, Bong Jun Ko, Wei-Han Lee, Kin K. Leung, Leandros Tassiulas

    Abstract: Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a datacenter. To overcome this challenge, we propose PruneFL -- a novel FL a…

    Submitted 6 April, 2022; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems (TNNLS)