Showing 1–50 of 213 results for author: Park, S

Searching in archive eess.
  1. arXiv:2408.01320  [pdf, ps, other]

    eess.SP cs.IT

    Generalized Reduced-WMMSE Approach for Cell-Free Massive MIMO With Per-AP Power Constraints

    Authors: Wonsik Yoo, Daesung Yu, Hoon Lee, Seok-Hwan Park

    Abstract: The optimization of cooperative beamforming vectors in cell-free massive MIMO (mMIMO) systems is presented where multi-antenna access points (APs) support downlink data transmission of multiple users. Albeit the successes of the weighted minimum mean squared error (WMMSE) algorithm and their variants, they lack careful investigations about computational complexity that scales with the number of an…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: accepted for publication in IEEE Wireless Communications Letters

  2. arXiv:2407.16779  [pdf, other]

    eess.SY

    Learning Networked Dynamical System Models with Weak Form and Graph Neural Networks

    Authors: Yin Yu, Daning Huang, Seho Park, Herschel C. Pangborn

    Abstract: This paper presents a sequence of two approaches for the data-driven control-oriented modeling of networked systems, i.e., the systems that involve many interacting dynamical components. First, a novel deep learning approach named the weak Latent Dynamics Model (wLDM) is developed for learning generic nonlinear dynamics with control. Leveraging the weak form, the wLDM enables more numerically stab…

    Submitted 23 July, 2024; originally announced July 2024.

  3. arXiv:2407.06614  [pdf, other]

    eess.IV cs.CV

    Implicit Regression in Subspace for High-Sensitivity CEST Imaging

    Authors: Chu Chen, Yang Liu, Se Weon Park, Jizhou Li, Kannie W. Y. Chan, Raymond H. F. Chan

    Abstract: Chemical Exchange Saturation Transfer (CEST) MRI demonstrates its capability in significantly enhancing the detection of proteins and metabolites with low concentrations through exchangeable protons. The clinical application of CEST, however, is constrained by its low contrast and low signal-to-noise ratio (SNR) in the acquired data. Denoising, as one of the post-processing stages for CEST data, c…

    Submitted 9 July, 2024; originally announced July 2024.

  4. arXiv:2407.06328  [pdf, other]

    cs.MA eess.SY

    Learning Equilibrium with Estimated Payoffs in Population Games

    Authors: Shinkyu Park

    Abstract: We study a multi-agent decision problem in population games, where agents select from multiple available strategies and continually revise their selections based on the payoffs associated with these strategies. Unlike conventional population game formulations, we consider a scenario where agents must estimate the payoffs through local measurements and communication with their neighbors. By employi…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.05527  [pdf, other]

    cs.CV cs.LG eess.IV

    Rethinking Image Skip Connections in StyleGAN2

    Authors: Seung Park, Yong-Goo Shin

    Abstract: Various models based on StyleGAN have gained significant traction in the field of image synthesis, attributed to their robust training stability and superior performances. Within the StyleGAN framework, the adoption of image skip connection is favored over the traditional residual connection. However, this preference is just based on empirical observations; there has not been any in-depth mathemat…

    Submitted 7 July, 2024; originally announced July 2024.

  6. arXiv:2406.16994  [pdf, other]

    eess.SP cs.AI

    Quantum Multi-Agent Reinforcement Learning for Cooperative Mobile Access in Space-Air-Ground Integrated Networks

    Authors: Gyu Seon Kim, Yeryeong Cho, Jaehyun Chung, Soohyun Park, Soyi Jung, Zhu Han, Joongheon Kim

    Abstract: Achieving global space-air-ground integrated network (SAGIN) access only with CubeSats presents significant challenges such as the access sustainability limitations in specific regions (e.g., polar regions) and the energy efficiency limitations in CubeSats. To tackle these problems, high-altitude long-endurance unmanned aerial vehicles (HALE-UAVs) can complement these CubeSat shortcomings for prov…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 17 pages, 22 figures

  7. arXiv:2406.15819  [pdf, other]

    cs.LG cs.IT cs.NI eess.SP

    Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning

    Authors: Qiushuo Hou, Matteo Zecchin, Sangwoo Park, Yunlong Cai, Guanding Yu, Kaushik Chowdhury, Osvaldo Simeone

    Abstract: In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameter…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: submitted for a journal publication

  8. arXiv:2405.19346  [pdf, other]

    eess.SP cs.AI cs.LG

    Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification

    Authors: Sion An, Myeongkyun Kang, Soopil Kim, Philip Chikontwe, Li Shen, Sang Hyun Park

    Abstract: Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In…

    Submitted 9 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Early Accepted at MICCAI 2024

  9. arXiv:2405.17071  [pdf, other]

    eess.SP

    Reliable Sub-Nyquist Spectrum Sensing via Conformal Risk Control

    Authors: Hyojin Lee, Sangwoo Park, Osvaldo Simeone, Yonina C. Eldar, Joonhyuk Kang

    Abstract: Detecting occupied subbands is a key task for wireless applications such as unlicensed spectrum access. Recently, detection methods were proposed that extract per-subband features from sub-Nyquist baseband samples and then apply thresholding mechanisms based on held-out data. Such existing solutions can only provide guarantees in terms of false negative rate (FNR) in the asymptotic regime of large…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: submitted for a journal publication

  10. arXiv:2404.14547  [pdf, other]

    cs.RO cs.AI eess.SY

    Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning

    Authors: Mohammed Abugurain, Shinkyu Park

    Abstract: This paper presents a framework that can interpret humans' navigation commands containing temporal elements and directly translate their natural language instructions into robot motion planning. Central to our framework is utilizing Large Language Models (LLMs). To enhance the reliability of LLMs in the framework and improve user experience, we propose methods to resolve the ambiguity in natural l…

    Submitted 22 April, 2024; originally announced April 2024.

  11. arXiv:2404.11350  [pdf, other]

    cs.LG cs.AI eess.SP

    Calibrating Bayesian Learning via Regularization, Confidence Minimization, and Selective Inference

    Authors: Jiayi Huang, Sangwoo Park, Osvaldo Simeone

    Abstract: The application of artificial intelligence (AI) models in fields such as engineering is limited by the known difficulty of quantifying the reliability of an AI's decision. A well-calibrated AI model must correctly report its accuracy on in-distribution (ID) inputs, while also enabling the detection of out-of-distribution (OOD) inputs. A conventional approach to improve calibration is the applicati…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Under review

  12. arXiv:2404.03991  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling

    Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung

    Abstract: Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off be…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 5 pages (4 figures, 1 table); This work has been submitted to the IEEE Signal Processing Letters. Copyright may be transferred without notice, after which this version may no longer be accessible

  13. arXiv:2404.01815  [pdf, other]

    eess.SP cs.NE

    Neuromorphic Split Computing with Wake-Up Radios: Architecture and Design via Digital Twinning

    Authors: Jiechen Chen, Sangwoo Park, Petar Popovski, H. Vincent Poor, Osvaldo Simeone

    Abstract: Neuromorphic computing leverages the sparsity of temporal data to reduce processing energy by activating a small subset of neurons and synapses at each time step. When deployed for split computing in edge-based systems, remote neuromorphic processing units (NPUs) can reduce the communication power budget by communicating asynchronously using sparse impulse radio (IR) waveforms. This way, the input…

    Submitted 3 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Under review

  14. arXiv:2403.11762  [pdf, other]

    cs.IT eess.SP

    Full-Duplex MU-MIMO Systems with Coarse Quantization: How Many Bits Do We Need?

    Authors: Seunghyeong Yoo, Seokjun Park, Mintaek Oh, Namyoon Lee, Jinseok Choi

    Abstract: This paper investigates full-duplex (FD) multi-user multiple-input multiple-output (MU-MIMO) system design with coarse quantization. We first analyze the impact of self-interference (SI) on quantization in FD single-input single-output systems. The analysis elucidates that the minimum required number of analog-to-digital converter (ADC) bits is logarithmically proportional to the ratio of total re…

    Submitted 18 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  15. arXiv:2403.11104  [pdf, other]

    eess.SY

    Deep Neural Network NMPC for Computationally Tractable Optimal Power Management of Hybrid Electric Vehicle

    Authors: Suyong Park, Duc Giap Nguyen, Jinrak Park, Dohee Kim, Jeong Soo Eo, Kyoungseok Han

    Abstract: This study presents a method for deep neural network nonlinear model predictive control (DNN-MPC) to reduce computational complexity, and we show its practical utility through its application in optimizing the energy management of hybrid electric vehicles (HEVs). For optimal power management of HEVs, we first design the online NMPC to collect the data set, and the deep neural network is trained to…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 6 pages, 10 figures, 3 tables, 2024 ACC conference (accepted)

  16. arXiv:2403.09570  [pdf, other]

    cs.LG cs.IT eess.SP

    Multi-Fidelity Bayesian Optimization With Across-Task Transferable Max-Value Entropy Search

    Authors: Yunchuan Zhang, Sangwoo Park, Osvaldo Simeone

    Abstract: In many applications, ranging from logistics to engineering, a designer is faced with a sequence of optimization tasks for which the objectives are in the form of black-box functions that are costly to evaluate. For example, the designer may need to tune the hyperparameters of neural network models for different learning tasks over time. Rather than evaluating the objective function for each candi…

    Submitted 24 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures, submitted to IEEE for peer review

  17. arXiv:2402.12412  [pdf, other]

    cs.HC cs.AI cs.MM eess.SP

    Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same

    Authors: Sungjun Ahn, Hyun-Jeong Yim, Youngwan Lee, Sung-Ik Park

    Abstract: This paper introduces a media service model that exploits artificial intelligence (AI) video generators at the receive end. This proposal deviates from the traditional multimedia ecosystem, completely relying on in-house production, by shifting part of the content creation onto the receiver. We bring a semantic process into the framework, allowing the distribution network to provide service elemen…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 13 pages, 7 figures

  18. arXiv:2402.04718  [pdf, other]

    eess.SY

    Adaptive Smooth Control via Nonsingular Fast Terminal Sliding Mode for Distributed Space Telescope Demonstration Mission by CubeSat Formation Flying

    Authors: Soobin Jeon, Hancheol Cho, Sang-Young Park

    Abstract: This paper investigates the efficiency of nonsingular fast terminal sliding mode and adaptive smooth control method for the distributed space telescope demonstration mission. The distributed space telescope has a flexible focal length that corresponds to the relative position of the formation flying concept. The precise formation flying technology by CubeSats enhances the utility of distributed sp…

    Submitted 17 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: This manuscript is submitted to IEEE Transactions on Aerospace and Electronic Systems

  19. arXiv:2401.15475  [pdf, other]

    eess.SY math.DS math.OC

    Epidemic Population Games And Perturbed Best Response Dynamics

    Authors: Shinkyu Park, Jair Certorio, Nuno C. Martins, Richard J. La

    Abstract: This paper proposes an approach to mitigate epidemic spread in a population of strategic agents by encouraging safer behaviors through carefully designed rewards. These rewards, which vary according to the state of the epidemic, are ascribed by a dynamic payoff mechanism we seek to design. We use a modified SIRS model to track how the epidemic progresses in response to the population's agents stra…

    Submitted 22 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  20. arXiv:2401.09802  [pdf, other]

    eess.AS cs.CV cs.SD

    Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation

    Authors: Minsu Kim, Jeong Hun Yeo, Se Jin Park, Hyeongseop Rha, Yong Man Ro

    Abstract: This paper explores sentence-level multilingual Visual Speech Recognition (VSR) that can recognize different languages with a single trained model. As the massive multilingual modeling of visual data requires huge computational costs, we propose a novel training strategy, processing with visual speech units. Motivated by the recent success of the audio speech unit, we propose to use a visual speec…

    Submitted 18 July, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: ACMMM 2024

  21. arXiv:2401.03124  [pdf, ps, other]

    eess.SY

    To Balance or to Not? Battery Aging-Aware Active Cell Balancing for Electric Vehicles

    Authors: Enrico Fraccaroli, Seongik Jang, Logan Stach, Hoeseok Yang, Sangyoung Park, Samarjit Chakraborty

    Abstract: Due to manufacturing variabilities and temperature gradients within an electric vehicle's battery pack, the capacities of cells in it decrease differently over time. This reduces the usable capacity of the battery - the charge levels of one or more cells might be at the minimum threshold while most of the other cells have residual charge. Active cell balancing (i.e., transferring charge among cell…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: A preliminary version of this paper is due to appear in VLSI Design 2024

  22. arXiv:2312.09576  [pdf, other]

    eess.IV cs.CV

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Authors: Xiangde Luo, Jia Fu, Yunxin Zhong, Shuolin Liu, Bing Han, Mehdi Astaraki, Simone Bendazzoli, Iuliana Toma-Dasu, Yiwen Ye, Ziyang Chen, Yong Xia, Yanzhou Su, Jin Ye, Junjun He, Zhaohu Xing, Hongqiu Wang, Lei Zhu, Kaixiang Yang, Xin Fang, Zhiwei Wang, Chan Woong Lee, Sang Joon Park, Jaehee Chun, Constantin Ulrich, Klaus H. Maier-Hein, et al. (17 additional authors not shown)

    Abstract: Radiation therapy is a primary and effective NasoPharyngeal Carcinoma (NPC) treatment strategy. The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)

  23. arXiv:2312.09461  [pdf, other]

    eess.SP cs.HC cs.LG

    Improving Generalization of Drowsiness State Classification by Domain-Specific Normalization

    Authors: Dong-Young Kim, Dong-Kyun Han, Seo-Hyeon Park, Geun-Deok Jang, Seong-Whan Lee

    Abstract: Abnormal driver states, particularly have been major concerns for road safety, emphasizing the importance of accurate drowsiness detection to prevent accidents. Electroencephalogram (EEG) signals are recognized for their effectiveness in monitoring a driver's mental state by monitoring brain activities. However, the challenge lies in the requirement for prior calibration due to the variation of EE…

    Submitted 14 November, 2023; originally announced December 2023.

    Comments: Submitted to 2024 12th IEEE International Winter Conference on Brain-Computer Interface

  24. arXiv:2312.04846  [pdf, other]

    cs.SD eess.AS

    Sound Source Localization for a Source inside a Structure using Ac-CycleGAN

    Authors: Shunsuke Kita, Choong Sik Park, Yoshinobu Kajikawa

    Abstract: We propose a method for sound source localization (SSL) for a source inside a structure using Ac-CycleGAN under unpaired data conditions. The proposed method utilizes a large amount of simulated data and a small amount of actual experimental data to locate a sound source inside a structure in a real environment. An Ac-CycleGAN generator contributes to the transformation of simulated data into real…

    Submitted 8 December, 2023; originally announced December 2023.

  25. arXiv:2312.02512  [pdf, other]

    cs.CV cs.AI cs.MM eess.AS

    AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

    Authors: Jeongsoo Choi, Se Jin Park, Minsu Kim, Yong Man Ro

    Abstract: This paper proposes a novel direct Audio-Visual Speech to Audio-Visual Speech Translation (AV2AV) framework, where the input and output of the system are multimodal (i.e., audio and visual speech). With the proposed AV2AV, two key advantages can be brought: 1) We can perform real-like conversations with individuals worldwide in a virtual meeting by utilizing our own primary languages. In contrast…

    Submitted 26 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Code & Demo: https://choijeongsoo.github.io/av2av

  26. arXiv:2311.11169  [pdf]

    eess.IV cs.AI cs.LG eess.SP

    Deep Coherence Learning: An Unsupervised Deep Beamformer for High Quality Single Plane Wave Imaging in Medical Ultrasound

    Authors: Hyunwoo Cho, Seongjun Park, Jinbum Kang, Yangmo Yoo

    Abstract: Plane wave imaging (PWI) in medical ultrasound is becoming an important reconstruction method with high frame rates and new clinical applications. Recently, single PWI based on deep learning (DL) has been studied to overcome lowered frame rates of traditional PWI with multiple PW transmissions. However, due to the lack of appropriate ground truth images, DL-based PWI still remains challenging for…

    Submitted 18 November, 2023; originally announced November 2023.

  27. arXiv:2311.04066  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    Can CLIP Help Sound Source Localization?

    Authors: Sooyoung Park, Arda Senocak, Joon Son Chung

    Abstract: Large-scale pre-trained image-text models demonstrate remarkable versatility across diverse tasks, benefiting from their robust representational capabilities and effective multimodal alignment. We extend the application of these models, specifically CLIP, to the domain of sound source localization. Unlike conventional approaches, we employ the pre-trained CLIP model without explicit text input, re…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  28. arXiv:2311.01908  [pdf, other]

    eess.IV cs.CV

    LLM-driven Multimodal Target Volume Contouring in Radiation Oncology

    Authors: Yujin Oh, Sangjoon Park, Hwa Kyung Byun, Yeona Cho, Ik Jae Lee, Jin Sung Kim, Jong Chul Ye

    Abstract: Target volume contouring for radiation therapy is considered significantly more challenging than the normal organ segmentation tasks as it necessitates the utilization of both image and text-based clinical information. Inspired by the recent advancement of large language models (LLMs) that can facilitate the integration of the textural information and images, here we present a novel LLM-driven mul…

    Submitted 15 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

  29. arXiv:2310.14946  [pdf, other]

    cs.MM cs.SD eess.AS

    Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model

    Authors: Joanna Hong, Se Jin Park, Yong Man Ro

    Abstract: We present a novel approach to multilingual audio-visual speech recognition tasks by introducing a single model on a multilingual dataset. Motivated by a human cognitive system where humans can intuitively distinguish different languages without any conscious effort or guidance, we propose a model that can capture which language is given as an input speech by distinguishing the inherent similariti…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  30. arXiv:2310.05934  [pdf, other]

    cs.CV cs.AI cs.MM eess.IV

    DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion

    Authors: Se Jin Park, Joanna Hong, Minsu Kim, Yong Man Ro

    Abstract: Speech-driven 3D facial animation has gained significant attention for its ability to create realistic and expressive facial animations in 3D space based on speech. Learning-based methods have shown promising progress in achieving accurate facial motion synchronized with speech. However, one-to-many nature of speech-to-3D facial synthesis has not been fully explored: while the lip accurately synch…

    Submitted 23 August, 2023; originally announced October 2023.

  31. arXiv:2310.05538  [pdf, other]

    eess.IV cs.CV cs.LG

    M3FPolypSegNet: Segmentation Network with Multi-frequency Feature Fusion for Polyp Localization in Colonoscopy Images

    Authors: Ju-Hyeon Nam, Seo-Hyeong Park, Nur Suriza Syazwany, Yerim Jung, Yu-Han Im, Sang-Chul Lee

    Abstract: Polyp segmentation is crucial for preventing colorectal cancer a common type of cancer. Deep learning has been used to segment polyps automatically, which reduces the risk of misdiagnosis. Localizing small polyps in colonoscopy images is challenging because of its complex characteristics, such as color, occlusion, and various shapes of polyps. To address this challenge, a novel frequency-based ful…

    Submitted 9 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 5pages. 2023 IEEE International Conference on Image Processing (ICIP). IEEE, 2023

    MSC Class: 92C55

  32. Detection of Pedestrian Turning Motions to Enhance Indoor Map Matching Performance

    Authors: Seunghyeon Park, Taewon Kang, Seungjae Lee, Joon Hyo Rhee

    Abstract: A pedestrian navigation system (PNS) in indoor environments, where global navigation satellite system (GNSS) signal access is difficult, is necessary, particularly for search and rescue (SAR) operations in large buildings. This paper focuses on studying pedestrian walking behaviors to enhance the performance of indoor pedestrian dead reckoning (PDR) and map matching techniques. Specifically, our r…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: Submitted to ICTC 2023

  33. arXiv:2308.03251  [pdf, ps, other]

    eess.SP cs.IT

    Joint Precoding and Fronthaul Compression for Cell-Free MIMO Downlink With Radio Stripes

    Authors: Sangwon Jo, Hoon Lee, Seok-Hwan Park

    Abstract: A sequential fronthaul network, referred to as radio stripes, is a promising fronthaul topology of cell-free MIMO systems. In this setup, a single cable suffices to connect access points (APs) to a central processor (CP). Thus, radio stripes are more effective than conventional star fronthaul topology which requires dedicated cables for each of APs. Most of works on radio stripes focused on the up…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: To be presented at IEEE Globecom 2023, Kuala Lumpur, Malaysia, Dec. 2023

  34. arXiv:2307.04377  [pdf, other]

    cs.SD eess.AS

    HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation

    Authors: Minsung Kang, Soochul Park, Keunwoo Choi

    Abstract: In this work, we address the challenge of lyrics alignment, which involves aligning the lyrics and vocal components of songs. This problem requires the alignment of two distinct modalities, namely text and audio. To overcome this challenge, we propose a model that is trained in a supervised manner, utilizing the cross-correlation matrix of latent representations between vocals and lyrics. Our syst…

    Submitted 10 July, 2023; originally announced July 2023.

  35. arXiv:2306.17815  [pdf, other]

    cs.LG cs.IT eess.SP

    Bayesian Optimization with Formal Safety Guarantees via Online Conformal Prediction

    Authors: Yunchuan Zhang, Sangwoo Park, Osvaldo Simeone

    Abstract: Black-box zero-th order optimization is a central primitive for applications in fields as diverse as finance, physics, and engineering. In a common formulation of this problem, a designer sequentially attempts candidate solutions, receiving noisy feedback on the value of each attempt from the system. In this paper, we study scenarios in which feedback is also provided on the safety of the attempte…

    Submitted 4 July, 2024; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: 15 pages, 10 figures, this work has been published in IEEE Journal of Selected Topics in Signal Processing

  36. arXiv:2306.16003  [pdf, other]

    cs.GR cs.CV cs.SD eess.AS

    Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models

    Authors: Jeongsoo Choi, Minsu Kim, Se Jin Park, Yong Man Ro

    Abstract: In this paper, we present a method for reprogramming pre-trained audio-driven talking face synthesis models to operate in a text-driven manner. Consequently, we can easily generate face videos that articulate the provided textual sentences, eliminating the necessity of recording speech for each inference, as required in the audio-driven model. To this end, we propose to embed the input text into t…

    Submitted 18 January, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: ICASSP 2024

  37. arXiv:2306.12978  [pdf, other]

    cs.IT eess.SP

    Rate-Splitting Multiple Access for 6G Networks: Ten Promising Scenarios and Applications

    Authors: Jeonghun Park, Byungju Lee, Jinseok Choi, Hoon Lee, Namyoon Lee, Seok-Hwan Park, Kyoung-Jae Lee, Junil Choi, Sung Ho Chae, Sang-Woon Jeon, Kyung Sup Kwak, Bruno Clerckx, Wonjae Shin

    Abstract: In the upcoming 6G era, multiple access (MA) will play an essential role in achieving high throughput performances required in a wide range of wireless applications. Since MA and interference management are closely related issues, the conventional MA techniques are limited in that they cannot provide near-optimal performance in universal interference regimes. Recently, rate-splitting multiple acce…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 17 pages, 6 figures, submitted to IEEE Network Magazine

  38. arXiv:2306.07535  [pdf, other]

    eess.SY

    Learning with Delayed Payoffs in Population Games using Kullback-Leibler Divergence Regularization

    Authors: Shinkyu Park, Naomi Ehrich Leonard

    Abstract: We study a multi-agent decision problem in large population games. Agents from multiple populations select strategies for repeated interactions with one another. At each stage of these interactions, agents use their decision-making model to revise their strategy selections based on payoffs determined by an underlying game. Their goal is to learn the strategies that correspond to the Nash equilibri…

    Submitted 3 June, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

  39. arXiv:2306.04549  [pdf]

    eess.SY eess.SP

    Dynamic Geometry-Based Stochastic Channel Modeling for Polarized MIMO Systems with Moving Scatterers

    Authors: Hamed Radpour, Laxmikant Minz, Seong-Ook Park, Duck-Yong Kim, Young-Chan Moon

    Abstract: This paper introduces a four-dimensional (4D) geometry-based stochastic model (GBSM) for polarized multiple-input multiple-output (MIMO) systems with moving scatterers. We propose a novel motion path model with high degrees of freedom based on the Brownian Motion (BM) random process for randomly moving scatterers. This model is capable of analyzing the effect of both deterministically and randomly…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 10 pages, 10 Figures

  40. arXiv:2306.04137  [pdf, other]

    cs.MA eess.SY

    Multi-Agent Reinforcement Learning for Cooperative Air Transportation Services in City-Wide Autonomous Urban Air Mobility

    Authors: Chanyoung Park, Gyu Seon Kim, Soohyun Park, Soyi Jung, Joongheon Kim

    Abstract: The development of urban-air-mobility (UAM) is rapidly progressing with spurs, and the demand for efficient transportation management systems is a rising need due to the multifaceted environmental uncertainties. Thus, this paper proposes a novel air transportation service management algorithm based on multi-agent deep reinforcement learning (MADRL) to address the challenges of multi-UAM cooperatio…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 15 pages, 14 figures

  41. arXiv:2306.02278  [pdf, other]

    eess.SY

    Payoff Mechanism Design for Coordination in Multi-Agent Task Allocation Games

    Authors: Shinkyu Park, Julian Barreiro-Gomez

    Abstract: We investigate a multi-agent decision-making problem where a large population of agents is responsible for carrying out a set of assigned tasks. The amount of jobs in each task varies over time governed by a dynamical system model. Each agent needs to select one of the available strategies to take on one or more tasks. Since each strategy allows an agent to perform multiple tasks at a time, possib…

    Submitted 18 September, 2023; v1 submitted 4 June, 2023; originally announced June 2023.

  42. arXiv:2305.19556  [pdf, other]

    cs.CV cs.AI cs.SD eess.AS eess.IV

    Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation

    Authors: Se Jin Park, Minsu Kim, Jeongsoo Choi, Yong Man Ro

    Abstract: Talking face generation is the challenging task of synthesizing a natural and realistic face that requires accurate synchronization with a given audio. Due to co-articulation, where an isolated phone is influenced by the preceding or following phones, the articulation of a phone varies upon the phonetic context. Therefore, modeling lip motion with the phonetic context can generate more spatio-temp…

    Submitted 1 April, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted at ICASSP 2024

  43. arXiv:2305.16699  [pdf, other]

    eess.AS cs.AI cs.LG

    Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis

    Authors: Seongyeon Park, Bohyung Kim, Tae-hyun Oh

    Abstract: Recently, zero-shot TTS and VC methods have gained attention due to their practicality of being able to generate voices even unseen during training. Among these methods, zero-shot modifications of the VITS model have shown superior performance, while having useful properties inherited from VITS. However, the performance of VITS and VITS-based zero-shot models vary dramatically depending on how the…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Interspeech 2023

  44. arXiv:2305.15417  [pdf, other]

    eess.IV cs.CV cs.LG

    Entropy-Aware Similarity for Balanced Clustering: A Case Study with Melanoma Detection

    Authors: Seok Bin Son, Soohyun Park, Joongheon Kim

    Abstract: Clustering data is an unsupervised learning approach that aims to divide a set of data points into multiple groups. It is a crucial yet demanding subject in machine learning and data mining. Its successful applications span various fields. However, conventional clustering techniques necessitate the consideration of balance significance in specific applications. Therefore, this paper addresses the…

    Submitted 11 May, 2023; originally announced May 2023.

  45. arXiv:2305.09986  [pdf, other]

    eess.IV cs.CV cs.LG

    A robust multi-domain network for short-scanning amyloid PET reconstruction

    Authors: Hyoung Suk Park, Young Jin Jeong, Kiwan Jeon

    Abstract: This paper presents a robust multi-domain network designed to restore low-quality amyloid PET images acquired in a short period of time. The proposed method is trained on pairs of PET images from short (2 minutes) and standard (20 minutes) scanning times, sourced from multiple domains. Learning relevant image features between these domains with a single network is challenging. Our key contribution…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: 21 pages, 7 figures, 3 tables

    MSC Class: 92C55; 68T05; 15A29; 65F22

  46. arXiv:2305.07504  [pdf, other]

    cs.LG eess.SP

    Calibration-Aware Bayesian Learning

    Authors: Jiayi Huang, Sangwoo Park, Osvaldo Simeone

    Abstract: Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions. In order to improve the quality of the confidence levels, also known as calibration, of a model, common approaches entail the addition of either data-dependent or data-independent regularization terms to the training loss. Data-dependent reg…

    Submitted 12 April, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: submitted for conference publication

  47. arXiv:2305.06813  [pdf, other]

    eess.IV cs.CV

    Generation of Structurally Realistic Retinal Fundus Images with Diffusion Models

    Authors: Sojung Go, Younghoon Ji, Sang Jun Park, Soochahn Lee

    Abstract: We introduce a new technique for generating retinal fundus images that have anatomically accurate vascular structures, using diffusion models. We generate artery/vein masks to create the vascular structure, which we then condition to produce retinal fundus images. The proposed method can generate high-quality images with more realistic vascular structures and can create a diverse range of images b…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: 9 pages, 6 figures

  48. arXiv:2305.04877  [pdf]

    physics.optics eess.SP quant-ph

    Coherently amplified ultrafast imaging using a free-electron interferometer

    Authors: Tomer Bucher, Harel Nahari, Hanan Herzig Sheinfux, Ron Ruimy, Arthur Niedermayr, Raphael Dahan, Qinghui Yan, Yuval Adiv, Michael Yannai, Jialin Chen, Yaniv Kurman, Sang Tae Park, Daniel J. Masiel, Eli Janzen, James H. Edgar, Fabrizio Carbone, Guy Bartal, Shai Tsesses, Frank H. L. Koppens, Giovanni Maria Vanacore, Ido Kaminer

    Abstract: Accessing the low-energy non-equilibrium dynamics of materials and their polaritons with simultaneous high spatial and temporal resolution has been a bold frontier of electron microscopy in recent years. One of the main challenges lies in the ability to retrieve extremely weak signals while simultaneously disentangling amplitude and phase information. Here, we present Free-Electron Ramsey Imaging…

    Submitted 16 July, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

  49. arXiv:2305.02124  [pdf]

    cond-mat.mtrl-sci eess.IV

    Adaptative Diffraction Image Registration for 4D-STEM to optimize ACOM Pattern Matching

    Authors: Nicolas Folastre, Junhao Cao, Gozde Oney, Sunkyu Park, Arash Jamali, Christian Masquelier, Laurence Croguennec, Muriel Veron, Edgar F. Rauch, Arnaud Demortière

    Abstract: The technique known as 4D-STEM has recently emerged as a powerful tool for the local characterization of crystalline structures in materials, such as cathode materials for Li-ion batteries or perovskite materials for photovoltaics. However, the use of new detectors optimized for electron diffraction patterns and other advanced techniques requires constant adaptation of methodologies to address the…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 22 pages (13 pages SI), 7 figures (10 figures SI)

  50. arXiv:2304.04027  [pdf, other]

    eess.IV cs.CV cs.LG

    NeBLa: Neural Beer-Lambert for 3D Reconstruction of Oral Structures from Panoramic Radiographs

    Authors: Sihwa Park, Seongjun Kim, Doeyoung Kwon, Yohan Jang, In-Seok Song, Seung Jun Baek

    Abstract: Panoramic radiography (Panoramic X-ray, PX) is a widely used imaging modality for dental examination. However, PX only provides a flattened 2D image, lacking in a 3D view of the oral structure. In this paper, we propose NeBLa (Neural Beer-Lambert) to estimate 3D oral structures from real-world PX. NeBLa tackles full 3D reconstruction for varying subjects (patients) where each reconstruction is bas…

    Submitted 6 February, 2024; v1 submitted 8 April, 2023; originally announced April 2023.

    Comments: 18 pages, 16 figures, Accepted to AAAI 2024