Showing 1–50 of 82 results for author: Cho, J

Searching in archive eess.

  1. arXiv:2408.07897  [pdf, other]

    cs.LG cs.IR cs.MA eess.SY

    The Nah Bandit: Modeling User Non-compliance in Recommendation Systems

    Authors: Tianyue Zhou, Jung-Hoon Cho, Cathy Wu

    Abstract: Recommendation systems now pervade the digital world, ranging from advertising to entertainment. However, it remains challenging to implement effective recommendation systems in the physical world, such as in mobility or health. This work focuses on a key challenge: in the physical world, it is often easy for the user to opt out of taking any recommendation if they are not to her liking, and to fa… ▽ More

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 12 pages, 8 figures, under review

  2. arXiv:2406.14954  [pdf, other]

    eess.IV cs.CV

    A Unified Framework for Synthesizing Multisequence Brain MRI via Hybrid Fusion

    Authors: Jihoon Cho, Jonghye Woo, Jinah Park

    Abstract: Multisequence Magnetic Resonance Imaging (MRI) provides a reliable diagnosis in clinical applications through complementary information within sequences. However, in practice, the absence of certain MR sequences is a common problem that can lead to inconsistent analysis results. In this work, we propose a novel unified framework for synthesizing multisequence MR images, called Hybrid Fusion GAN (H… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 7 figures

  3. arXiv:2406.12998  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Articulatory Encodec: Coding Speech through Vocal Tract Kinematics

    Authors: Cheol Jun Cho, Peter Wu, Tejas S. Prabhune, Dhruv Agarwal, Gopala K. Anumanchipalli

    Abstract: Vocal tract articulation is a natural, grounded control space of speech production. The spatiotemporal coordination of articulators combined with the vocal source shapes intelligible speech sounds to enable effective spoken communication. Based on this physiological grounding of speech, we propose a new framework of neural encoding-decoding of speech -- Articulatory Encodec. Articulatory Encodec c… ▽ More

    Submitted 20 August, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.11427  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

    Authors: Keon Lee, Dong Won Kim, Jaehyeon Kim, Jaewoong Cho

    Abstract: Large-scale diffusion models have shown outstanding generative abilities across multiple modalities including images, videos, and audio. However, text-to-speech (TTS) systems typically involve domain-specific modeling factors (e.g., phonemes and phoneme-level durations) to ensure precise temporal alignments between text and speech, which hinders the efficiency and scalability of diffusion models f… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2404.16137  [pdf, ps, other]

    cs.IT cs.LG eess.SP

    Learned Pulse Shaping Design for PAPR Reduction in DFT-s-OFDM

    Authors: Fabrizio Carpi, Soheil Rostami, Joonyoung Cho, Siddharth Garg, Elza Erkip, Charlie Jianzhong Zhang

    Abstract: High peak-to-average power ratio (PAPR) is one of the main factors limiting cell coverage for cellular systems, especially in the uplink direction. Discrete Fourier transform spread orthogonal frequency-domain multiplexing (DFT-s-OFDM) with spectrally-extended frequency-domain spectrum shaping (FDSS) is one of the efficient techniques deployed to lower the PAPR of the uplink waveforms. In this wor… ▽ More

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 5 pages, under review

  6. arXiv:2404.02781  [pdf, other]

    eess.AS cs.SD

    CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech

    Authors: Jaehyeon Kim, Keon Lee, Seungjun Chung, Jaewoong Cho

    Abstract: With the emergence of neural audio codecs, which encode multiple streams of discrete tokens from audio, large language models have recently gained attention as a promising approach for zero-shot Text-to-Speech (TTS) synthesis. Despite the ongoing rush towards scaling paradigms, audio tokenization ironically amplifies the scalability challenge, stemming from its long sequence length and the complex… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: ICLR 2024

  7. arXiv:2402.00375  [pdf, other]

    eess.IV cs.CV

    Disentangled Multimodal Brain MR Image Translation via Transformer-based Modality Infuser

    Authors: Jihoon Cho, Xiaofeng Liu, Fangxu Xing, Jinsong Ouyang, Georges El Fakhri, Jinah Park, Jonghye Woo

    Abstract: Multimodal Magnetic Resonance (MR) Imaging plays a crucial role in disease diagnosis due to its ability to provide complementary information by analyzing a relationship between multimodal images on the same subject. Acquiring all MR modalities, however, can be expensive, and, during a scanning session, certain MR images may be missed depending on the study protocol. The typical solution would be t… ▽ More

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 6 pages

  8. arXiv:2401.12004  [pdf]

    eess.IV cs.LG eess.SP

    NLCG-Net: A Model-Based Zero-Shot Learning Framework for Undersampled Quantitative MRI Reconstruction

    Authors: Xinrui Jiang, Yohan Jun, Jaejin Cho, Mengze Gao, Xingwang Yong, Berkin Bilgic

    Abstract: Typical quantitative MRI (qMRI) methods estimate parameter maps after image reconstruction, which is prone to biases and error propagation. We propose a Nonlinear Conjugate Gradient (NLCG) optimizer for model-based T2/T1 estimation, which incorporates U-Net regularization trained in a scan-specific manner. This end-to-end method directly estimates qMRI maps from undersampled k-space data using mon… ▽ More

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 8 pages, 5 figures, submitted to International Society for Magnetic Resonance in Medicine 2024

  9. arXiv:2312.12810  [pdf, other]

    eess.AS cs.SD

    Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

    Authors: Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli

    Abstract: Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word-level and phonetic-level. However, current research in dysfluency modeling primarily focuses on either transcription or detection, and the performance of each aspect remains limited. In this work, we present an unconstrained dysfluency modeling (UDM) approach that addresses both transcription and dete… ▽ More

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 2023 ASRU

  10. arXiv:2312.09436  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Temporal Transfer Learning for Traffic Optimization with Coarse-grained Advisory Autonomy

    Authors: Jung-Hoon Cho, Sirui Li, Jeongyun Kim, Cathy Wu

    Abstract: The recent development of connected and automated vehicle (CAV) technologies has spurred investigations to optimize dense urban traffic to maximize vehicle speed and throughput. This paper explores advisory autonomy, in which real-time driving advisories are issued to the human drivers, thus achieving near-term performance of automated vehicles. Due to the complexity of traffic systems, recent stu… ▽ More

    Submitted 1 August, 2024; v1 submitted 27 November, 2023; originally announced December 2023.

    Comments: 18 pages, 12 figures

  11. arXiv:2311.03682  [pdf, ps, other]

    eess.SY cs.SI math.OC

    Incentive Design for Eco-driving in Urban Transportation Networks

    Authors: M. Umar B. Niazi, Jung-Hoon Cho, Munther A. Dahleh, Roy Dong, Cathy Wu

    Abstract: Eco-driving emerges as a cost-effective and efficient strategy to mitigate greenhouse gas emissions in urban transportation networks. Acknowledging the persuasive influence of incentives in shaping driver behavior, this paper presents the `eco-planner,' a digital platform devised to promote eco-driving practices in urban transportation. At the outset of their trips, users provide the platform with… ▽ More

    Submitted 16 May, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

  12. arXiv:2311.01308  [pdf]

    eess.IV cs.CV

    Hybrid-Fusion Transformer for Multisequence MRI

    Authors: Jihoon Cho, Jinah Park

    Abstract: Medical segmentation has grown exponentially through the advent of a fully convolutional network (FCN), and we have now reached a turning point through the success of Transformer. However, the different characteristics of the modality have not been fully integrated into Transformer for medical segmentation. In this work, we propose the novel hybrid fusion Transformer (HFTrans) for multisequence MR… ▽ More

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 10 pages, 4 figures

  13. arXiv:2310.10803  [pdf, other]

    cs.CL eess.AS

    SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT

    Authors: Cheol Jun Cho, Abdelrahman Mohamed, Shang-Wen Li, Alan W Black, Gopala K. Anumanchipalli

    Abstract: Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a new era of spoken language processing. Yet, the discovered units often remain in phonetic space and the units beyond phonemes are largely underexplored. Here, we demonstrate that a syllabic organization emerges in learning sentence-level representation of speech. In particular, we adopt "self-distillation" obj… ▽ More

    Submitted 16 January, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  14. arXiv:2310.10788  [pdf, other]

    eess.AS cs.CL

    Self-Supervised Models of Speech Infer Universal Articulatory Kinematics

    Authors: Cheol Jun Cho, Abdelrahman Mohamed, Alan W Black, Gopala K. Anumanchipalli

    Abstract: Self-Supervised Learning (SSL) based models of speech have shown remarkable performance on a range of downstream tasks. These state-of-the-art models have remained blackboxes, but many recent studies have begun "probing" models like HuBERT, to correlate their internal representations to different aspects of speech. In this paper, we show "inference of articulatory kinematics" as fundamental proper… ▽ More

    Submitted 16 January, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  15. arXiv:2309.05287  [pdf, other]

    cs.SD cs.AI eess.AS

    Addressing Feature Imbalance in Sound Source Separation

    Authors: Jaechang Kim, Jeongyeon Hwang, Soheun Yi, Jaewoong Cho, Jungseul Ok

    Abstract: Neural networks often suffer from a feature preference problem, where they tend to overly rely on specific features to solve a task while disregarding other features, even if those neglected features are essential for the task. Feature preference problems have primarily been investigated in classification tasks. However, we observe that feature preference occurs in high-dimensional regression tasks,… ▽ More

    Submitted 4 October, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

  16. arXiv:2309.00769  [pdf, other]

    eess.IV cs.CV

    Full Reference Video Quality Assessment for Machine Learning-Based Video Codecs

    Authors: Abrar Majeedi, Babak Naderi, Yasaman Hosseinkashi, Juhee Cho, Ruben Alvarez Martinez, Ross Cutler

    Abstract: Machine learning-based video codecs have made significant progress in the past few years. A critical area in the development of ML-based video codecs is an accurate evaluation metric that does not require an expensive and slow subjective test. We show that existing evaluation metrics that were designed and trained on DSP-based video codecs are not highly correlated to subjective opinion when used… ▽ More

    Submitted 1 September, 2023; originally announced September 2023.

  17. arXiv:2308.06443  [pdf, other]

    cs.LG eess.AS

    Neural Latent Aligner: Cross-trial Alignment for Learning Representations of Complex, Naturalistic Neural Data

    Authors: Cheol Jun Cho, Edward F. Chang, Gopala K. Anumanchipalli

    Abstract: Understanding the neural implementation of complex human behaviors is one of the major goals in neuroscience. To this end, it is crucial to find a true representation of the neural data, which is challenging due to the high complexity of behaviors and the low signal-to-noise ratio (SNR) of the signals. Here, we propose a novel unsupervised learning framework, Neural Latent Aligner (NLA), to find well-co… ▽ More

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: Accepted at ICML 2023

    Journal ref: Proceedings of the 40th International Conference on Machine Learning (2023), PMLR 202:5661-5676

  18. arXiv:2308.05103  [pdf, other]

    eess.IV cs.LG

    Improved Multi-Shot Diffusion-Weighted MRI with Zero-Shot Self-Supervised Learning Reconstruction

    Authors: Jaejin Cho, Yohan Jun, Xiaoqing Wang, Caique Kobayashi, Berkin Bilgic

    Abstract: Diffusion MRI is commonly performed using echo-planar imaging (EPI) due to its rapid acquisition time. However, the resolution of diffusion-weighted images is often limited by magnetic field inhomogeneity-related artifacts and blurring induced by T2- and T2*-relaxation effects. To address these limitations, multi-shot EPI (msEPI) combined with parallel imaging techniques is frequently employed. Ne… ▽ More

    Submitted 22 September, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

    Comments: 10 pages, 4 figures

  19. arXiv:2307.04604  [pdf, other]

    cs.SD cs.LG eess.AS eess.SP

    EchoVest: Real-Time Sound Classification and Depth Perception Expressed through Transcutaneous Electrical Nerve Stimulation

    Authors: Jesse Choe, Siddhant Sood, Ryan Park

    Abstract: Over 1.5 billion people worldwide live with hearing impairment. Despite various technologies that have been created for individuals with such disabilities, most of these technologies are either extremely expensive or inaccessible for everyday use in low-medium income countries. In order to combat this issue, we have developed a new assistive device, EchoVest, for blind/deaf people to intuitively b… ▽ More

    Submitted 10 July, 2023; originally announced July 2023.

  20. arXiv:2307.01410  [pdf]

    eess.IV eess.SP

    Zero-DeepSub: Zero-Shot Deep Subspace Reconstruction for Rapid Multiparametric Quantitative MRI Using 3D-QALAS

    Authors: Yohan Jun, Yamin Arefeen, Jaejin Cho, Shohei Fujita, Xiaoqing Wang, P. Ellen Grant, Borjan Gagoski, Camilo Jaimes, Michael S. Gee, Berkin Bilgic

    Abstract: Purpose: To develop and evaluate methods for 1) reconstructing 3D-quantification using an interleaved Look-Locker acquisition sequence with T2 preparation pulse (3D-QALAS) time-series images using a low-rank subspace method, which enables accurate and rapid T1 and T2 mapping, and 2) improving the fidelity of subspace QALAS by combining scan-specific deep-learning-based reconstruction and subspace… ▽ More

    Submitted 23 January, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 24 figures, 4 tables

  21. arXiv:2306.00838  [pdf, other]

    q-bio.OT eess.IV

    The Brain Tumor Segmentation (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI

    Authors: Ahmed W. Moawad, Anastasia Janas, Ujjwal Baid, Divya Ramakrishnan, Rachit Saluja, Nader Ashraf, Leon Jekel, Raisa Amiruddin, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Sanjay Aneja, Syed Muhammad Anwar, Timothy Bergquist, Evan Calabrese, Veronica Chiang, Verena Chung, Gian Marco Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Ariana Familiar, Keyvan Farahani, Juan Eugenio Iglesias, Zhifan Jiang , et al. (206 additional authors not shown)

    Abstract: The translation of AI-generated brain metastases (BM) segmentation into clinical practice relies heavily on diverse, high-quality annotated medical imaging datasets. The BraTS-METS 2023 challenge has gained momentum for testing and benchmarking algorithms using rigorously annotated internationally compiled real-world datasets. This study presents the results of the segmentation challenge and chara… ▽ More

    Submitted 17 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  22. arXiv:2305.01152  [pdf, other]

    eess.SP cs.IT

    Measurement-based Close-in Path Loss Modeling with Diffraction for Rural Long-distance Communications

    Authors: Jaedon Park, Hong-Bae Jeon, Jungho Cho, Chan-Byoung Chae

    Abstract: In this letter, we investigate rural large-scale path loss models based on the measurements in a central area of South Korea (rural area) in spring. In particular, we develop new close-in (CI) path loss models incorporating a diffraction component. The transmitter used in the measurement system is located on a hill and utilizes omnidirectional antennas operating at 1400 and 2250 MHz frequencies. T… ▽ More

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 5 pages, 5 figures

  23. Perspective Projection-Based 3D CT Reconstruction from Biplanar X-rays

    Authors: Daeun Kyung, Kyungmin Jo, Jaegul Choo, Joonseok Lee, Edward Choi

    Abstract: X-ray computed tomography (CT) is one of the most common imaging techniques used to diagnose various diseases in the medical field. Its high contrast sensitivity and spatial resolution allow the physician to observe details of body parts such as bones, soft tissue, blood vessels, etc. As it involves potentially harmful radiation exposure to patients and surgeons, however, reconstructing 3D CT volu… ▽ More

    Submitted 9 March, 2023; originally announced March 2023.

  24. arXiv:2302.14240  [pdf]

    eess.IV eess.SP

    SSL-QALAS: Self-Supervised Learning for Rapid Multiparameter Estimation in Quantitative MRI Using 3D-QALAS

    Authors: Yohan Jun, Jaejin Cho, Xiaoqing Wang, Michael Gee, P. Ellen Grant, Berkin Bilgic, Borjan Gagoski

    Abstract: Purpose: To develop and evaluate a method for rapid estimation of multiparametric T1, T2, proton density (PD), and inversion efficiency (IE) maps from 3D-quantification using an interleaved Look-Locker acquisition sequence with T2 preparation pulse (3D-QALAS) measurements using self-supervised learning (SSL) without the need for an external dictionary. Methods: A SSL-based QALAS mapping method (SS… ▽ More

    Submitted 23 January, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: 18 figures, 4 tables

  25. arXiv:2302.06774  [pdf, other]

    eess.AS cs.SD

    Speaker-Independent Acoustic-to-Articulatory Speech Inversion

    Authors: Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W Black, Gopala K. Anumanchipalli

    Abstract: To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space. The articulatory space is a promising inversion target, since this space captures the mechanics of speech production. To this end, we build an acoustic-to-articulatory inversion (AAI) model that leverages… ▽ More

    Submitted 24 July, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

  26. arXiv:2212.08122  [pdf, other]

    cs.HC cs.AI cs.RO eess.SY

    Hybrid Paradigm-based Brain-Computer Interface for Robotic Arm Control

    Authors: Byeong-Hoo Lee, Jeong-Hyun Cho, Byung-Hee Kwon

    Abstract: Brain-computer interface (BCI) uses brain signals to communicate with external devices without actual control. Particularly, BCI is one of the interfaces for controlling the robotic arm. In this study, we propose a knowledge distillation-based framework to manipulate robotic arm through hybrid paradigm induced EEG signals for practical use. The teacher model is designed to decode input data hierar… ▽ More

    Submitted 14 December, 2022; originally announced December 2022.

  27. arXiv:2212.00723  [pdf, other]

    eess.SP cs.LG

    Target-centered Subject Transfer Framework for EEG Data Augmentation

    Authors: Kang Yin, Byeong-Hoo Lee, Byoung-Hee Kwon, Jeong-Hyun Cho

    Abstract: Data augmentation approaches are widely explored for the enhancement of decoding electroencephalogram signals. In subject-independent brain-computer interface system, domain adaption and generalization are utilized to shift source subjects' data distribution to match the target subject as an augmentation. However, previous works either introduce noises (e.g., by noise addition or generation with r… ▽ More

    Submitted 23 November, 2022; originally announced December 2022.

  28. arXiv:2212.00687  [pdf]

    eess.IV

    3D-EPI Blip-Up/Down Acquisition (BUDA) with CAIPI and Joint Hankel Structured Low-Rank Reconstruction for Rapid Distortion-Free High-Resolution T2* Mapping

    Authors: Zhifeng Chen, Congyu Liao, Xiaozhi Cao, Benedikt A. Poser, Zhongbiao Xu, Wei-Ching Lo, Manyi Wen, Jaejin Cho, Qiyuan Tian, Yaohui Wang, Yanqiu Feng, Ling Xia, Wufan Chen, Feng Liu, Berkin Bilgic

    Abstract: Purpose: This work aims to develop a novel distortion-free 3D-EPI acquisition and image reconstruction technique for fast and robust, high-resolution, whole-brain imaging as well as quantitative T2* mapping. Methods: 3D-Blip-Up and -Down Acquisition (3D-BUDA) sequence is designed for both single- and multi-echo 3D GRE-EPI imaging using multiple shots with blip-up and -down readouts to encode B0 fi… ▽ More

    Submitted 1 December, 2022; originally announced December 2022.

  29. arXiv:2211.04426  [pdf]

    physics.med-ph eess.IV

    Time-efficient, High Resolution 3T Whole Brain Quantitative Relaxometry using 3D-QALAS with Wave-CAIPI Readouts

    Authors: Jaejin Cho, Borjan Gagoski, Tae Hyung Kim, Fuyixue Wang, Daniel Nico Splitthoff, Wei-Ching Lo, Wei Liu, Daniel Polak, Stephen Cauley, Kawin Setsompop, P. Ellen Grant, Berkin Bilgic

    Abstract: Purpose: Volumetric, high-resolution, quantitative mapping of brain tissue relaxation properties is hindered by long acquisition times and signal-to-noise (SNR) challenges. This study, for the first time, combines the time-efficient wave-CAIPI readouts into the 3D-quantification using an interleaved Look-Locker acquisition sequence with a T2 preparation pulse (3D-QALAS) acquisition scheme, enablin… ▽ More

    Submitted 27 January, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

  30. Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech

    Authors: Cheol Jun Cho, Peter Wu, Abdelrahman Mohamed, Gopala K. Anumanchipalli

    Abstract: Recent self-supervised learning (SSL) models have proven to learn rich representations of speech, which can readily be utilized by diverse downstream tasks. To understand such utilities, various analyses have been done for speech SSL models to reveal which and how information is encoded in the learned representations. Although the scope of previous analyses is extensive in acoustic, phonetic, and… ▽ More

    Submitted 20 July, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  31. Enemy Spotted: in-game gun sound dataset for gunshot classification and localization

    Authors: Junwoo Park, Youngwoo Cho, Gyuhyeon Sim, Hojoon Lee, Jaegul Choo

    Abstract: Recently, deep learning-based methods have drawn huge attention due to their simple yet high performance without domain knowledge in sound classification and localization tasks. However, a lack of gun sounds in existing datasets has been a major obstacle to implementing a support system to spot criminals from their gunshots by leveraging deep learning models. Since the occurrence of gunshot is rar… ▽ More

    Submitted 16 February, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted at IEEE Conference on Games (CoG) 2022

  32. arXiv:2208.05445  [pdf, other]

    eess.AS cs.AI cs.LG

    Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech

    Authors: Jaejin Cho, Jesús Villalba, Laureano Moro-Velazquez, Najim Dehak

    Abstract: In recent studies, self-supervised pre-trained models tend to outperform supervised pre-trained models in transfer learning. In particular, self-supervised learning (SSL) of utterance-level speech representation can be used in speech applications that require discriminative representation of consistent attributes within an utterance: speaker, language, emotion, and age. Existing frame-level self-s… ▽ More

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: EARLY ACCESS of IEEE JSTSP Special Issue on Self-Supervised Learning for Speech and Audio Processing

  33. arXiv:2208.05413  [pdf, other]

    eess.AS cs.LG

    Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations

    Authors: Jaejin Cho, Raghavendra Pappagari, Piotr Żelasko, Laureano Moro-Velazquez, Jesús Villalba, Najim Dehak

    Abstract: Considering the abundance of unlabeled speech data and the high labeling costs, unsupervised learning methods can be essential for better system development. One of the most successful methods is contrastive self-supervised methods, which require negative sampling: sampling alternative samples to contrast with the current sample (anchor). However, it is hard to ensure if all the negative samples b… ▽ More

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted at Interspeech 2022

  34. arXiv:2207.11534  [pdf, other]

    eess.IV cs.AI cs.CV

    Comparative Validation of AI and non-AI Methods in MRI Volumetry to Diagnose Parkinsonian Syndromes

    Authors: Joomee Song, Juyoung Hahm, Jisoo Lee, Chae Yeon Lim, Myung Jin Chung, Jinyoung Youn, Jin Whan Cho, Jong Hyeon Ahn, Kyung-Su Kim

    Abstract: Automated segmentation and volumetry of brain magnetic resonance imaging (MRI) scans are essential for the diagnosis of Parkinson's disease (PD) and Parkinson's plus syndromes (P-plus). To enhance the diagnostic performance, we adopt deep learning (DL) models in brain segmentation and compared their performance with the gold-standard non-DL method. We collected brain MRI scans of healthy controls… ▽ More

    Submitted 23 July, 2022; originally announced July 2022.

    Comments: Joomee Song and Juyoung Hahm contributed equally to this work as co-first authors. Jong Hyeon Ahn and Kyung-Su Kim contributed equally to this work as co-corresponding authors

  35. arXiv:2206.13700  [pdf, other]

    cs.SD cs.LG eess.AS

    Domain Agnostic Few-shot Learning for Speaker Verification

    Authors: Seunghan Yang, Debasmit Das, Janghoon Cho, Hyoungwoo Park, Sungrack Yun

    Abstract: Deep learning models for verification systems often fail to generalize to new users and new environments, even though they learn highly discriminative features. To address this problem, we propose a few-shot domain generalization framework that learns to tackle distribution shift for new users and new domains. Our framework consists of domain-specific and domain-aggregation networks, which are the… ▽ More

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Proceedings of INTERSPEECH 2022

  36. arXiv:2206.08494  [pdf, other]

    cs.AI eess.SP

    Factorization Approach for Sparse Spatio-Temporal Brain-Computer Interface

    Authors: Byeong-Hoo Lee, Jeong-Hyun Cho, Byoung-Hee Kwon, Seong-Whan Lee

    Abstract: Recently, advanced technologies have unlimited potential in solving various problems with a large amount of data. However, these technologies have yet to show competitive performance in brain-computer interfaces (BCIs) which deal with brain signals. Basically, brain signals are difficult to collect in large quantities, in particular, the amount of information would be sparse in spontaneous BCIs. I… ▽ More

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 8 pages

  37. arXiv:2204.09578  [pdf, other]

    eess.SP cs.LG stat.AP

    Restructuring TCAD System: Teaching Traditional TCAD New Tricks

    Authors: Sanghoon Myung, Wonik Jang, Seonghoon Jin, Jae Myung Choe, Changwook Jeong, Dae Sin Kim

    Abstract: Traditional TCAD simulation has succeeded in predicting and optimizing the device performance; however, it still faces a massive challenge - a high computational cost. There have been many attempts to replace TCAD with deep learning, but it has not yet been completely replaced. This paper presents a novel algorithm restructuring the traditional TCAD system. The proposed algorithm predicts three-di… ▽ More

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: In Proceedings of 2021 IEEE International Electron Devices Meeting (IEDM)

    Journal ref: Proc. of IEDM 2021, 18.2.1-18.2.4 (2021)

  38. arXiv:2202.12827  [pdf]

    eess.SP

    On Digital Subcarrier Multiplexing under A Bandwidth Limitation and ASE Noise

    Authors: Junho Cho, Xi Chen, Greg Raybon, Son Thai Le

    Abstract: We show that digital subcarrier multiplexing (DSM) systems require much greater complexity for Nyquist pulse shaping than single-carrier (SC) systems, and it is a misconception that both systems use the same bandwidth when using the same pulse shaping. Through back-to-back (B2B) experiments with realistic transmitter (TX) modules and amplified spontaneous emission (ASE) noise loading, we show that… ▽ More

    Submitted 25 February, 2022; originally announced February 2022.

  39. arXiv:2202.11756  [pdf, other]

    cs.CR cs.LG eess.IV

    ML-based Anomaly Detection in Optical Fiber Monitoring

    Authors: Khouloud Abdelli, Joo Yeon Cho, Carsten Tropschug

    Abstract: Secure and reliable data communication in optical networks is critical for high-speed internet. We propose a data driven approach for the anomaly detection and faults identification in optical networks to diagnose physical attacks such as fiber breaks and optical tapping. The proposed methods include an autoencoder-based anomaly detection and an attention-based bidirectional gated recurrent unit a… ▽ More

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: The AAAI-22 Workshop on Artificial Intelligence for Cyber Security (AICS)

  40. arXiv:2202.02814  [pdf]

    eess.IV cs.LG

    Wave-Encoded Model-based Deep Learning for Highly Accelerated Imaging with Joint Reconstruction

    Authors: Jaejin Cho, Borjan Gagoski, Taehyung Kim, Qiyuan Tian, Stephen Robert Frost, Itthi Chatnuntawech, Berkin Bilgic

    Abstract: Purpose: To propose a wave-encoded model-based deep learning (wave-MoDL) strategy for highly accelerated 3D imaging and joint multi-contrast image reconstruction, and further extend this to enable rapid quantitative imaging using an interleaved look-locker acquisition sequence with T2 preparation pulse (3D-QALAS). Method: Recently introduced MoDL technique successfully incorporates convolutional… ▽ More

    Submitted 6 February, 2022; originally announced February 2022.

    Comments: 8 figures, 1 table

  41. arXiv:2202.02492  [pdf, other]

    cs.IT eess.SP

    Predicting Future CSI Feedback For Highly-Mobile Massive MIMO Systems

    Authors: Yu Zhang, Ahmed Alkhateeb, Pranav Madadi, Jeongho Jeon, Joonyoung Cho, Charlie Zhang

    Abstract: Massive multiple-input multiple-output (MIMO) system is promising in providing unprecedentedly high data rate. To achieve its full potential, the transceiver needs complete channel state information (CSI) to perform transmit/receive precoding/combining. This requirement, however, is challenging in the practical systems due to the unavoidable processing and feedback delays, which oftentimes degrade… ▽ More

    Submitted 5 February, 2022; originally announced February 2022.

  42. Memory-guided Image De-raining Using Time-Lapse Data

    Authors: Jaehoon Cho, Seungryong Kim, Kwanghoon Sohn

    Abstract: This paper addresses the problem of single image de-raining, that is, the task of recovering clean and rain-free background scenes from a single image obscured by a rainy artifact. Although recent advances adopt real-world time-lapse data to overcome the need for paired rain-clean images, they are limited to fully exploit the time-lapse data. The main cause is that, in terms of network architectur…

    Submitted 5 January, 2022; originally announced January 2022.

  43. arXiv:2112.07123  [pdf, other]

    cs.HC eess.SP q-bio.NC

    Recognition of Tactile-related EEG Signals Generated by Self-touch

    Authors: Myoung-Ki Kim, Jeong-Hyun Cho, Hye-Bin Shin

    Abstract: Touch is the first sense among human senses. Not only that, but it is also one of the most important senses that are indispensable. However, compared to sight and hearing, it is often neglected. In particular, since humans use the tactile sense of the skin to recognize and manipulate objects, without tactile sensation, it is very difficult to recognize or skillfully manipulate objects. In addition…

    Submitted 13 December, 2021; originally announced December 2021.

    Comments: Submitted to 2022 10th IEEE International Winter Conference on Brain-Computer Interface

  44. On the Kurtosis of Modulation Formats for Characterizing the Nonlinear Fiber Propagation

    Authors: Junho Cho, Robert Tkach

    Abstract: Knowing only two high-order statistical moments of modulation symbols, often represented by the fourth moment called "kurtosis", the overestimation of nonlinear interference (NLI) in a Gaussian noise (GN) model due to Gaussian signaling assumption can be corrected through an enhanced GN (EGN) model. However, in some modern optical communication systems where the transmitted modulation symbols are…

    Submitted 7 December, 2021; originally announced December 2021.

  45. arXiv:2111.11410  [pdf, other]

    cs.IT eess.SP

    Turbo Autoencoder with a Trainable Interleaver

    Authors: Karl Chahine, Yihan Jiang, Pooja Nuti, Hyeji Kim, Joonyoung Cho

    Abstract: A critical aspect of reliable communication involves the design of codes that allow transmissions to be robustly and computationally efficiently decoded under noisy conditions. Advances in the design of reliable codes have been driven by coding theory and have been sporadic. Recently, it is shown that channel codes that are comparable to modern codes can be learned solely via deep learning. In par…

    Submitted 22 November, 2021; originally announced November 2021.

  46. arXiv:2109.13425  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    The JHU submission to VoxSRC-21: Track 3

    Authors: Jaejin Cho, Jesus Villalba, Najim Dehak

    Abstract: This technical report describes Johns Hopkins University speaker recognition system submitted to Voxceleb Speaker Recognition Challenge 2021 Track 3: Self-supervised speaker verification (closed). Our overall training process is similar to the proposed one from the first place team in the last year's VoxSRC2020 challenge. The main difference is a recently proposed non-contrastive self-supervised m…

    Submitted 27 September, 2021; originally announced September 2021.

  47. arXiv:2109.09683  [pdf]

    eess.SP cs.IT physics.optics

    Single-ended Coherent Receiver

    Authors: Son Thai Le, Vahid Aref, Junho Cho

    Abstract: Commercial coherent receivers utilize balanced photodetectors (PDs) with high single-port rejection ratio (SPRR) to mitigate the signal-signal beat interference (SSBI) due to the square-law detection process. As the symbol rates of coherent transponders are increased to 100 Gbaud and beyond, maintaining a high SPRR in a cost-effective manner becomes more and more challenging. One potential approac…

    Submitted 12 September, 2021; originally announced September 2021.

  48. arXiv:2108.12587  [pdf]

    physics.med-ph eess.IV

    BUDA-SAGE with self-supervised denoising enables fast, distortion-free, high-resolution T2, T2*, para- and dia-magnetic susceptibility mapping

    Authors: Zijing Zhang, Long Wang, Jaejin Cho, Congyu Liao, Hyeong-Geol Shin, Xiaozhi Cao, Jongho Lee, Jinmin Xu, Tao Zhang, Huihui Ye, Kawin Setsompop, Huafeng Liu, Berkin Bilgic

    Abstract: To rapidly obtain high resolution T2, T2* and quantitative susceptibility mapping (QSM) source separation maps with whole-brain coverage and high geometric fidelity. We propose Blip Up-Down Acquisition for Spin And Gradient Echo imaging (BUDA-SAGE), an efficient echo-planar imaging (EPI) sequence for quantitative mapping. The acquisition includes multiple T2*-, T2'- and T2-weighted contrasts. We a…

    Submitted 9 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

  49. arXiv:2106.01918  [pdf]

    eess.IV eess.SP physics.bio-ph

    Highly Accelerated EPI with Wave Encoding and Multi-shot Simultaneous Multi-Slice Imaging

    Authors: Jaejin Cho, Congyu Liao, Qiyuan Tian, Zijing Zhang, Jinmin Xu, Wei-Ching Lo, Benedikt A. Poser, V. Andrew Stenger, Jason Stockmann, Kawin Setsompop, Berkin Bilgic

    Abstract: We introduce wave encoded acquisition and reconstruction techniques for highly accelerated echo planar imaging (EPI) with reduced g-factor penalty and image artifacts. Wave-EPI involves playing sinusoidal gradients during the EPI readout while employing interslice shifts as in blipped-CAIPI acquisitions. This spreads the aliasing in all spatial directions, thereby taking better advantage of 3D coi…

    Submitted 3 June, 2021; originally announced June 2021.

  50. arXiv:2104.01188  [pdf]

    eess.SP cs.LG physics.med-ph

    Scan Specific Artifact Reduction in K-space (SPARK) Neural Networks Synergize with Physics-based Reconstruction to Accelerate MRI

    Authors: Yamin Arefeen, Onur Beker, Jaejin Cho, Heng Yu, Elfar Adalsteinsson, Berkin Bilgic

    Abstract: Purpose: To develop a scan-specific model that estimates and corrects k-space errors made when reconstructing accelerated Magnetic Resonance Imaging (MRI) data. Methods: Scan-Specific Artifact Reduction in k-space (SPARK) trains a convolutional-neural-network to estimate and correct k-space errors made by an input reconstruction technique by back-propagating from the mean-squared-error loss betw…

    Submitted 28 April, 2022; v1 submitted 2 April, 2021; originally announced April 2021.