On analytic exponential functors on free groups

Abstract: This paper concerns exponential contravariant functors on free groups. We obtain an equivalence of categories between analytic, exponential contravariant functors on free groups and conilpotent cocommutative Hopf algebras. This result explains how equivalences of categories obtained previously by Pirashvili and by Powell interact. Moreover, we obtain an equivalence between the categories of outer, exponential contravariant functors on free groups and bicommutative Hopf algebras. We also go further by introducing a subclass of analytic, contravariant functors on free groups, called primitive functors, and prove an equivalence between primitive, exponential contravariant functors and primitive cocommutative Hopf algebras.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.08178 [pdf, other]

Key-point Guided Deformable Image Manipulation Using Diffusion Model

Authors: Seok-Hwan Oh, Guil Jung, Myeong-Gee Kim, Sang-Yun Kim, Young-Min Kim, Hyeon-Jik Lee, Hyuk-Sool Kwon, Hyeon-Min Bae

Abstract: In this paper, we introduce a Key-point-guided Diffusion probabilistic Model (KDM) that gains precise control over images by manipulating the object's key points. We propose a two-stage generative model incorporating an optical flow map as an intermediate output. By doing so, a dense pixel-wise understanding of the semantic relation between the image and the sparse key points is configured, leading to more realistic image generation. Additionally, the integration of optical flow helps regulate the inter-frame variance of sequential images, demonstrating authentic sequential image generation. The KDM is evaluated with diverse key-point conditioned image synthesis tasks, including facial image generation, human pose synthesis, and echocardiography video prediction, demonstrating that the KDM provides consistency-enhanced and photo-realistic images compared with state-of-the-art models.

Submitted 18 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

Comments: 24 pages

arXiv:2401.07476 [pdf, other]

Background study of the AMoRE-pilot experiment

Authors: A. Agrawal, V. V. Alenkov, P. Aryal, J. Beyer, B. Bhandari, R. S. Boiko, K. Boonin, O. Buzanov, C. R. Byeon, N. Chanthima, M. K. Cheoun, J. S. Choe, Seonho Choi, S. Choudhury, J. S. Chung, F. A. Danevich, M. Djamal, D. Drung, C. Enss, A. Fleischmann, A. M. Gangapshev, L. Gastaldo, Yu. M. Gavrilyuk, A. M. Gezhaev, O. Gileva, et al. (83 additional authors not shown)

Abstract: We report a study on the background of the Advanced Molybdenum-Based Rare process Experiment (AMoRE), a search for neutrinoless double beta decay ($0νββ$) of $^{100}$Mo. The pilot stage of the experiment was conducted using $\sim$1.9 kg of CaMoO$_4$ crystals at the Yangyang Underground Laboratory, South Korea, from 2015 to 2018. We compared the measured $β/γ$ energy spectra in three experimental configurations with the results of Monte Carlo simulations and identified the background sources in each configuration. We replaced several detector components and enhanced the neutron shielding to lower the background level between configurations. A limit on the half-life of $0νββ$ decay of $^{100}$Mo was found at $T_{1/2}^{0ν} \ge 3.0\times 10^{23}$ years at 90% confidence level, based on the measured background and its modeling. Further reductions of the background rate in AMoRE-I and AMoRE-II are discussed.

Submitted 7 April, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.06798 [pdf]

Evaluation of Mean Shift, ComBat, and CycleGAN for Harmonizing Brain Connectivity Matrices Across Sites

Authors: Hanliang Xu, Nancy R. Newlin, Michael E. Kim, Chenyu Gao, Praitayini Kanakaraj, Aravind R. Krishnan, Lucas W. Remedios, Nazirah Mohd Khairi, Kimberly Pechman, Derek Archer, Timothy J. Hohman, Angela L. Jefferson, The BIOCARD Study Team, Ivana Isgum, Yuankai Huo, Daniel Moyer, Kurt G. Schilling, Bennett A. Landman

Abstract: Connectivity matrices derived from diffusion MRI (dMRI) provide an interpretable and generalizable way of understanding the human brain connectome. However, dMRI suffers from inter-site and between-scanner variation, which impedes analysis across datasets to improve robustness and reproducibility of results. To evaluate different harmonization approaches on connectivity matrices, we compared graph measures derived from these matrices before and after applying three harmonization techniques: mean shift, ComBat, and CycleGAN. The sample comprises 168 age-matched, sex-matched normal subjects from two studies: the Vanderbilt Memory and Aging Project (VMAP) and the Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD). First, we plotted the graph measures and used coefficient of variation (CoV) and the Mann-Whitney U test to evaluate different methods' effectiveness in removing site effects on the matrices and the derived graph measures. ComBat effectively eliminated site effects for global efficiency and modularity and outperformed the other two methods. However, all methods exhibited poor performance when harmonizing average betweenness centrality. Second, we tested whether our harmonization methods preserved correlations between age and graph measures. All methods except for CycleGAN in one direction improved correlations between age and global efficiency and between age and modularity from insignificant to significant with p-values less than 0.05.

Submitted 24 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: 11 pages, 5 figures, to be published in SPIE Medical Imaging 2024: Image Processing

arXiv:2401.06443 [pdf, other]

BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining

Authors: Minjun Kim, Seungwoo Song, Youhan Lee, Haneol Jang, Kyungtae Lim

Abstract: The current research direction in generative models, such as the recently developed GPT4, aims to find relevant knowledge information for multimodal and multilingual inputs to provide answers. Under these research circumstances, the demand for multilingual evaluation of visual question answering (VQA) tasks, a representative task of multimodal systems, has increased. Accordingly, we propose a bilingual outside-knowledge VQA (BOK-VQA) dataset in this study that can be extended to multilingualism. The proposed data include 17K images, 17K question-answer pairs for both Korean and English and 280K instances of knowledge information related to question-answer content. We also present a framework that can effectively inject knowledge information into a VQA system by pretraining the knowledge information of BOK-VQA data in the form of graph embeddings. Finally, through in-depth analysis, we demonstrated the actual effect of the knowledge information contained in the constructed training data on VQA.

Submitted 15 March, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

arXiv:2401.04313 [pdf]

New twisted van der Waals fabrication method based on strongly adhesive polymer

Authors: Giung Park, Suhan Son, Jongchan Kim, Yunyeong Chang, Kaixuan Zhang, Miyoung Kim, Jieun Lee, Je-Geun Park

Abstract: Observations of emergent quantum phases in twisted bilayer graphene prompted a flurry of activities in van-der-Waals (vdW) materials beyond graphene. Most current twisted experiments use a so-called tear-and-stack method using a polymer called PPC. However, despite the clear advantage of the current PPC tear-and-stack method, there are also technical limitations, mainly a limited number of vdW materials that can be studied using this PPC-based method. This technical bottleneck has been preventing further development of the exciting field beyond a few available vdW samples. To overcome this challenge and facilitate future expansion, we developed a new tear-and-stack method using a strongly adhesive polycaprolactone (PCL). With similar angular accuracy, our technology allows fabrication without a capping layer, facilitating surface analysis and ensuring inherently clean interfaces and low operating temperatures. More importantly, it can be applied to many other vdW materials that have remained inaccessible with the PPC-based method. We present our results on twist homostructures made with a wide choice of vdW materials - from two well-studied vdW materials (graphene and MoS$_2$) to the first-ever demonstrations of other vdW materials (NbSe$_2$, NiPS$_3$, and Fe$_3$GeTe$_2$). Therefore, our new technique will help expand moiré physics beyond a few selected vdW materials and open up more exciting developments.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 26 pages, 4+7 figures, accepted to 2D Materials

arXiv:2401.03707 [pdf, other]

FMA-Net: Flow-Guided Dynamic Filtering and Iterative Feature Refinement with Multi-Attention for Joint Video Super-Resolution and Deblurring

Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

Abstract: We present a joint learning scheme of video super-resolution and deblurring, called VSRDB, to restore clean high-resolution (HR) videos from blurry low-resolution (LR) ones. This joint restoration problem has drawn much less attention compared to single restoration problems. In this paper, we propose a novel flow-guided dynamic filtering (FGDF) and iterative feature refinement with multi-attention (FRMA), which constitutes our VSRDB framework, denoted as FMA-Net. Specifically, our proposed FGDF enables precise estimation of both spatio-temporally-variant degradation and restoration kernels that are aware of motion trajectories through sophisticated motion representation learning. Compared to conventional dynamic filtering, the FGDF enables the FMA-Net to effectively handle large motions in the VSRDB. Additionally, the stacked FRMA blocks trained with our novel temporal anchor (TA) loss, which temporally anchors and sharpens features, refine features in a coarse-to-fine manner through iterative updates. Extensive experiments demonstrate the superiority of the proposed FMA-Net over state-of-the-art methods in terms of both quantitative and qualitative quality. Codes and pre-trained models are available at: https://kaist-viclab.github.io/fmanet-site

Submitted 27 March, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: CVPR2024 (camera-ready version). The last two authors are co-corresponding authors. Please visit our project page at https://kaist-viclab.github.io/fmanet-site

arXiv:2401.03667 [pdf, ps, other]

doi 10.1093/mnras/stae1304

Quantum refraction effects in pulsar radio emission

Authors: Dong-Hoon Kim, Chul Min Kim, Sang Pyo Kim

Abstract: Highly magnetized neutron stars exhibit vacuum non-linear electrodynamics effects, which can be well described using the one-loop effective action for quantum electrodynamics. In this context, we study the propagation and polarization of pulsar radio emission, based on the post-Maxwellian Lagrangian from the Heisenberg-Euler-Schwinger action. Given the refractive index obtained from this Lagrangian, we determine the leading-order corrections to both the propagation and polarization vectors due to quantum refraction via perturbation analysis. In addition, the effects on the orthogonality between the propagation and polarization vectors and on the Faraday rotation angle, all due to quantum refraction, are investigated. Furthermore, from the dual refractive index and the associated polarization modes, we discuss quantum birefringence, with the optical phenomenology analogous to its classical counterpart.

Submitted 11 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: 14 pages, 5 figures

Journal ref: Published in Monthly Notices of the Royal Astronomical Society, 531, 2148 on 20 May 2024

arXiv:2401.03567 [pdf, other]

Hyperbolic Distance-Based Speech Separation

Authors: Darius Petermann, Minje Kim

Abstract: In this work, we explore the task of hierarchical distance-based speech separation defined on a hyperbolic manifold. Based on the recent advent of audio-related tasks performed in non-Euclidean spaces, we propose to make use of the Poincaré ball to effectively unveil the inherent hierarchical structure found in complex speaker mixtures. We design two sets of experiments in which the distance-based parent sound classes, namely "near" and "far", can contain up to two or three speakers (i.e., children) each. We show that our hyperbolic approach is suitable for unveiling hierarchical structure from the problem definition, resulting in improved child-level separation. We further show that a clear correlation emerges between the notion of hyperbolic certainty (i.e., the distance to the ball's origin) and acoustic semantics such as speaker density, inter-source location, and microphone-to-speaker distance.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.03549 [pdf, other]

The JCMT Transient Survey: Six-Year Summary of 450/850 $μ$m Protostellar Variability and Calibration Pipeline Version 2.0

Authors: Steve Mairs, Seonjae Lee, Doug Johnstone, Colton Broughton, Jeong-Eun Lee, Gregory J. Herczeg, Graham S. Bell, Zhiwei Chen, Carlos Contreras-Peña, Logan Francis, Jennifer Hatchell, Mi-Ryang Kim, Sheng-Yuan Liu, Geumsook Park, Keping Qiu, Yao-Te Wang, Xu Zhang, The JCMT Transient Team

Abstract: The JCMT Transient Survey has been monitoring eight Gould Belt low-mass star-forming regions since December 2015 and six somewhat more distant intermediate-mass star-forming regions since February 2020 with SCUBA-2 on the JCMT at 450 $μ$m and 850 $μ$m and with an approximately monthly cadence. We introduce our Pipeline v2 relative calibration procedures for image alignment and flux calibration across epochs, improving on our previous Pipeline v1 by decreasing measurement uncertainties and providing additional robustness. These new techniques work at both 850 $μ$m and 450 $μ$m, where v1 only allowed investigation of the 850 $μ$m data. Pipeline v2 achieves better than $0.5^{\prime\prime}$ relative image alignment, less than a tenth of the submillimeter beam widths. The v2 relative flux calibration is found to be 1% at 850 $μ$m and $<5$% at 450 $μ$m. The improvement in the calibration is demonstrated by comparing the two pipelines over the first four years of the survey and recovering additional robust variables with v2. Using the full six years of the Gould Belt survey, the number of robust variables increases by 50%, and at 450 $μ$m we identify four robust variables, all of which are also robust at 850 $μ$m. The multi-wavelength light curves for these sources are investigated and found to be consistent with the variability being due to dust heating within the envelope in response to accretion luminosity changes from the central source.

Submitted 7 January, 2024; originally announced January 2024.

Comments: Accepted for publication in The Astrophysical Journal. DOI link to data will become public after the proof stage is complete

arXiv:2401.03060 [pdf]

Super-resolution multi-contrast unbiased eye atlases with deep probabilistic refinement

Authors: Ho Hin Lee, Adam M. Saunders, Michael E. Kim, Samuel W. Remedios, Lucas W. Remedios, Yucheng Tang, Qi Yang, Xin Yu, Shunxing Bao, Chloe Cho, Louise A. Mawn, Tonia S. Rex, Kevin L. Schey, Blake E. Dewey, Jeffrey M. Spraggins, Jerry L. Prince, Yuankai Huo, Bennett A. Landman

Abstract: Purpose: Eye morphology varies significantly across the population, especially for the orbit and optic nerve. These variations limit the feasibility and robustness of generalizing population-wise features of eye organs to an unbiased spatial reference. Approach: To tackle these limitations, we propose a process for creating high-resolution unbiased eye atlases. First, to restore spatial details from scans with a low through-plane resolution compared to a high in-plane resolution, we apply a deep learning-based super-resolution algorithm. Then, we generate an initial unbiased reference with an iterative metric-based registration using a small portion of subject scans. We register the remaining scans to this template and refine the template using an unsupervised deep probabilistic approach that generates a more expansive deformation field to enhance the organ boundary alignment. We demonstrate this framework using magnetic resonance images across four different tissue contrasts, generating four atlases in separate spatial alignments. Results: For each tissue contrast, we find a significant improvement using the Wilcoxon signed-rank test in the average Dice score across four labeled regions compared to a standard registration framework consisting of rigid, affine, and deformable transformations. These results highlight the effective alignment of eye organs and boundaries using our proposed process. Conclusions: By combining super-resolution preprocessing and deep probabilistic models, we address the challenge of generating an eye atlas to serve as a standardized reference across a largely variable population.

Submitted 14 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: Revised for submission to SPIE Journal of Medical Imaging. 26 pages, 6 figures

arXiv:2401.01498 [pdf, other]

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

Authors: Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Semin Kim, Joun Yeop Lee, Nam Soo Kim

Abstract: We propose a novel text-to-speech (TTS) framework centered around a neural transducer. Our approach divides the whole TTS pipeline into semantic-level sequence-to-sequence (seq2seq) modeling and fine-grained acoustic modeling stages, utilizing discrete semantic tokens obtained from wav2vec2.0 embeddings. For robust and efficient alignment modeling, we employ a neural transducer named token transducer for the semantic token prediction, benefiting from its hard monotonic alignment constraints. Subsequently, a non-autoregressive (NAR) speech generator efficiently synthesizes waveforms from these semantic tokens. Additionally, a reference speech controls temporal dynamics and acoustic conditions at each stage. This decoupled framework reduces the training complexity of TTS while allowing each stage to focus on semantic and acoustic modeling. Our experimental results on zero-shot adaptive TTS demonstrate that our model surpasses the baseline in terms of speech quality and speaker similarity, both objectively and subjectively. We also delve into the inference speed and prosody control capabilities of our approach, highlighting the potential of neural transducers in TTS frameworks.

Submitted 2 January, 2024; originally announced January 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2401.01228 [pdf, ps, other]

doi 10.1103/PhysRevA.107.022423

Non-Gaussian entanglement criteria for atomic homodyne detection

Authors: Jaehak Lee, Jiyong Park, Jaewan Kim, M. S. Kim, Hyunchul Nha

Abstract: Homodyne measurement is a crucial tool widely used to address continuous variables for bosonic quantum systems. While an ideal homodyne detection provides a powerful analysis, e.g. to effectively measure quadrature amplitudes of light in quantum optics, it relies on the use of a strong reference field, the so-called local oscillator, typically in a coherent state. Such a strong coherent local oscillator may not be readily available, particularly for a massive quantum system like a Bose-Einstein condensate (BEC), posing a substantial challenge in dealing with continuous variables appropriately. It is necessary to establish a practical framework that includes the effects of non-ideal local oscillators for a rigorous assessment of various quantum tests and applications. We here develop entanglement criteria beyond the Gaussian regime, applicable to this realistic homodyne measurement, that do not require assumptions on the state of local oscillators. We discuss the working conditions of homodyne detection to effectively detect non-Gaussian quantum entanglement under various states of local oscillators.

Submitted 2 January, 2024; originally announced January 2024.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. A 107, 022423 (2023)

arXiv:2401.01099 [pdf, other]

Efficient Parallel Audio Generation using Group Masked Language Modeling

Authors: Myeonghun Jeong, Minchan Kim, Joun Yeop Lee, Nam Soo Kim

Abstract: We present a fast and high-quality codec language model for parallel audio generation. While SoundStorm, a state-of-the-art parallel audio generation model, accelerates inference speed compared to autoregressive models, it still suffers from slow inference due to iterative sampling. To resolve this problem, we propose Group-Masked Language Modeling (G-MLM) and Group Iterative Parallel Decoding (G-IPD) for efficient parallel audio generation. Both the training and sampling schemes enable the model to synthesize high-quality audio with a small number of iterations by effectively modeling the group-wise conditional dependencies. In addition, our model employs a cross-attention-based architecture to capture the speaker style of the prompt voice and improves computational efficiency. Experimental results demonstrate that our proposed model outperforms the baselines in prompt-based audio generation.

Submitted 2 January, 2024; originally announced January 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2401.00252 [pdf, ps, other]

Cluster algebras and monotone Lagrangian tori

Authors: Yunhyung Cho, Myungho Kim, Yoosik Kim, Euiyong Park

Abstract: Motivated by recent developments in the construction of Newton--Okounkov bodies and toric degenerations via cluster algebras in [GHKK18, FO20], we consider a family of Newton--Okounkov polytopes of a complex smooth projective variety $X$ related by a composition of tropicalized cluster mutations. According to the work of [HK15], the toric degeneration associated with each Newton--Okounkov polytope $Δ$ in the family produces a Lagrangian torus fibration of $X$ over $Δ$. We investigate circumstances in which each Lagrangian torus fibration possesses a monotone Lagrangian torus fiber. We provide a sufficient condition, based on the data of tropical integer points and exchange matrices, for the family of constructed monotone Lagrangian tori to contain infinitely many monotone Lagrangian tori, no two of which are related by any symplectomorphisms. By employing this criterion and exploiting the correspondence between the tropical integer points and the dual canonical basis elements, we generate infinitely many distinct monotone Lagrangian tori on flag manifolds of arbitrary type except in a few cases.

Submitted 30 December, 2023; originally announced January 2024.

Comments: 43 pages

arXiv:2312.17138 [pdf, ps, other]

Entanglement entropies in the abelian arithmetic Chern-Simons theory

Authors: Hee-Joong Chung, Dohyeong Kim, Minhyong Kim, Jeehoon Park, Hwajong Yoo

Abstract: The notion of entanglement entropy in quantum mechanical systems is an important quantity, which measures how much a physical state is entangled in a composite system. Mathematically, it measures how much the state vector is not decomposable as elements in the tensor product of two Hilbert spaces. In this paper, we seek its arithmetic avatar: the arithmetic Chern-Simons theory with finite gauge group $G$ naturally associates a state vector inside the product of two quantum Hilbert spaces, and we provide a formula for the von Neumann entanglement entropy of such a state vector when $G$ is a cyclic group of prime order.

Submitted 28 December, 2023; originally announced December 2023.

Comments: 13 pages

MSC Class: 11R34; 81P40; 81T99

arXiv:2312.16411 [pdf, ps, other]

Wolff potential estimates and Wiener criterion for nonlocal equations with nonstandard growth

Authors: Minhyun Kim, Ki-Ahm Lee, Se-Chan Lee

Abstract: We prove the Wolff potential estimates for nonlocal equations with nonstandard growth. As an application, we obtain the Wiener criterion in this framework, which provides a necessary and sufficient condition for boundary points to be regular. Our approach relies on the fine analysis of superharmonic functions in view of nonlocal nonlinear potential theory.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 38 pages

MSC Class: 31C45; 31B25; 31B15; 35R11

arXiv:2312.15283 [pdf, other]

Impact of discontinuous grain size distributions on the spectral energy distribution of debris disks

Authors: Minjae Kim, Sebastian Wolf

Abstract: The collisional evolution of debris disks is expected to result in a characteristic wavy pattern of the grain size distributions, i.e., an under/overabundance of particles of specific sizes. This perturbed grain size distribution potentially leaves characteristic patterns in the spectral energy distribution (SED) of the disk system. We aim to quantify and understand the specific influence of discontinuous particle size distributions on the appearance of debris disks. For this purpose, we consider dust emission models based on two different grain size distributions, i.e., once with a single and once with a broken power law. We compare the spectral index $α$ ($F_ν\,\propto ν^{\rm{\,α}}$) in the case of a continuous grain size distribution with that of a discontinuous grain size distribution. We perform this comparison for central stars with different spectral types and two different disk structures (e.g., slim and broad debris dust rings). Within the considered parameter space, we find a characteristic difference between the spectral slopes of the SED in the different scenarios. More specifically, the overabundance of small grains leads to a steeper slope in the far-infrared/submillimeter regime, while the spectral index in the mm regime is hardly affected. On the other hand, the underabundance of medium-sized grains results in a slight steepening of the far-infrared slope of SED, while its primary effect is on the mm slope of SED, causing it to become shallower. We also find that the impact of an overabundance of small dust particles is more pronounced than that of an underabundance of medium-sized dust particles. Furthermore, we find that the difference between the spectral indices for the two different grain size distributions is largest for debris disks around brighter central stars and broader disks.

Submitted 23 December, 2023; originally announced December 2023.

Comments: 14 pages, 10 figures
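
The spectral index $α$ in the abstract above is defined through $F_ν \propto ν^{α}$, so it can be estimated from flux densities at two frequencies. The sketch below shows only that two-point estimate; the function name `spectral_index` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def spectral_index(nu1, flux1, nu2, flux2):
    """Two-point estimate of alpha in F_nu ∝ nu^alpha.

    Frequencies and fluxes may be in any units, as long as each pair
    shares the same unit, because only ratios enter the formula.
    """
    return math.log(flux2 / flux1) / math.log(nu2 / nu1)


# Hypothetical fluxes drawn from an exact power law with alpha = 2.5.
nu_a, nu_b = 230.0, 345.0
alpha = spectral_index(nu_a, nu_a ** 2.5, nu_b, nu_b ** 2.5)
print(alpha)
```

With this sign convention, a larger $α$ means the flux rises more quickly with frequency between the two sampled points.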

arXiv:2312.13822 [pdf, other]

Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection

Authors: Kwangrok Ryoo, Yeonsik Jo, Seungjun Lee, Mira Kim, Ahra Jo, Seung Hwan Kim, Seungryong Kim, Soonyoung Lee

Abstract: For object detection task with noisy labels, it is important to consider not only categorization noise, as in image classification, but also localization noise, missing annotations, and bogus bounding boxes. However, previous studies have only addressed certain types of noise (e.g., localization or categorization). In this paper, we propose Universal-Noise Annotation (UNA), a more practical setting that encompasses all types of noise that can occur in object detection, and analyze how UNA affects the performance of the detector. We analyzed the development direction of previous works of detection algorithms and examined the factors that impact the robustness of detection model learning method. We open-source the code for injecting UNA into the dataset and all the training log and weight are also shared.

Submitted 21 December, 2023; originally announced December 2023.

Comments: appendix and code : https://github.com/Ryoo72/UNA

arXiv:2312.13615 [pdf, other]

Self-supervised Complex Network for Machine Sound Anomaly Detection

Authors: Miseul Kim, Minh Tri Ho, Hong-Goo Kang

Abstract: In this paper, we propose an anomaly detection algorithm for machine sounds with a deep complex network trained by self-supervision. Using the fact that phase continuity information is crucial for detecting abnormalities in time-series signals, our proposed algorithm utilizes the complex spectrum as an input and performs complex number arithmetic throughout the entire process. Since the usefulness of phase information can vary depending on the type of machine sound, we also apply an attention mechanism to control the weights of the complex and magnitude spectrum bottleneck features depending on the machine type. We train our network to perform a self-supervised task that classifies the machine identifier (id) of normal input sounds among multiple classes. At test time, an input signal is detected as anomalous if the trained model is unable to correctly classify the id. In other words, we determine the presence of an anomaly when the output cross-entropy score of the multiclass identification task is lower than a pre-defined threshold. Experiments with the MIMII dataset show that the proposed algorithm has a much higher area under the curve (AUC) score than conventional magnitude spectrum-based algorithms.

Submitted 21 December, 2023; originally announced December 2023.

Comments: Published in EUSIPCO 2021

arXiv:2312.13603 [pdf, other]

Style Modeling for Multi-Speaker Articulation-to-Speech

Authors: Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang

Abstract: In this paper, we propose a neural articulation-to-speech (ATS) framework that synthesizes high-quality speech from articulatory signal in a multi-speaker situation. Most conventional ATS approaches only focus on modeling contextual information of speech from a single speaker's articulatory features. To explicitly represent each speaker's speaking style as well as the contextual information, our proposed model estimates style embeddings, guided from the essential speech style attributes such as pitch and energy. We adopt convolutional layers and transformer-based attention layers for our model to fully utilize both local and global information of articulatory signals, measured by electromagnetic articulography (EMA). Our model significantly improves the quality of synthesized speech compared to the baseline in terms of objective and subjective measurements in the Haskins dataset.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 5 pages, Accepted to ICASSP 2023

arXiv:2312.13600 [pdf, other]

BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0

Authors: Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang

Abstract: Decoding spoken speech from neural activity in the brain is a fast-emerging research topic, as it could enable communication for people who have difficulties with producing audible speech. For this task, electrocorticography (ECoG) is a common method for recording brain activity with high temporal resolution and high spatial precision. However, due to the risky surgical procedure required for obtaining ECoG recordings, relatively little of this data has been collected, and the amount is insufficient to train a neural network-based Brain-to-Speech (BTS) system. To address this problem, we propose BrainTalker, a novel BTS framework that generates intelligible spoken speech from ECoG signals under extremely low-resource scenarios. We apply a transfer learning approach utilizing a pre-trained self-supervised model, Wav2Vec 2.0. Specifically, we train an encoder module to map ECoG signals to latent embeddings that match Wav2Vec 2.0 representations of the corresponding spoken speech. These embeddings are then transformed into mel-spectrograms using stacked convolutional and transformer-based layers, which are fed into a neural vocoder to synthesize speech waveform. Experimental results demonstrate our proposed framework achieves outstanding performance in terms of subjective and objective metrics, including a Pearson correlation coefficient of 0.9 between generated and ground truth mel spectrograms. We share publicly available Demos and Code.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 5 pages. Accepted to BHI 2023

arXiv:2312.13528 [pdf, other]

DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video

Authors: Minh-Quan Viet Bui, Jongmin Park, Jihyong Oh, Munchurl Kim

Abstract: Neural Radiance Fields (NeRF), initially developed for static scenes, have inspired many video novel view synthesis techniques. However, the challenge for video view synthesis arises from motion blur, a consequence of object or camera movement during exposure, which hinders the precise synthesis of sharp spatio-temporal views. In response, we propose a novel dynamic deblurring NeRF framework for blurry monocular video, called DyBluRF, consisting of a Base Ray Initialization (BRI) stage and a Motion Decomposition-based Deblurring (MDD) stage. Our DyBluRF is the first that handles the novel view synthesis for blurry monocular video with a novel two-stage framework. In the BRI stage, we coarsely reconstruct dynamic 3D scenes and jointly initialize the base ray, which is further used to predict latent sharp rays, using the inaccurate camera pose information from the given blurry frames. In the MDD stage, we introduce a novel Incremental Latent Sharp-rays Prediction (ILSP) approach for the blurry monocular video frames by decomposing the latent sharp rays into global camera motion and local object motion components. We further propose two loss functions for effective geometry regularization and decomposition of static and dynamic scene components without any mask supervision. Experiments show that DyBluRF outperforms qualitatively and quantitatively the SOTA methods.

Submitted 29 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: The first two authors contributed equally to this work (equal contribution). The last two authors advised equally to this work. Please visit our project page at https://kaist-viclab.github.io/dyblurf-site/

arXiv:2312.12638 [pdf, other]

Using Exact Tests from Algebraic Statistics in Sparse Multi-way Analyses: An Application to Analyzing Differential Item Functioning

Authors: Shishir Agrawal, Luis David Garcia Puente, Minho Kim, Flavia Sancier-Barbosa

Abstract: Asymptotic goodness-of-fit methods in contingency table analysis can struggle with sparse data, especially in multi-way tables where it can be infeasible to meet sample size requirements for a robust application of distributional assumptions. However, algebraic statistics provides exact alternatives to these classical asymptotic methods that remain viable even with sparse data. We apply these methods to a context in psychometrics and education research that leads naturally to multi-way contingency tables: the analysis of differential item functioning (DIF). We explain concretely how to apply the exact methods of algebraic statistics to DIF analysis using the R package algstat, and we compare their performance to that of classical asymptotic methods.

Submitted 26 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: 25 pages, tex tweaks; comments welcome

arXiv:2312.12391 [pdf, other]

vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training

Authors: Jehyeon Bang, Yujeong Choi, Myeongwoo Kim, Yongdeok Kim, Minsoo Rhu

Abstract: As large language models (LLMs) become widespread in various application domains, a critical challenge the AI community is facing is how to train these large AI models in a cost-effective manner. Existing LLM training plans typically employ a heuristic-based parallel training strategy which is based on empirical observations rather than grounded upon a thorough examination of the search space of LLM parallelization. This limitation causes existing systems to leave significant performance on the table, wasting millions of dollars worth of training cost. This paper presents our profiling-driven simulator called vTrain, providing AI practitioners a fast yet accurate software framework to determine an efficient and cost-effective LLM training system configuration. We demonstrate vTrain's practicality through several case studies, e.g., effectively evaluating optimal training parallelization strategies that balance training time and its associated training cost, efficient multi-tenant GPU cluster schedulers targeting multiple LLM training jobs, and determining a compute-optimal LLM model architecture given a fixed compute budget.

Submitted 27 November, 2023; originally announced December 2023.

arXiv:2312.11520 [pdf]

Implementing biosensing based user preference visualisation in architectural spaces

Authors: Mi Kyoung Kim

Abstract: This study delves into the interplay between architectural spaces and human emotions, leveraging the emergent field of neuroarchitecture. It examines the functional and aesthetic influence of architectural design on individual users, with a focus on biosensing data such as brainwave and eye tracking information to understand user preferences.

Submitted 12 December, 2023; originally announced December 2023.

Comments: 20 pages

arXiv:2312.11519 [pdf]

Analysing user sentiment data for architectural interior spaces

Authors: Mi Kyoung Kim

Abstract: This study aims to develop a data-driven system to enhance the analysis and improvement of user experiences in interior spaces, acknowledging the significant impact of design on individuals' health, productivity, and quality of life.

Submitted 12 December, 2023; originally announced December 2023.

Comments: 13 pages

arXiv:2312.10519 [pdf, other]

Interpretable Online Network Dictionary Learning for Inferring Long-Range Chromatin Interactions

Authors: Vishal Rana, Jianhao Peng, Chao Pan, Hanbaek Lyu, Albert Cheng, Minji Kim, Olgica Milenkovic

Abstract: Dictionary learning (DL) is commonly used in computational biology to tackle ubiquitous clustering problems due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability and are not optimized for large-scale graph-structured data. We propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL) that can handle extremely large datasets and enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to network-structured data via specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data to identify important long-range interaction patterns. ChIA-Drop probes higher-order interactions, and produces hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis creates an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Using dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. Our results offer two key insights. First, we demonstrate that online cvxNDL retains the accuracy of classical DL methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology enrichment analysis and perform RNA coexpression studies.

Submitted 16 December, 2023; originally announced December 2023.

arXiv:2312.10118 [pdf, other]

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Authors: Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon, Munchurl Kim

Abstract: Self-supervised monocular depth estimation (DE) is an approach to learning depth without costly depth ground truths. However, it often struggles with moving objects that violate the static scene assumption during training. To address this issue, we introduce a coarse-to-fine training strategy leveraging the ground contacting prior based on the observation that most moving objects in outdoor scenes contact the ground. In the coarse training stage, we exclude the objects in dynamic classes from the reprojection loss calculation to avoid inaccurate depth learning. To provide precise supervision on the depth of the objects, we present a novel Ground-contacting-prior Disparity Smoothness Loss (GDS-Loss) that encourages a DE network to align the depth of the objects with their ground-contacting points. Subsequently, in the fine training stage, we refine the DE network to learn the detailed depth of the objects from the reprojection loss, while ensuring accurate DE on the moving object regions by employing our regularization loss with a cost-volume-based weighting factor. Our overall coarse-to-fine training strategy can easily be integrated with existing DE methods without any modifications, significantly enhancing DE performance on challenging Cityscapes and KITTI datasets, especially in the moving object regions.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.09827 [pdf, other]

doi 10.1103/PhysRevC.109.054910

Identified charged-hadron production in $p$$+$Al, $^3$He$+$Au, and Cu$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV and in U$+$U collisions at $\sqrt{s_{_{NN}}}=193$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, J. Alexander, M. Alfred, V. Andrieux, K. Aoki, N. Apadula, H. Asano, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, X. Bai, N. S. Bandara, B. Bannier, K. N. Barish, S. Bathe, V. Baublis, et al. (456 additional authors not shown)

Abstract: The PHENIX experiment has performed a systematic study of identified charged-hadron ($π^\pm$, $K^\pm$, $p$, $\bar{p}$) production at midrapidity in $p$$+$Al, $^3$He$+$Au, Cu$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV and U$+$U collisions at $\sqrt{s_{_{NN}}}=193$ GeV. Identified charged-hadron invariant transverse-momentum ($p_T$) and transverse-mass ($m_T$) spectra are presented and interpreted in terms of radially expanding thermalized systems. The particle ratios of $K/π$ and $p/π$ have been measured in different centrality ranges of large (Cu$+$Au, U$+$U) and small ($p$$+$Al, $^3$He$+$Au) collision systems. The values of $K/π$ ratios measured in all considered collision systems were found to be consistent with those measured in $p$$+$$p$ collisions. However the values of $p/π$ ratios measured in large collision systems reach the values of $\approx0.6$, which is $\approx2$ times larger than in $p$$+$$p$ collisions. These results can be qualitatively understood in terms of the baryon enhancement expected from hadronization by recombination. Identified charged-hadron nuclear-modification factors ($R_{AB}$) are also presented. Enhancement of proton $R_{AB}$ values over meson $R_{AB}$ values was observed in central $^3$He$+$Au, Cu$+$Au, and U$+$U collisions. The proton $R_{AB}$ values measured in $p$$+$Al collision system were found to be consistent with $R_{AB}$ values of $φ$, $π^\pm$, $K^\pm$, and $π^0$ mesons, which may indicate that the size of the system produced in $p$$+$Al collisions is too small for recombination to cause a noticeable increase in proton production.

Submitted 22 May, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 480 authors from 78 institutions, 18 pages, 6 tables, 16 figures. v2 is version accepted for publication in Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

Journal ref: Phys. Rev. C 109, 054910 (2024)

arXiv:2312.09691 [pdf, other]

Quilt: Robust Data Segment Selection against Concept Drifts

Authors: Minsu Kim, Seong-Hyeon Hwang, Steven Euijong Whang

Abstract: Continuous machine learning pipelines are common in industrial settings where models are periodically trained on data streams. Unfortunately, concept drifts may occur in data streams where the joint distribution of the data X and label y, P(X, y), changes over time and possibly degrade model accuracy. Existing concept drift adaptation approaches mostly focus on updating the model to the new data possibly using ensemble techniques of previous models and tend to discard the drifted historical data. However, we contend that explicitly utilizing the drifted data together leads to much better model accuracy and propose Quilt, a data-centric framework for identifying and selecting data segments that maximize model accuracy. To address the potential downside of efficiency, Quilt extends existing data subset selection techniques, which can be used to reduce the training data without compromising model accuracy. These techniques cannot be used as is because they only assume virtual drifts where the posterior probabilities P(y|X) are assumed not to change. In contrast, a key challenge in our setup is to also discard undesirable data segments with concept drifts. Quilt thus discards drifted data segments and selects data segment subsets holistically for accurate and efficient model training. The two operations use gradient-based scores, which have little computation overhead. In our experiments, we show that Quilt outperforms state-of-the-art drift adaptation and data selection baselines on synthetic and real datasets.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted to AAAI 2024

arXiv:2312.09634 [pdf, other]

Vectorizing string entries for data processing on tables: when are larger language models better?

Authors: Léo Grinsztajn, Edouard Oyallon, Myung Jun Kim, Gaël Varoquaux

Abstract: There are increasingly efficient data processing pipelines that work on vectors of numbers, for instance most machine learning models, or vector databases for fast similarity search. These require converting the data to numbers. While this conversion is easy for simple numerical and categorical entries, databases are rife with text entries, such as names or descriptions. In the age of large language models, what are the best strategies to vectorize table entries, bearing in mind that larger models entail more operational complexity? We study the benefits of language models in 14 analytical tasks on tables while varying the training size, as well as for a fuzzy join benchmark. We introduce a simple characterization of a column that reveals two settings: 1) a dirty categories setting, where strings share many similarities across entries, and conversely 2) a diverse entries setting. For dirty categories, pretrained language models bring little-to-no benefit compared to simpler string models. For diverse entries, we show that larger language models improve data processing. For these we investigate the complexity-performance tradeoffs and show that they reflect those of classic text embedding: larger models tend to perform better, but it is useful to fine-tune them for embedding purposes.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2312.09572 [pdf, other]

doi 10.1109/ACCESS.2023.3344177

IR-UWB Radar-Based Contactless Silent Speech Recognition of Vowels, Consonants, Words, and Phrases

Authors: Sunghwa Lee, Younghoon Shin, Myungjong Kim, Jiwon Seo

Abstract: Several sensing techniques have been proposed for silent speech recognition (SSR); however, many of these methods require invasive processes or sensor attachment to the skin using adhesive tape or glue, rendering them unsuitable for frequent use in daily life. By contrast, impulse radio ultra-wideband (IR-UWB) radar can operate without physical contact with users' articulators and related body parts, offering several advantages for SSR. These advantages include high range resolution, high penetrability, low power consumption, robustness to external light or sound interference, and the ability to be embedded in space-constrained handheld devices. This study demonstrated IR-UWB radar-based contactless SSR using four types of speech stimuli (vowels, consonants, words, and phrases). To achieve this, a novel speech feature extraction algorithm specifically designed for IR-UWB radar-based SSR is proposed. Each speech stimulus is recognized by applying a classification algorithm to the extracted speech features. Two different algorithms, multidimensional dynamic time warping (MD-DTW) and deep neural network-hidden Markov model (DNN-HMM), were compared for the classification task. Additionally, a favorable radar antenna position, either in front of the user's lips or below the user's chin, was determined to achieve higher recognition accuracy. Experimental results demonstrated the efficacy of the proposed speech feature extraction algorithm combined with DNN-HMM for classifying vowels, consonants, words, and phrases. Notably, this study represents the first demonstration of phoneme-level SSR using contactless radar.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Submitted to IEEE Access

arXiv:2312.08703 [pdf, other]

doi 10.1103/PhysRevResearch.6.023241

A Rydberg-atom approach to the integer factorization problem

Authors: Juyoung Park, Seokho Jeong, Minhyuk Kim, Kangheun Kim, Andrew Byun, Louis Vignoli, Louis-Paul Henry, Loïc Henriet, Jaewook Ahn

Abstract: The task of factoring integers poses a significant challenge in modern cryptography, and quantum computing holds the potential to efficiently address this problem compared to classical algorithms. Thus, it is crucial to develop quantum computing algorithms to address this problem. This study introduces a quantum approach that utilizes Rydberg atoms to tackle the factorization problem. Experimental demonstrations are conducted for the factorization of small composite numbers such as $6 = 2 \times 3$, $15 = 3 \times 5$, and $35 = 5 \times 7$. This approach involves employing Rydberg-atom graphs to algorithmically program binary multiplication tables, yielding many-body ground states that represent superpositions of factoring solutions. Subsequently, these states are probed using quantum adiabatic computing. Limitations of this method are discussed, specifically addressing the scalability of current Rydberg quantum computing for the intricate computational problem.

Submitted 31 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 12 pages, 5 figures

arXiv:2312.08136 [pdf, other]

ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields

Authors: Juan Luis Gonzalez Bello, Minh-Quan Viet Bui, Munchurl Kim

Abstract: Recent advances in neural rendering have shown that, albeit slow, implicit compact models can learn a scene's geometries and view-dependent appearances from multiple views. To maintain such a small memory footprint but achieve faster inference times, recent works have adopted `sampler' networks that adaptively sample a small subset of points along each ray in the implicit neural radiance fields. Although these methods achieve up to a 10$\times$ reduction in rendering time, they still suffer from considerable quality degradation compared to the vanilla NeRF. In contrast, we propose ProNeRF, which provides an optimal trade-off between memory footprint (similar to NeRF), speed (faster than HyperReel), and quality (better than K-Planes). ProNeRF is equipped with a novel projection-aware sampling (PAS) network together with a new training strategy for ray exploration and exploitation, allowing for efficient fine-grained particle sampling. Our ProNeRF yields state-of-the-art metrics, being 15-23x faster with 0.65dB higher PSNR than NeRF and yielding 0.95dB higher PSNR than the best published sampler-based method, HyperReel. Our exploration and exploitation training strategy allows ProNeRF to learn the full scenes' color and density distributions while also learning efficient ray sampling focused on the highest-density regions. We provide extensive experimental results that support the effectiveness of our method on the widely adopted forward-facing and 360 datasets, LLFF and Blender, respectively.

Submitted 13 December, 2023; originally announced December 2023.

Comments: Visit our project website at https://kaist-viclab.github.io/pronerf-site/

arXiv:2312.08071 [pdf, other]

Novel View Synthesis with View-Dependent Effects from a Single Image

Authors: Juan Luis Gonzalez Bello, Munchurl Kim

Abstract: In this paper, we firstly consider view-dependent effects into single image-based novel view synthesis (NVS) problems. For this, we propose to exploit the camera motion priors in NVS to model view-dependent appearance or effects (VDE) as the negative disparity in the scene. By recognizing specularities "follow" the camera motion, we infuse VDEs into the input images by aggregating input pixel colors along the negative depth region of the epipolar lines. Also, we propose a `relaxed volumetric rendering' approximation that allows computing the densities in a single pass, improving efficiency for NVS from single images. Our method can learn single-image NVS from image sequences only, which is a completely self-supervised learning method, for the first time requiring neither depth nor camera pose annotations. We present extensive experiment results and show that our proposed method can learn NVS with VDEs, outperforming the SOTA single-view NVS methods on the RealEstate10k and MannequinChallenge datasets.

Submitted 13 December, 2023; originally announced December 2023.

Comments: Visit our website https://kaist-viclab.github.io/monovde-site

arXiv:2312.07087 [pdf, other]

Toward Robustness in Multi-label Classification: A Data Augmentation Strategy against Imbalance and Noise

Authors: Hwanjun Song, Minseok Kim, Jae-Gil Lee

Abstract: Multi-label classification poses challenges due to imbalanced and noisy labels in training data. We propose a unified data augmentation method, named BalanceMix, to address these challenges. Our approach includes two samplers for imbalanced labels, generating minority-augmented instances with high diversity. It also refines multi-labels at the label-wise granularity, categorizing noisy labels as clean, re-labeled, or ambiguous for robust optimization. Extensive experiments on three benchmark datasets demonstrate that BalanceMix outperforms existing state-of-the-art methods. We release the code at https://github.com/DISL-Lab/BalanceMix.

Submitted 12 December, 2023; originally announced December 2023.

Comments: This paper was accepted at AAAI 2024. We upload the full version of our paper on arXiv due to the page limit of AAAI

arXiv:2312.06279 [pdf, other]

Regional Correlation Aided Mobile Traffic Prediction with Spatiotemporal Deep Learning

Authors: JeongJun Park, Lusungu J. Mwasinga, Huigyu Yang, Syed M. Raza, Duc-Tai Le, Moonseong Kim, Min Young Chung, Hyunseung Choo

Abstract: Mobile traffic data in urban regions shows differentiated patterns during different hours of the day. The exploitation of these patterns enables highly accurate mobile traffic prediction for proactive network management. However, recent Deep Learning (DL) driven studies have only exploited spatiotemporal features and have ignored the geographical correlations, causing high complexity and erroneous mobile traffic predictions. This paper addresses these limitations by proposing an enhanced mobile traffic prediction scheme that combines the clustering strategy of daily mobile traffic peak time and novel multi Temporal Convolutional Network with a Long Short Term Memory (multi TCN-LSTM) model. The mobile network cells that exhibit peak traffic during the same hour of the day are clustered together. Our experiments on large-scale real-world mobile traffic data show up to 28% performance improvement compared to state-of-the-art studies, which confirms the efficacy and viability of the proposed approach.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 4 pages, 5 figures, 1 table. This paper is already accepted on IEEE Consumer Communications & Networking Conference(CCNC) 2024

arXiv:2312.04745 [pdf, other]

A Brief Tutorial on Sample Size Calculations for Fairness Audits

Authors: Harvineet Singh, Fan Xia, Mi-Ok Kim, Romain Pirracchio, Rumi Chunara, Jean Feng

Abstract: In fairness audits, a standard objective is to detect whether a given algorithm performs substantially differently between subgroups. Properly powering the statistical analysis of such audits is crucial for obtaining informative fairness assessments, as it ensures a high probability of detecting unfairness when it exists. However, limited guidance is available on the amount of data necessary for a fairness audit, lacking directly applicable results concerning commonly used fairness metrics. Additionally, the consideration of unequal subgroup sample sizes is also missing. In this tutorial, we address these issues by providing guidance on how to determine the required subgroup sample sizes to maximize the statistical power of hypothesis tests for detecting unfairness. Our findings are applicable to audits of binary classification models and multiple fairness metrics derived as summaries of the confusion matrix. Furthermore, we discuss other aspects of audit study designs that can increase the reliability of audit results.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 4 pages, 1 figure, 1 table, Workshop on Regulatable Machine Learning at the 37th Conference on Neural Information Processing Systems

arXiv:2312.03852 [pdf, other]

The JWST Early Release Science Program for Direct Observations of Exoplanetary Systems V: Do Self-Consistent Atmospheric Models Represent JWST Spectra? A Showcase With VHS 1256 b

Authors: Simon Petrus, Niall Whiteford, Polychronis Patapis, Beth A. Biller, Andrew Skemer, Sasha Hinkley, Genaro Suárez, Anna Lueber, Paulina Palma-Bifani, Jordan M. Stone, Johanna M. Vos, Caroline V. Morley, Pascal Tremblin, Benjamin Charnay, Christiane Helling, Brittany E. Miles, Aarynn L. Carter, Jason J. Wang, Markus Janson, Eileen C. Gonzales, Ben Sutlieff, Kielan K. W. Hoch, Mickaël Bonnefoy, Gaël Chauvin, Olivier Absil, et al. (97 additional authors not shown)

Abstract: The unprecedented medium-resolution (R~1500-3500) near- and mid-infrared (1-18um) spectrum provided by JWST for the young (140+/-20Myr) low-mass (12-20MJup) L-T transition (L7) companion VHS1256b gives access to a catalogue of molecular absorptions. In this study, we present a comprehensive analysis of this dataset utilizing a forward modelling approach, applying our Bayesian framework, ForMoSA. We explore five distinct atmospheric models to assess their performance in estimating key atmospheric parameters: Teff, log(g), [M/H], C/O, gamma, fsed, and R. Our findings reveal that each parameter's estimate is significantly influenced by factors such as the wavelength range considered and the model chosen for the fit. This is attributed to systematic errors in the models and their challenges in accurately replicating the complex atmospheric structure of VHS1256b, notably the complexity of its clouds and dust distribution. To propagate the impact of these systematic uncertainties on our atmospheric property estimates, we introduce innovative fitting methodologies based on independent fits performed on different spectral windows. We finally derived a Teff consistent with the spectral type of the target, considering its young age, which is confirmed by our estimate of log(g). Despite the exceptional data quality, attaining robust estimates for chemical abundances [M/H] and C/O, often employed as indicators of formation history, remains challenging. Nevertheless, the pioneering case of JWST's data for VHS1256b has paved the way for future acquisitions of substellar spectra that will be systematically analyzed to directly compare the properties of these objects and correct the systematics in the models.

Submitted 31 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: 32 pages, 16 figures, 6 tables, 2 appendices

arXiv:2312.02512 [pdf, other]

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Authors: Jeongsoo Choi, Se Jin Park, Minsu Kim, Yong Man Ro

Abstract: This paper proposes a novel direct Audio-Visual Speech to Audio-Visual Speech Translation (AV2AV) framework, where the input and output of the system are multimodal (i.e., audio and visual speech). With the proposed AV2AV, two key advantages can be brought: 1) We can perform real-like conversations with individuals worldwide in a virtual meeting by utilizing our own primary languages. In contrast to Speech-to-Speech Translation (A2A), which solely translates between audio modalities, the proposed AV2AV directly translates between audio-visual speech. This capability enhances the dialogue experience by presenting synchronized lip movements along with the translated speech. 2) We can improve the robustness of the spoken language translation system. By employing the complementary information of audio-visual speech, the system can effectively translate spoken language even in the presence of acoustic noise, showcasing robust performance. To mitigate the problem of the absence of a parallel AV2AV translation dataset, we propose to train our spoken language translation system with the audio-only dataset of A2A. This is done by learning unified audio-visual speech representations through self-supervised learning in advance to train the translation system. Moreover, we propose an AV-Renderer that can generate raw audio and video in parallel. It is designed with zero-shot speaker modeling, thus the speaker in source audio-visual speech can be maintained at the target translated audio-visual speech. The effectiveness of AV2AV is evaluated with extensive experiments in a many-to-many language translation setting. Demo page is available on https://choijeongsoo.github.io/av2av.

Submitted 26 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: CVPR 2024. Code & Demo: https://choijeongsoo.github.io/av2av

arXiv:2312.01026 [pdf, other]

Token Fusion: Bridging the Gap between Token Pruning and Token Merging

Authors: Minchul Kim, Shangqian Gao, Yen-Chang Hsu, Yilin Shen, Hongxia Jin

Abstract: Vision Transformers (ViTs) have emerged as powerful backbones in computer vision, outperforming many traditional CNNs. However, their computational overhead, largely attributed to the self-attention mechanism, makes deployment on resource-constrained edge devices challenging. Multiple solutions rely on token pruning or token merging. In this paper, we introduce "Token Fusion" (ToFu), a method that amalgamates the benefits of both token pruning and token merging. Token pruning proves advantageous when the model exhibits sensitivity to input interpolations, while token merging is effective when the model manifests close to linear responses to inputs. We combine this to propose a new scheme called Token Fusion. Moreover, we tackle the limitations of average merging, which doesn't preserve the intrinsic feature norm, resulting in distributional shifts. To mitigate this, we introduce MLERP merging, a variant of the SLERP technique, tailored to merge multiple tokens while maintaining the norm distribution. ToFu is versatile, applicable to ViTs with or without additional training. Our empirical evaluations indicate that ToFu establishes new benchmarks in both classification and image generation tasks concerning computational efficiency and model accuracy.

Submitted 1 December, 2023; originally announced December 2023.

Comments: To appear in WACV 2024

arXiv:2312.00894 [pdf, other]

Leveraging Large Language Models to Improve REST API Testing

Authors: Myeongsoo Kim, Tyler Stennett, Dhruv Shah, Saurabh Sinha, Alessandro Orso

Abstract: The widespread adoption of REST APIs, coupled with their growing complexity and size, has led to the need for automated REST API testing tools. Current tools focus on the structured data in REST API specifications but often neglect valuable insights available in unstructured natural-language descriptions in the specifications, which leads to suboptimal test coverage. Recently, to address this gap, researchers have developed techniques that extract rules from these human-readable descriptions and query knowledge bases to derive meaningful input values. However, these techniques are limited in the types of rules they can extract and prone to produce inaccurate results. This paper presents RESTGPT, an innovative approach that leverages the power and intrinsic context-awareness of Large Language Models (LLMs) to improve REST API testing. RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification. It then augments the original specification with these rules and values. Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation. Given these promising results, we outline future research directions for advancing REST API testing through LLMs.

Submitted 29 January, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: To be published in the 46th IEEE/ACM International Conference on Software Engineering - New Ideas and Emerging Results Track (ICSE-NIER 2024)

arXiv:2311.18515 [pdf, ps, other]

The Euler-Glaisher Theorem over Totally Real Number Fields

Authors: Se Wook Jang, Byeong Moon Kim, Kwang Hoon Kim

Abstract: In this paper, we study the partition theory over totally real number fields. Let $K$ be a totally real number field. A partition of a totally positive algebraic integer $δ$ over $K$ is $λ=(λ_1,λ_2,\ldots,λ_r)$ for some totally positive integers $λ_i$ such that $δ=λ_1+λ_2+\cdots+λ_r$. We find an identity to explain the number of partitions of $δ$ whose parts do not belong to a given ideal $\mathfrak a$. We obtain a generalization of the Euler-Glaisher Theorem over totally real number fields as a corollary. We also prove that the number of solutions to the equation $δ=x_1+2x_2+\cdots+nx_n$ with $x_i$ totally positive or $0$ is equal to that of chain partitions of $δ$. A chain partition of $δ$ is a partition $λ=(λ_1,λ_2,\ldots,λ_r)$ of $δ$ such that $λ_{i+1}-λ_i$ is totally positive or $0$.

Submitted 30 November, 2023; originally announced November 2023.

MSC Class: 11P84; 11R80
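
For orientation, the classical Euler-Glaisher theorem over the ordinary integers (the statement this entry generalizes) says that, for d ≥ 2, the number of partitions of n into parts not divisible by d equals the number of partitions of n in which no part appears d or more times. The sketch below only checks this classical case numerically; it does not implement the number-field version, and the function names are illustrative choices, not taken from the paper.

```python
def count_parts_not_divisible(n: int, d: int) -> int:
    """Count partitions of n whose parts are all not divisible by d."""
    ways = [1] + [0] * n  # ways[t] = partitions of t using the parts seen so far
    for part in range(1, n + 1):
        if part % d == 0:
            continue  # parts divisible by d are forbidden
        for total in range(part, n + 1):
            ways[total] += ways[total - part]  # unlimited copies of `part`
    return ways[n]


def count_multiplicity_below(n: int, d: int) -> int:
    """Count partitions of n in which every part appears fewer than d times."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        updated = [0] * (n + 1)
        for total in range(n + 1):
            # Use `part` exactly k times, with k = 0, 1, ..., d - 1.
            for k in range(d):
                if k * part > total:
                    break
                updated[total] += ways[total - k * part]
        ways = updated
    return ways[n]


if __name__ == "__main__":
    # d = 2 is Euler's theorem: odd parts versus distinct parts.
    for n in range(0, 11):
        print(n, count_parts_not_divisible(n, 2), count_multiplicity_below(n, 2))
```

For n = 5 and d = 2 both counts are 3: the partitions into odd parts are 5, 3+1+1 and 1+1+1+1+1, and the partitions into distinct parts are 5, 4+1 and 3+2.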

arXiv:2311.18514 [pdf, ps, other]

The Sylvester Theorem and the Rogers-Ramanujan Identities over Totally Real Number Fields

Authors: Se Wook Jang, Byeong Moon Kim, Kwang Hoon Kim

Abstract: In this paper, we prove two identities on the partition of a totally positive algebraic integer over a totally real number field which are the generalization of the Sylvester Theorem and that of the Rogers-Ramanujan Identities. Additionally, we give another version of generalized Rogers-Ramanujan Identities.

Submitted 30 November, 2023; originally announced November 2023.

MSC Class: 11P84; 11R80

arXiv:2311.16128 [pdf, other]

Physics-Inspired Discrete-Phase Optimization for 3D Beamforming with PIN-Diode Extra-Large Antenna Arrays

Authors: Minsung Kim, Annalise Stockley, Keith Briggs, Kyle Jamieson

Abstract: Large antenna arrays can steer narrow beams towards a target area, and thus improve the communications capacity of wireless channels and the fidelity of radio sensing. Hardware that is capable of continuously-variable phase shifts is expensive, presenting scaling challenges. PIN diodes that apply only discrete phase shifts are promising and cost-effective; however, unlike continuous phase shifters, finding the best phase configuration across elements is an NP-hard optimization problem. Thus, the complexity of optimization becomes a new bottleneck for large-antenna arrays. To address this challenge, this paper suggests a procedure for converting the optimization objective function from a ratio of quadratic functions to a sequence of more easily solvable quadratic unconstrained binary optimization (QUBO) sub-problems. This conversion is an exact equivalence, and the resulting QUBO forms are standard input formats for various physics-inspired optimization methods. We demonstrate that a simulated annealing approach is very effective for solving these sub-problems, and we give performance metrics for several large array types optimized by this technique. Through numerical experiments, we report 3D beamforming performance for extra-large arrays with up to 10,000 elements.

Submitted 3 January, 2024; v1 submitted 30 October, 2023; originally announced November 2023.

arXiv:2311.15683 [pdf]

doi 10.1038/s41528-024-00315-1

Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency

Authors: Chenyu Tang, Muzi Xu, Wentian Yi, Zibo Zhang, Edoardo Occhipinti, Chaoqun Dong, Dafydd Ravenscroft, Sung-Min Jung, Sanghyo Lee, Shuo Gao, Jong Min Kim, Luigi G. Occhipinti

Abstract: Our research presents a wearable Silent Speech Interface (SSI) technology that excels in device comfort, time-energy efficiency, and speech decoding accuracy for real-world use. We developed a biocompatible, durable textile choker with an embedded graphene-based strain sensor, capable of accurately detecting subtle throat movements. This sensor, surpassing other strain sensors in sensitivity by 420%, simplifies signal processing compared to traditional voice recognition methods. Our system uses a computationally efficient neural network, specifically a one-dimensional convolutional neural network with residual structures, to decode speech signals. This network is energy and time-efficient, reducing computational load by 90% while achieving 95.25% accuracy for a 20-word lexicon and swiftly adapting to new users and words with minimal samples. This innovation demonstrates a practical, sensitive, and precise wearable SSI suitable for daily communication applications.

Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: 5 figures in the article; 11 figures and 4 tables in supplementary information

Journal ref: npj Flexible Electronics (2024)

arXiv:2311.15632 [pdf]

Application of Long-short Term Memory (LSTM) Model for Forecasting NOx Emission in Pohang Area

Authors: Sangdeok Lee, MinChung Kim

Abstract: Emissions of nitric oxide and nitrogen dioxide, which are named as NOx, are a major environmental and health concern. To react to the climate crisis, the South Korean government has strengthened NOx emission regulations. An accurate NOx prediction model can help companies to meet their NOx emission quotas and achieve cost savings. This study focuses on developing a model which forecasts the amount of NOx emissions in Pohang, a heavy industrial city in South Korea with serious air pollution problems. In this study, the Long-short term memory (LSTM) modeling is applied to predict the amount of NOx emissions, with missing data imputation using stochastic regression. Two parameters (i.e., time windows and learning rates) necessary to run the LSTM model are tested and selected using the Adam optimizer, one of the popular optimization methods in LSTM. I found that the model that I applied achieved the acceptable prediction performance since its Mean Absolute Scaled Error (MASE), the most important evaluation criterion, is less than 1. This means that applying the model that I developed in predicting future NOx emissions will perform better than a naive prediction, a model that simply predicts them based on the last observed data point.

Submitted 27 November, 2023; originally announced November 2023.
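
The MASE criterion mentioned in this entry can be illustrated with a minimal sketch. It assumes the common non-seasonal definition: the forecast's mean absolute error divided by the in-sample mean absolute error of the one-step naive forecast, which predicts each value from the previous observation. The function name and the numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mase(train: Sequence[float], actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Scaled Error against a one-step naive forecast on `train`."""
    if len(train) < 2:
        raise ValueError("need at least two training points for the naive baseline")
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    # In-sample error of the naive forecast: predict train[i] by train[i - 1].
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    if naive_mae == 0:
        raise ValueError("naive baseline error is zero; MASE is undefined")
    # Out-of-sample error of the model being evaluated.
    model_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return model_mae / naive_mae


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    history = [10.0, 12.0, 11.0, 13.0]
    observed = [14.0, 15.0]
    predicted = [13.5, 15.5]
    print(mase(history, observed, predicted))  # below 1 means better than naive
```

With this hypothetical data the naive baseline error is 5/3 and the model error is 0.5, so the ratio is about 0.3, which is below 1.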

arXiv:2311.15518 [pdf, other]

The Seoul National University AGN Monitoring Project III: H$β$ lag measurements of 32 luminous AGNs and the high-luminosity end of the size--luminosity relation

Authors: Jong-Hak Woo, Shu Wang, Suvendu Rakshit, Hojin Cho, Donghoon Son, Vardha N. Bennert, Elena Gallo, Edmund Hodges-Kluck, Tommaso Treu, Aaron J. Barth, Wanjin Cho, Adi Foord, Jaehyuk Geum, Hengxiao Guo, Yashashree Jadhav, Yiseul Jeon, Kyle M. Kabasares, Won-Suk Kang, Changseok Kim, Minjin Kim, Tae-Woo Kim, Huynh Anh N. Le, Matthew A. Malkan, Amit Kumar Mandal, Daeseong Park, et al. (6 additional authors not shown)

Abstract: We present the main results from a long-term reverberation mapping campaign carried out for the Seoul National University Active Galactic Nuclei (AGN) Monitoring Project. High-quality data were obtained during 2015-2021 for 32 luminous AGNs (i.e., continuum luminosity in the range of $10^{44-46}$ erg s$^{-1}$) at a regular cadence, of 20-30 days for spectroscopy and 3-5 days for photometry. We obtain time lag measurements between the variability in the H$β$ emission and the continuum for 32 AGNs; twenty-five of those have the best lag measurements based on our quality assessment, examining correlation strength, and the posterior lag distribution. Our study significantly increases the current sample of reverberation-mapped AGNs, particularly at the moderate to high luminosity end. Combining our results with literature measurements, we derive a H$β$ broad line region size--luminosity relation with a shallower slope than reported in the literature. For a given luminosity, most of our measured lags are shorter than the expectation, implying that single-epoch black hole mass estimators based on previous calibrations could suffer large systematic uncertainties.

Submitted 26 November, 2023; originally announced November 2023.

Comments: Accepted by ApJ; 39 pages, 22 figures

arXiv:2311.14496 [pdf, other]

RTPS Attack Dataset Description

Authors: Dong Young Kim, Dongsung Kim, Yuchan Song, Gang Min Kim, Min Geun Song, Jeong Do Yoo, Huy Kang Kim

Abstract: This paper explains all about our RTPS datasets. We collect malicious/benign packet data by injecting attack data in an Unmanned Ground Vehicle (UGV) in the normal state. We assembled the testbed, consisting of UGV, Controller, PC, and Router. We collect this dataset in the UGV part of our testbed. We conducted two types of attack, "Command Injection" and "Command Injection with ARP Spoofing", on our testbed. The data collection time is 180, 300, 600, and 1200. The scenario has 30 each on collection time, 240 total. We expect this dataset to contribute to the development of defense technologies like anomaly detection to address security threat issues in ROS2 networks and Fast-DDS implementations.

Submitted 2 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

Comments: This manuscript is written in Korean. You can download our dataset through our lab: https://ocslab.hksecurity.net/Datasets/rtps-attack-dataset We welcome your comments or feedback. Contact INFO: Dong Young Kim ([email protected]), Huy Kang Kim ([email protected])

Showing 201–250 of 2,954 results for author: Kim, M