Search | arXiv e-print repository

DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model

Authors: Mona Sheikh Zeinoddin, Chiara Lena, Jiongqi Qu, Luca Carlini, Mattia Magro, Seunghoi Kim, Elena De Momi, Sophia Bano, Matthew Grech-Sollars, Evangelos Mazomenos, Daniel C. Alexander, Danail Stoyanov, Matthew J. Clarkson, Mobarakol Islam

Abstract: Robotic-assisted surgery (RAS) relies on accurate depth estimation for 3D reconstruction and visualization. While foundation models like Depth Anything Models (DAM) show promise, directly applying them to surgery often yields suboptimal results. Fully fine-tuning on limited surgical data can cause overfitting and catastrophic forgetting, compromising model robustness and generalization. Although Low-Rank Adaptation (LoRA) addresses some adaptation issues, its uniform parameter distribution neglects the inherent feature hierarchy, where earlier layers, learning more general features, require more parameters than later ones. To tackle this issue, we introduce Depth Anything in Robotic Endoscopic Surgery (DARES), a novel approach that employs a new adaptation technique, Vector Low-Rank Adaptation (Vector-LoRA) on the DAM V2 to perform self-supervised monocular depth estimation in RAS scenes. To enhance learning efficiency, we introduce Vector-LoRA by integrating more parameters in earlier layers and gradually decreasing parameters in later layers. We also design a reprojection loss based on the multi-scale SSIM error to enhance depth perception by better tailoring the foundation model to the specific requirements of the surgical environment. The proposed method is validated on the SCARED dataset and demonstrates superior performance over recent state-of-the-art self-supervised monocular depth estimation techniques, achieving an improvement of 13.3% in the absolute relative error metric. The code and pre-trained weights are available at https://github.com/mobarakol/DARES.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 11 pages
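
The layer-wise allocation described in the DARES abstract (more adapter parameters in earlier layers, fewer in later ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear schedule, the function names `vector_lora_ranks` and `lora_param_count`, and the example values (12 layers, ranks 16 down to 4, width 768) are all assumptions chosen for illustration.

```python
# Hypothetical sketch of a decreasing LoRA rank schedule across layers.
# The linear decay and all numeric values are assumptions, not taken from
# the DARES paper or its repository.

def vector_lora_ranks(num_layers: int, max_rank: int, min_rank: int) -> list[int]:
    """Return one LoRA rank per layer, decreasing linearly from max_rank to min_rank."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if min_rank > max_rank:
        raise ValueError("min_rank must not exceed max_rank")
    if num_layers == 1:
        return [max_rank]
    step = (max_rank - min_rank) / (num_layers - 1)
    return [round(max_rank - i * step) for i in range(num_layers)]


def lora_param_count(rank: int, d_in: int, d_out: int) -> int:
    """Trainable parameters of one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    ranks = vector_lora_ranks(num_layers=12, max_rank=16, min_rank=4)
    for layer, rank in enumerate(ranks):
        print(layer, rank, lora_param_count(rank, d_in=768, d_out=768))
```

With a uniform rank, every layer would receive the same number of adapter parameters; under this assumed schedule, the first layer receives four times as many as the last.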

arXiv:2408.17416 [pdf, other]

Superconductivity in pressurized Re$_{0.10}$Mo$_{0.90}$B$_2$

Authors: S. Sinha, J. Lim, Z. Li, J. S. Kim, A. C. Hire, P. M. Dee, R. S. Kumar, D. Popov, R. J. Hemley, R. G. Hennig, P. J. Hirschfeld, G. R. Stewart, J. J. Hamlin

Abstract: The recent surprising discovery of superconductivity with critical temperature $T_c$ = 32 K in MoB$_2$ above 70 GPa has led to the search for related materials that may superconduct at similarly high $T_c$ values and lower pressures. We have studied the superconducting and structural properties of Re$_{0.10}$Mo$_{0.90}$B$_2$ to 170 GPa. A structural phase transition from R3m to P6/mmm commences at 48 GPa, with the first signatures of superconductivity appearing above 44 GPa. The critical temperature is observed to increase with pressure. A complete resistive transition is observed only above 150 GPa, where the highest onset $T_c$ of 30 K is also achieved. Upon releasing pressure, the high pressure superconducting phase is found to be metastable. During unloading, a complete resistive superconducting transition is observed all the way down to 20 GPa (with onset $T_c \sim 20$ K). Our results suggest that the P6/mmm structure is responsible for the observed superconductivity.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 7 pages, 7 figures, supplemental material. All data and analysis code associated with this work is available at https://doi.org/10.5281/zenodo.13359794

arXiv:2408.17066 [pdf, other]

Non-verbal Interaction and Interface with a Quadruped Robot using Body and Hand Gestures: Design and User Experience Evaluation

Authors: Soohyun Shin, Trevor Evetts, Hunter Saylor, Hyunji Kim, Soojin Woo, Wonhwha Rhee, Seong-Woo Kim

Abstract: In recent years, quadruped robots have attracted significant attention due to their practical advantages in maneuverability, particularly when navigating rough terrain and climbing stairs. As these robots become more integrated into various industries, including construction and healthcare, researchers have increasingly focused on developing intuitive interaction methods such as speech and gestures that do not require separate devices such as keyboards or joysticks. This paper aims to investigate a comfortable and efficient interaction method with quadruped robots that possess a familiar form factor. To this end, we conducted two preliminary studies to observe how individuals naturally interact with a quadruped robot in natural and controlled settings, followed by a prototype experiment to examine human preferences for body-based and hand-based gesture controls using a Unitree Go1 Pro quadruped robot. We assessed the user experience of 13 participants using the User Experience Questionnaire and measured the time taken to complete specific tasks. Our preliminary results indicate that humans have a natural preference for communicating with robots through hand and body gestures rather than speech. In addition, participants reported higher satisfaction and completed tasks more quickly when using body gestures to interact with the robot. This contrasts with the fact that most gesture-based control technologies for quadruped robots are hand-based. The video is available at https://youtu.be/rysv1p1zvp4.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 16 pages

arXiv:2408.17006 [pdf, other]

Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering

Authors: Su Hyeon Lim, Minkuk Kim, Hyeon Bae Kim, Seong Tae Kim

Abstract: The Visual Question Answering with Natural Language Explanation (VQA-NLE) task is challenging due to its high demand for reasoning-based inference. Recent VQA-NLE studies focus on enhancing model networks to amplify the model's reasoning capability, but this approach is resource-consuming and unstable. In this work, we introduce a new VQA-NLE model, ReRe (Retrieval-augmented natural language Reasoning), which leverages retrieval information from the memory to aid in generating accurate answers and persuasive explanations without relying on complex networks and extra datasets. ReRe is an encoder-decoder architecture model using a pre-trained CLIP vision encoder and a pre-trained GPT-2 language model as a decoder. Cross-attention layers are added in the GPT-2 for processing retrieval features. ReRe outperforms previous methods in VQA accuracy and explanation score and shows improvement in NLE with more persuasive and reliable explanations.

Submitted 30 August, 2024; originally announced August 2024.

Comments: ICIP Workshop 2024

arXiv:2408.16213 [pdf, other]

M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation

Authors: Jonggwon Park, Soobum Kim, Byungmu Yoon, Jihun Hyun, Kyoyun Choi

Abstract: The rapid evolution of artificial intelligence, especially in large language models (LLMs), has significantly impacted various domains, including healthcare. In chest X-ray (CXR) analysis, previous studies have employed LLMs, but with limitations: either underutilizing the multi-tasking capabilities of LLMs or lacking clinical accuracy. This paper presents M4CXR, a multi-modal LLM designed to enhance CXR interpretation. The model is trained on a visual instruction-following dataset that integrates various task-specific datasets in a conversational format. As a result, the model supports multiple tasks such as medical report generation (MRG), visual grounding, and visual question answering (VQA). M4CXR achieves state-of-the-art clinical accuracy in MRG by employing a chain-of-thought prompting strategy, in which it identifies findings in CXR images and subsequently generates corresponding reports. The model is adaptable to various MRG scenarios depending on the available inputs, such as single-image, multi-image, and multi-study contexts. In addition to MRG, M4CXR performs visual grounding at a level comparable to specialized models and also demonstrates outstanding performance in VQA. Both quantitative and qualitative assessments reveal M4CXR's versatility in MRG, visual grounding, and VQA, while consistently maintaining clinical accuracy.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.15620 [pdf, other]

CAPER: Enhancing Career Trajectory Prediction using Temporal Knowledge Graph and Ternary Relationship

Authors: Yeon-Chang Lee, JaeHyun Lee, Michiharu Yamashita, Dongwon Lee, Sang-Wook Kim

Abstract: The problem of career trajectory prediction (CTP) aims to predict one's future employer or job position. While several CTP methods have been developed for this problem, we posit that none of these methods (1) jointly considers the mutual ternary dependency between three key units (i.e., user, position, and company) of a career and (2) captures the characteristic shifts of key units in a career over time, leading to an inaccurate understanding of the job movement patterns in the labor market. To address the above challenges, we propose a novel solution, named CAPER, that solves the challenges via sophisticated temporal knowledge graph (TKG) modeling. It enables the utilization of a graph-structured knowledge base with rich expressiveness, effectively preserving the changes in job movement patterns. Furthermore, we devise an extrapolated career reasoning task on TKG for a realistic evaluation. The experiments on a real-world career trajectory dataset demonstrate that CAPER consistently and significantly outperforms four baselines, two recent TKG reasoning methods, and five state-of-the-art CTP methods in predicting one's future companies and positions, i.e., on average, yielding 6.80% and 34.58% more accurate predictions, respectively.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.15214 [pdf, other]

EDGE: Predictable Scatter in the Stellar Mass--Halo Mass Relation of Dwarf Galaxies

Authors: Stacy Y. Kim, Justin I. Read, Martin P. Rey, Matthew D. A. Orkney, Sushanta Nigudkar, Andrew Pontzen, Ethan Taylor, Oscar Agertz, Payel Das

Abstract: The stellar-mass--halo-mass (SMHM) relation is central to our understanding of galaxy formation and the nature of dark matter. However, its normalisation, slope, and scatter are highly uncertain at dwarf galaxy scales. In this paper, we present DarkLight, a new semi-empirical dwarf galaxy formation model designed to robustly predict the SMHM relation for the smallest galaxies. DarkLight harnesses a correlation between the mean star formation rate of dwarfs and their peak rotation speed -- the $\langle$SFR$\rangle$-$v_{\rm max}$ relation -- that we derive from simulations and observations. Given the sparsity of data for isolated dwarfs with $v_{\rm max} \lesssim 20$ km/s, we fit the $\langle$SFR$\rangle$-$v_{\rm max}$ relation to observational data for dwarfs above this velocity scale and to the high-resolution EDGE cosmological simulations below. Reionisation quenching is implemented via distinct $\langle$SFR$\rangle$-$v_{\rm max}$ relations before and after reionisation. We find that the SMHM scatter is small at reionisation, $\sim$0.2 dex, but rises to $\sim$0.5 dex ($1σ$) at a halo mass of $\sim$10$^9$ M$_\odot$ as star formation is quenched by reionisation but dark matter halo masses continue to grow. While we do not find a significant break in the slope of the SMHM relation, one can be introduced if reionisation occurs early ($z_{\rm quench} \gtrsim 5$). Finally, we find that dwarfs can be star forming today down to a halo mass of $\sim$2 $\times 10^9$ M$_\odot$. We predict that the lowest mass star forming dwarf irregulars in the nearby universe are the tip of the iceberg of a much larger population of quiescent isolated dwarfs.

Submitted 27 August, 2024; originally announced August 2024.

Comments: 15 pages, 13 figures. Key results are summarized in Figures 3-6. To be submitted to MNRAS. Comments welcome!

arXiv:2408.14855 [pdf, other]

Enhancing Analogical Reasoning in the Abstraction and Reasoning Corpus via Model-Based RL

Authors: Jihwan Lee, Woochang Sim, Sejin Kim, Sundong Kim

Abstract: This paper demonstrates that model-based reinforcement learning (model-based RL) is a suitable approach for the task of analogical reasoning. We hypothesize that model-based RL can solve analogical reasoning tasks more efficiently through the creation of internal models. To test this, we compared DreamerV3, a model-based RL method, with Proximal Policy Optimization, a model-free RL method, on the Abstraction and Reasoning Corpus (ARC) tasks. Our results indicate that model-based RL not only outperforms model-free RL in learning and generalizing from single tasks but also shows significant advantages in reasoning across similar tasks.

Submitted 27 August, 2024; originally announced August 2024.

Comments: Accepted to IJCAI 2024 IARML Workshop

arXiv:2408.14739 [pdf, other]

VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech

Authors: Heeseung Kim, Sang-gil Lee, Jiheum Yeom, Che Hyun Lee, Sungwon Kim, Sungroh Yoon

Abstract: We propose VoiceTailor, a parameter-efficient speaker-adaptive text-to-speech (TTS) system, by equipping a pre-trained diffusion-based TTS model with a personalized adapter. VoiceTailor identifies pivotal modules that benefit from the adapter based on a weight change ratio analysis. We utilize Low-Rank Adaptation (LoRA) as a parameter-efficient adaptation method and incorporate the adapter into pivotal modules of the pre-trained diffusion decoder. To achieve powerful adaptation performance with few parameters, we explore various guidance techniques for speaker adaptation and investigate the best strategies to strengthen speaker information. VoiceTailor demonstrates comparable speaker adaptation performance to existing adaptive TTS models by fine-tuning only 0.25% of the total parameters. VoiceTailor shows strong robustness when adapting to a wide range of real-world speakers, as shown in the demo.

Submitted 27 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: INTERSPEECH 2024

arXiv:2408.14688 [pdf, other]

Lowering threshold of NaI(Tl) scintillator to 0.7 keV in the COSINE-100 experiment

Authors: G. H. Yu, N. Carlin, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. França, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee, et al. (34 additional authors not shown)

Abstract: COSINE-100 is a direct dark matter search experiment, with the primary goal of testing the annual modulation signal observed by DAMA/LIBRA, using the same target material, NaI(Tl). In previous analyses, we achieved the same 1 keV energy threshold used in the DAMA/LIBRA analysis that reported an annual modulation signal with 11.6$σ$ significance. In this article, we report an improved analysis that lowered the threshold to 0.7 keV, thanks to the application of a Multi-Layer Perceptron network and a new likelihood parameter with waveforms in the frequency domain. The lower threshold would enable a better comparison of COSINE-100 with new DAMA results with a 0.75 keV threshold and account for differences in quenching factors. Furthermore, the lower threshold can enhance COSINE-100's sensitivity to sub-GeV dark matter searches.

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.14488 [pdf]

Multi-Task Multi-Fidelity Learning of Properties for Energetic Materials

Authors: Robert J. Appleton, Daniel Klinger, Brian H. Lee, Michael Taylor, Sohee Kim, Samuel Blankenship, Brian C. Barnes, Steven F. Son, Alejandro Strachan

Abstract: Data science and artificial intelligence are playing an increasingly important role in the physical sciences. Unfortunately, in the field of energetic materials, data scarcity limits the accuracy and even applicability of ML tools. To address data limitations, we compiled multi-modal data: both experimental and computational results for several properties. We find that multi-task neural networks can learn from multi-modal data and outperform single-task models trained for specific properties. As expected, the improvement is more significant for data-scarce properties. These models are trained using descriptors built from simple molecular information and can be readily applied for large-scale materials screening to explore multiple properties simultaneously. This approach is widely applicable to fields outside energetic materials.

Submitted 21 August, 2024; originally announced August 2024.

Comments: 16 pages, 4 figures, 2 tables

arXiv:2408.14005 [pdf, other]

The Calibration of Polycyclic Aromatic Hydrocarbon Dust Emission as a Star Formation Rate Indicator in the AKARI NEP Survey

Authors: Helen Kyung Kim, Matthew A. Malkan, Toshinobu Takagi, Nagisa Oi, Denis Burgarella, Takamitsu Miyaji, Hyunjin Shim, Hideo Matsuhara, Tomotsugu Goto, Yoichi Ohyama, Veronique Buat, Seong Jin Kim

Abstract: Polycyclic aromatic hydrocarbon (PAH) dust emission has been proposed as an effective extinction-independent star formation rate (SFR) indicator in the mid-infrared (MIR), but this may depend on conditions in the interstellar medium. The coverage of the AKARI/Infrared Camera (IRC) allows us to study the effects of metallicity, starburst intensity, and active galactic nuclei on PAH emission in galaxies with $f_ν(L18W)\lesssim 19$ AB mag. Observations include follow-up, rest-frame optical spectra of 443 galaxies within the AKARI North Ecliptic Pole survey that have IRC detections from 7-24 $μ$m. We use optical emission line diagnostics to infer SFR based on H$α$ and [O II]$λλ3726,3729$ emission line luminosities. The PAH 6.2 $μ$m and PAH 7.7 $μ$m luminosities ($L(PAH\ 6.2\ μm)$ and $L(PAH\ 7.7\ μm)$, respectively) derived using multi-wavelength model fits are consistent with those derived from slitless spectroscopy within 0.2 dex. $L(PAH\ 6.2\ μm)$ and $L(PAH\ 7.7\ μm)$ correlate linearly with the 24 $μ$m-dust corrected H$α$ luminosity only for normal, star-forming "main-sequence" galaxies. Assuming multi-linear correlations, we quantify the additional dependencies on metallicity and starburst intensity, which we use to correct our PAH SFR calibrations at $0<z<1.2$ for the first time. We derive the cosmic star formation rate density (SFRD) per comoving volume from $0.15 \lesssim z \lesssim 1$. The PAH SFRD is consistent with that of the far-infrared and reaches an order of magnitude higher than that of uncorrected UV observations at $z\sim1$. Starburst galaxies contribute $\gtrsim 0.7$ of the total SFRD at $z\sim1$ compared to main-sequence galaxies.

Submitted 26 August, 2024; originally announced August 2024.

Comments: Accepted for publication in The Astrophysical Journal. 50 pages, 27 figures, 9 tables

arXiv:2408.13687 [pdf, other]

Quantum error correction below the surface code threshold

Authors: Rajeev Acharya, Laleh Aghababaie-Beni, Igor Aleiner, Trond I. Andersen, Markus Ansmann, Frank Arute, Kunal Arya, Abraham Asfaw, Nikita Astrakhantsev, Juan Atalaya, Ryan Babbush, Dave Bacon, Brian Ballard, Joseph C. Bardin, Johannes Bausch, Andreas Bengtsson, Alexander Bilmes, Sam Blackwell, Sergio Boixo, Gina Bortoli, Alexandre Bourassa, Jenna Bovaird, Leon Brill, Michael Broughton, David A. Browne, et al. (224 additional authors not shown)

Abstract: Quantum error correction provides a path to reach practical quantum computing by combining multiple physical qubits into a logical qubit, where the logical error rate is suppressed exponentially as more qubits are added. However, this exponential suppression only occurs if the physical error rate is below a critical threshold. In this work, we present two surface code memories operating below this threshold: a distance-7 code and a distance-5 code integrated with a real-time decoder. The logical error rate of our larger quantum memory is suppressed by a factor of $Λ$ = 2.14 $\pm$ 0.02 when increasing the code distance by two, culminating in a 101-qubit distance-7 code with 0.143% $\pm$ 0.003% error per cycle of error correction. This logical memory is also beyond break-even, exceeding its best physical qubit's lifetime by a factor of 2.4 $\pm$ 0.3. We maintain below-threshold performance when decoding in real time, achieving an average decoder latency of 63 $μ$s at distance-5 up to a million cycles, with a cycle time of 1.1 $μ$s. To probe the limits of our error-correction performance, we run repetition codes up to distance-29 and find that logical performance is limited by rare correlated error events occurring approximately once every hour, or 3 $\times$ 10$^9$ cycles. Our results present device performance that, if scaled, could realize the operational requirements of large scale fault-tolerant quantum algorithms.

Submitted 24 August, 2024; originally announced August 2024.

Comments: 10 pages, 4 figures, Supplementary Information
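
As a rough worked reading of the suppression factor quoted in this abstract: if the logical error per cycle $\varepsilon_d$ drops by a factor of $Λ$ each time the code distance $d$ increases by two, then the reported distance-7 figure implies a hypothetical distance-9 value as follows. The extrapolation assumes $Λ$ stays constant beyond distance 7, which the abstract does not claim; uncertainties are ignored.

$$\varepsilon_{d+2} \approx \frac{\varepsilon_d}{\Lambda}, \qquad \varepsilon_9 \approx \frac{0.143\%}{2.14} \approx 0.067\%$$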

arXiv:2408.13604 [pdf]

Thermoelectric signature of quantum criticality in the heavy-fermion superconductor CeRhIn$_5$

Authors: Zi-Yu Cao, Honghong Wang, Chan-Koo Park, Tae Beom Park, Harim Jang, Soonbeom Seo, Sung-Il Kim, Tuson Park

Abstract: The evolution of the Fermi surface across the quantum critical point (QCP), which is relevant for characterizing the quantum criticality and understanding its relation with unconventional superconductivity, is an intriguing subject in the study of strongly correlated electron systems. In this study, we report the thermopower measurements to investigate a change in Fermi surface across the QCP in pure and 4.4% Sn-doped CeRhIn$_5$. Results show that their thermopower behavior differs significantly in the vicinity of their respective pressure-induced QCP. In pure CeRhIn$_5$, a drastic collapse of the thermopower takes place at the Kondo breakdown QCP, where the Fermi surface reconstructs concurrently with the development of the magnetic order. By contrast, the thermopower exhibits a broadly symmetric behavior around the QCP in 4.4% Sn-doped CeRhIn$_5$, which is a characteristic of the spin-density-wave QCP. These observations are consistent with the theoretical expectations and suggest the effectiveness of thermopower measurement in discriminating the nature of quantum criticality in heavy-fermion systems.

Submitted 24 August, 2024; originally announced August 2024.

Comments: 12 pages, 4 figures

arXiv:2408.13092 [pdf, other]

Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning

Authors: Jihwan Oh, Sungnyun Kim, Gahee Kim, Sunghwan Kim, Se-Young Yun

Abstract: Offline multi-agent reinforcement learning (MARL) is increasingly recognized as crucial for effectively deploying RL algorithms in environments where real-time interaction is impractical, risky, or costly. In the offline setting, learning from a static dataset of past interactions allows for the development of robust and safe policies without the need for live data collection, which can be fraught with challenges. Building on this foundational importance, we present EAQ, Episodes Augmentation guided by Q-total loss, a novel approach for offline MARL that utilizes diffusion models. EAQ integrates the Q-total function directly into the diffusion model as guidance to maximize the global returns in an episode, eliminating the need for separate training. Our focus primarily lies on cooperative scenarios, where agents are required to act collectively towards achieving a shared goal, essentially maximizing global returns. Consequently, we demonstrate that our collaborative episode augmentation significantly boosts offline MARL algorithms compared to the original dataset, improving the normalized return by +17.3% and +12.9% for medium and poor behavioral policies in the SMAC simulator, respectively.

Submitted 23 August, 2024; originally announced August 2024.

Comments: Accepted by SPIGM Workshop at ICML 2024 (Structured Probabilistic Inference & Generative Modeling)

arXiv:2408.12890 [pdf, other]

Multiple Areal Feature Aware Transportation Demand Prediction

Authors: Sumin Han, Jisun An, Youngjun Park, Suji Kim, Kitae Jang, Dongman Lee

Abstract: A reliable short-term transportation demand prediction supports the authorities in improving the capability of systems by optimizing schedules, adjusting fleet sizes, and generating new transit networks. A handful of research efforts incorporate one or a few areal features while learning spatio-temporal correlation, to capture similar demand patterns between similar areas. However, urban characteristics are polymorphic, and they need to be understood by multiple areal features such as land use, sociodemographics, and place-of-interest (POI) distribution. In this paper, we propose a novel spatio-temporal multi-feature-aware graph convolutional recurrent network (ST-MFGCRN) that fuses multiple areal features during spatio-temporal understanding. Inside ST-MFGCRN, we devise sentinel attention to calculate the areal similarity matrix by allowing each area to take partial attention if the feature is not useful. We evaluate the proposed model on two real-world transportation datasets, our constructed BusDJ dataset and the benchmark TaxiBJ dataset. Results show that our model outperforms the state-of-the-art baselines by up to 7% on BusDJ and 8% on TaxiBJ.

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.12875 [pdf, other]

Disentangling, Amplifying, and Debiasing: Learning Disentangled Representations for Fair Graph Neural Networks

Authors: Yeon-Chang Lee, Hojung Shin, Sang-Wook Kim

Abstract: Graph Neural Networks (GNNs) have become essential tools for graph representation learning in various domains, such as social media and healthcare. However, they often suffer from fairness issues due to inherent biases in node attributes and graph structure, leading to unfair predictions. To address these challenges, we propose a novel GNN framework, DAB-GNN, that Disentangles, Amplifies, and deBiases attribute, structure, and potential biases in the GNN mechanism. DAB-GNN employs a disentanglement and amplification module that isolates and amplifies each type of bias through specialized disentanglers, followed by a debiasing module that minimizes the distance between subgroup distributions to ensure fairness. Extensive experiments on five datasets demonstrate that DAB-GNN significantly outperforms ten state-of-the-art competitors in terms of achieving an optimal balance between accuracy and fairness.

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.12692 [pdf, other]

Unlocking Intrinsic Fairness in Stable Diffusion

Authors: Eunji Kim, Siwon Kim, Rahim Entezari, Sungroh Yoon

Abstract: Recent text-to-image models like Stable Diffusion produce photo-realistic images but often show demographic biases. Previous debiasing methods focused on training-based approaches, failing to explore the root causes of bias and overlooking Stable Diffusion's potential for unbiased image generation. In this paper, we demonstrate that Stable Diffusion inherently possesses fairness, which can be unlocked to achieve debiased outputs. Through carefully designed experiments, we identify the excessive bonding between text prompts and the diffusion process as a key source of bias. To address this, we propose a novel approach that perturbs text conditions to unleash Stable Diffusion's intrinsic fairness. Our method effectively mitigates bias without additional tuning, while preserving image-text alignment and image quality.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 21 pages, 20 figures; First two authors contributed equally

arXiv:2408.12490 [pdf, other]

Probabilistic Homotopy Optimization for Dynamic Motion Planning

Authors: Shayan Pardis, Matthew Chignoli, Sangbae Kim

Abstract: We present a homotopic approach to solving challenging, optimization-based motion planning problems. The approach uses Homotopy Optimization, which, unlike standard continuation methods for solving homotopy problems, solves a sequence of constrained optimization problems rather than a sequence of nonlinear systems of equations. The insight behind our proposed algorithm is formulating the discovery of this sequence of optimization problems as a search problem in a multidimensional homotopy parameter space. Our proposed algorithm, the Probabilistic Homotopy Optimization algorithm, switches between solve and sample phases, using solutions to easy problems as initial guesses to more challenging problems. We analyze how our algorithm performs in the presence of common challenges to homotopy methods, such as bifurcation, folding, and disconnectedness of the homotopy solution manifold. Finally, we demonstrate its utility via a case study on two dynamic motion planning problems: the cart-pole and the MIT Humanoid.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 8 pages, 9 Figures, 2 Tables, to appear in the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

arXiv:2408.12433 [pdf]

Technology and Performance Benchmarks of IQM's 20-Qubit Quantum Computer

Authors: Leonid Abdurakhimov, Janos Adam, Hasnain Ahmad, Olli Ahonen, Manuel Algaba, Guillermo Alonso, Ville Bergholm, Rohit Beriwal, Matthias Beuerle, Clinton Bockstiegel, Alessio Calzona, Chun Fai Chan, Daniele Cucurachi, Saga Dahl, Rakhim Davletkaliyev, Olexiy Fedorets, Alejandro Gomez Frieiro, Zheming Gao, Johan Guldmyr, Andrew Guthrie, Juha Hassel, Hermanni Heimonen, Johannes Heinsoo, Tuukka Hiltunen, Keiran Holland , et al. (89 additional authors not shown)

Abstract: Quantum computing has tremendous potential to overcome some of the fundamental limitations present in classical information processing. Yet, today's technological limitations in the quality and scaling prevent exploiting its full potential. Quantum computing based on superconducting quantum processing units (QPUs) is among the most promising approaches towards practical quantum advantage. In this article the basic technological approach of IQM Quantum Computers is described covering both the QPU and the rest of the full-stack quantum computer. In particular, the focus is on a 20-qubit quantum computer featuring the Garnet QPU and its architecture, which we will scale up to 150 qubits. We also present QPU and system-level benchmarks, including a median 2-qubit gate fidelity of 99.5% and genuinely entangling all 20 qubits in a Greenberger-Horne-Zeilinger (GHZ) state.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12343 [pdf, other]

Catastrophic Emission of Charges from Near-Extremal Rotating Charged Nariai Black Holes

Authors: Chiang-Mei Chen, Chun-Chih Huang, Sang Pyo Kim, Chun-Yu Wei

Abstract: Kerr-Newman black holes in a de Sitter space have the limit of rotating Nariai black holes with the near-horizon geometry of a warped ${\rm dS}_3 \times {\rm S}^1/Z_2$ when the black hole horizon and the cosmological horizon coincide or approach close to each other. We study the effect of rotation on the emission of charges in the near-extremal rotating charged Nariai black hole and compare it to those from the near-extremal Nariai black hole and near-extremal Kerr-Newman black hole in de Sitter space. The emission has an exponential amplification for charges with high energy and becomes catastrophic when the two horizons are very close to each other. The angular momentum of black holes decreases the mean number of charges by a factor, not by an order of magnitude. We observe a catastrophic emission of boson condensation for charges with the effective energy equal to the chemical potential in the spacelike outer region of the cosmological horizon. Further, the rotating charged Nariai black holes can evolve into singular spacetimes with a naked singularity by the Schwinger pair production.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 11 pages, 6 figures. arXiv admin note: text overlap with arXiv:2309.00218

arXiv:2408.12061 [pdf, other]

Spin-Orbit Coupling for Optical Vortex Generation in van der Waals Materials

Authors: Jaegang Jo, Sujeong Byun, Munseong Bae, Jianwei Wang, Haejun Chung, Sejeong Kim

Abstract: An optical vortex beam has attracted significant attention across diverse applications, including optical manipulation, phase-contrast microscopy, optical communication, and quantum photonics. To utilize vortex generators for integrated photonics, researchers have developed ultra-compact vortex generators using fork gratings, metasurfaces, and integrated microcombs. However, those devices depend on costly, time-consuming nanofabrication and are constrained by the low signal-to-noise ratio due to the fabrication error. As an alternative maneuver, spin-orbit coupling has emerged as a method to obtain the vortex beam by converting spin angular momentum (SAM) without nanostructures. Here, we demonstrate the creation of an optical vortex beam using van der Waals (vdW) materials. The significantly high birefringence of vdW materials allows generation of optical vortex beams with high efficiency in a sub-wavelength thickness. In this work, we utilize an 8-µm-thick hexagonal boron nitride (hBN) crystal for the creation of optical vortices carrying topological charges of +2 and -2. We also present the generation of an optical vortex beam in a 320-nm-thick MoS2 crystal with a conversion efficiency of 0.09. This study paves the way for fabrication-less and ultra-compact optical vortex generators, which can be applied for integrated photonics and large-scale vortex generator arrays.

Submitted 21 August, 2024; originally announced August 2024.

Comments: 15 pages, 7 figures

arXiv:2408.11248 [pdf, ps, other]

Microlensing brown-dwarf companions in binaries detected during the 2022 and 2023 seasons

Authors: Cheongho Han, Ian A. Bond, Andrzej Udalski, Chung-Uk Lee, Andrew Gould, Michael D. Albrow, Sun-Ju Chung, Kyu-Ha Hwang, Youn Kil Jung, Yoon-Hyun Ryu, Yossi Shvartzvald, In-Gu Shin, Jennifer C. Yee, Hongjing Yang, Weicheng Zang, Sang-Mok Cha, Doeon Kim, Dong-Jin Kim, Seung-Lee Kim, Dong-Joo Lee, Yongseok Lee, Byeong-Gon Park, Richard W. Pogge, Fumio Abe, Ken Bando , et al. (41 additional authors not shown)

Abstract: Building on previous works to construct a homogeneous sample of brown dwarfs in binary systems, we investigate microlensing events detected by the Korea Microlensing Telescope Network (KMTNet) survey during the 2022 and 2023 seasons. Given the difficulty in distinguishing brown-dwarf events from those produced by binary lenses with nearly equal-mass components, we analyze all lensing events detected during the seasons that exhibit anomalies characteristic of binary-lens systems. Using the same criteria consistently applied in previous studies, we identify six additional brown dwarf candidates through the analysis of lensing events KMT-2022-BLG-0412, KMT-2022-BLG-2286, KMT-2023-BLG-0201, KMT-2023-BLG-0601, KMT-2023-BLG-1684, and KMT-2023-BLG-1743. An examination of the mass posteriors shows that the median mass of the lens companions ranges from 0.02 $M_\odot$ to 0.05 $M_\odot$, indicating that these companions fall within the brown-dwarf mass range. The mass of the primary lenses ranges from 0.11 $M_\odot$ to 0.68 $M_\odot$, indicating that they are low-mass stars with substantially lower masses compared to the Sun.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 13 pages, 17 figures, 12 tables

arXiv:2408.11217 [pdf, other]

Beyond skyrmion spin texture from quantum Kelvin-Helmholtz instability

Authors: SeungJung Huh, Wooyoung Yun, Gabin Yun, Samgyu Hwang, Kiryang Kwon, Junhyeok Hur, Seungho Lee, Hiromitsu Takeuchi, Se Kwon Kim, Jae-yoon Choi

Abstract: Topology profoundly influences diverse fields of science, providing a powerful framework for classifying phases of matter and predicting nontrivial excitations, such as solitons, vortices, and skyrmions. These topological defects are typically characterized by integer numbers, called topological charges, representing the winding number in their order parameter field. The classification and prediction of topological defects, however, become challenging when singularities are included within the integration domain for calculating the topological charge. While such exotic nonlinear excitations have been proposed in the superfluid $^3$He-A phase and spinor Bose-Einstein condensate of atomic gases, experimental observation of these structures and studies of their stability have long been elusive. Here we report the observation of a singular skyrmion that goes beyond the framework of topology in a ferromagnetic superfluid. The exotic skyrmions are sustained by undergoing anomalous symmetry breaking associated with the eccentric spin singularity and carry half of the elementary charge, distinctive from conventional skyrmions or merons. By successfully realizing the universal regime of the quantum Kelvin-Helmholtz instability, we identified the eccentric fractional skyrmions, produced by emission from a magnetic domain wall and a spontaneous splitting of an integer skyrmion with spin singularities. The singular skyrmions are stable and can be observed after 2 s of hold time. Our results confirm the universality between classical and quantum Kelvin-Helmholtz instabilities and broaden our understanding of complex nonlinear dynamics of nontrivial texture beyond skyrmion in topological quantum systems.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 13 pages, 5 main figures and 7 supplemental figures

arXiv:2408.10727 [pdf, other]

T-matrix representation of optical scattering response: Suggestion for a data format

Authors: Nigar Asadova, Karim Achouri, Kristian Arjas, Baptiste Auguié, Roland Aydin, Alexandre Baron, Dominik Beutel, Bernd Bodermann, Kaoutar Boussaoud, Sven Burger, Minseok Choi, Krzysztof M. Czajkowski, Andrey B. Evlyukhin, Atefeh Fazel-Najafabadi, Ivan Fernandez-Corbaton, Puneet Garg, David Globosits, Ulrich Hohenester, Hongyoon Kim, Seokwoo Kim, Philippe Lalanne, Eric C. Le Ru, Jörg Meyer, Jungho Mun, Lorenzo Pattelli , et al. (17 additional authors not shown)

Abstract: The transition matrix, frequently abbreviated as T-matrix, contains the complete information in a linear approximation of how a spatially localized object scatters an incident field. The T-matrix is used to study the scattering response of an isolated object and describes the optical response of complex photonic materials made from ensembles of individual objects. T-matrices of certain common structures have potentially been calculated again and again all over the world. This is not necessary and constitutes a major challenge for various reasons. First, the resources spent on their computation represent an unsustainable financial and ecological burden. Second, with the onset of machine learning, data is the gold of our era, and it should be freely available to everybody to address novel scientific challenges. Finally, the possibility of reproducing simulations could tremendously improve if the considered T-matrices could be shared. To address these challenges, we found it important to agree on a common data format for T-matrices and to enable their collection from different sources and distribution. This document aims to develop the specifications for storing T-matrices and associated metadata. The specifications should allow maximum freedom to accommodate as many use cases as possible without introducing any ambiguity in the stored data. The common format will assist in setting up a public database of T-matrices.

Submitted 20 August, 2024; originally announced August 2024.

Comments: Submitted to the Journal of Quantitative Spectroscopy and Radiative Transfer

arXiv:2408.10490 [pdf, other]

Analysis of Plan-based Retrieval for Grounded Text Generation

Authors: Ameya Godbole, Nicholas Monath, Seungyeon Kim, Ankit Singh Rawat, Andrew McCallum, Manzil Zaheer

Abstract: In text generation, hallucinations refer to the generation of seemingly coherent text that contradicts established knowledge. One compelling hypothesis is that hallucinations occur when a language model is given a generation task outside its parametric knowledge (due to rarity, recency, domain, etc.). A common strategy to address this limitation is to infuse the language models with retrieval mechanisms, providing the model with relevant knowledge for the task. In this paper, we leverage the planning capabilities of instruction-tuned LLMs and analyze how planning can be used to guide retrieval to further reduce the frequency of hallucinations. We empirically evaluate several variations of our proposed approach on long-form text generation tasks. By improving the coverage of relevant facts, plan-guided retrieval and generation can produce more informative responses while providing a higher rate of attribution to source documents.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.10402 [pdf, other]

doi 10.1016/j.cap.2024.08.009

Stacking-Dependent Van Hove Singularity Shifts in Three-Dimensional Charge Density Waves of Kagome Metals AV$_3$Sb$_5$ (A = K, Rb, Cs)

Authors: Chanchal K. Barman, Sun-Woo Kim, Youngkuk Kim

Abstract: Vanadium-based kagome systems AV$_3$Sb$_5$ (A = K, Rb, Cs) have emerged as paradigmatic examples exhibiting unconventional charge density waves (CDWs) and superconductivity linked to van Hove singularities (VHSs). Despite extensive studies, the three-dimensional (3D) nature of CDW states in these systems remains elusive. This study employs first-principles density functional theory and a tight-binding model to investigate the stacking-dependent electronic structures of 3D CDWs in AV$_3$Sb$_5$, emphasizing the significant role of interlayer coupling in behaviors of the VHSs associated with diverse 3D CDW orders. We develop a minimal 3D tight-binding model and present a detailed analysis of band structures and density of states for various 3D CDW stacking configurations, including those with and without a $π$-phase shift stacking of the inverse star of David, as well as alternating stacking of the inverse star of David and the star of David. We find that VHSs exist below the Fermi level even in 3D CDWs without $π$-phase shift stackings, and that these VHSs shift downward in the $π$-phase shift stacking CDW structure, stabilizing the $2\times2\times2$ $π$-shifted inverse star of David distortions in alternating vanadium layers as the ground state 3D CDW order of AV$_3$Sb$_5$. Our work provides the electronic origin of 3D CDW orders, paving the way for a deeper understanding of CDWs and superconductivity in AV$_3$Sb$_5$ kagome metals.

Submitted 28 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

Comments: 4 figures + Supplementary information

Journal ref: Current Applied Physics 68, 31 (2024)

arXiv:2408.10356 [pdf, other]

Diversity and stylization of the contemporary user-generated visual arts in the complexity-entropy plane

Authors: Seunghwan Kim, Byunghwee Lee, Wonjae Lee

Abstract: The advent of computational and numerical methods in recent times has provided new avenues for analyzing art historiographical narratives and tracing the evolution of art styles therein. Here, we investigate an evolutionary process underpinning the emergence and stylization of contemporary user-generated visual art styles using the complexity-entropy (C-H) plane, which quantifies local structures in paintings. Informatizing 149,780 images curated in DeviantArt and Behance platforms from 2010 to 2020, we analyze the relationship between local information of the C-H space and multi-level image features generated by a deep neural network and a feature extraction algorithm. The results reveal significant statistical relationships between the C-H information of visual artistic styles and the dissimilarities of the multi-level image features over time within groups of artworks. By disclosing a particular C-H region where the diversity of image representations is noticeably manifested, our analyses reveal an empirical condition of emerging styles that are both novel in the C-H plane and characterized by greater stylistic diversity. Our research shows that visual art analyses combined with physics-inspired methodologies and machine learning can provide macroscopic insights into quantitatively mapping relevant characteristics of an evolutionary process underpinning the creative stylization of uncharted visual arts of given groups and time.

Submitted 21 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

Comments: 18 pages, 3 figures, 1 table, SI(4 figures, 3 tables)

arXiv:2408.09806 [pdf, other]

Improved background modeling for dark matter search with COSINE-100

Authors: G. H. Yu, N. Carlin, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. Franca, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee , et al. (33 additional authors not shown)

Abstract: COSINE-100 aims to conclusively test the claimed dark matter annual modulation signal detected by the DAMA/LIBRA collaboration. DAMA/LIBRA has released updated analysis results by lowering the energy threshold to 0.75 keV through various upgrades. They have consistently claimed to have observed the annual modulation. In COSINE-100, it is crucial to lower the energy threshold for a direct comparison with DAMA/LIBRA, which also enhances the sensitivity of the search for low-mass dark matter, enabling COSINE-100 to explore this area. Therefore, it is essential to have a precise and quantitative understanding of the background spectrum across all energy ranges. This study expands the background modeling from 0.7 to 4000 keV using 2.82 years of COSINE-100 data. The modeling has been improved to describe the background spectrum across all energy ranges accurately. Assessments of the background spectrum are presented, considering the nonproportionality of NaI(Tl) crystals at both low and high energies and the characteristic X-rays produced by the interaction of external backgrounds with materials such as copper. Additionally, constraints on the fit parameters obtained from the alpha spectrum modeling fit are integrated into this model. These improvements are detailed in the paper.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.09662 [pdf, other]

CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control

Authors: Se Hwan Jeon, Seungwoo Hong, Ho Jae Lee, Charles Khazoom, Sangbae Kim

Abstract: The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains challenging due to the complexity of formulating and solving optimization problems across thousands of instances. In this work, we present CusADi, an extension of the CasADi symbolic framework to support the parallelization of arbitrary closed-form expressions on GPUs with CUDA. We also formulate a closed-form approximation for solving general optimal control problems, enabling large-scale parallelization and evaluation of MPC controllers. Our results show a ten-fold speedup relative to a similar MPC implementation on the CPU, and we demonstrate the use of CusADi for various applications, including parallel simulation, parameter sweeps, and policy training.

Submitted 18 August, 2024; originally announced August 2024.

Comments: RAL 2024 submission

arXiv:2408.09140 [pdf, other]

Learning to Explore for Stochastic Gradient MCMC

Authors: SeungHyun Kim, Seohyeon Jung, Seonghyeon Kim, Juho Lee

Abstract: Bayesian Neural Networks (BNNs) with high-dimensional parameters pose a challenge for posterior inference due to the multi-modality of the posterior distributions. Stochastic Gradient MCMC (SGMCMC) with cyclical learning rate scheduling is a promising solution, but it requires a large number of sampling steps to explore high-dimensional multi-modal posteriors, making it computationally expensive. In this paper, we propose a meta-learning strategy to build SGMCMC which can efficiently explore the multi-modal target distributions. Our algorithm allows the learned SGMCMC to quickly explore the high-density region of the posterior landscape. Also, we show that this exploration property is transferable to various tasks, even for the ones unseen during a meta-training stage. Using popular image classification benchmarks and a variety of downstream tasks, we demonstrate that our method significantly improves the sampling efficiency, achieving better performance than vanilla SGMCMC without incurring significant computational overhead.

Submitted 17 August, 2024; originally announced August 2024.

arXiv:2408.08577 [pdf, other]

Mechanistic Modeling of Lipid Nanoparticle Formation for the Delivery of Nucleic Acid Therapeutics

Authors: Pavan K. Inguva, Saikat Mukherjee, Pierre J. Walker, Mona A. Kanso, Jie Wang, Yanchen Wu, Vico Tenberg, Srimanta Santra, Shalini Singh, Shin Hyuk Kim, Bernhardt L. Trout, Martin Z. Bazant, Allan S. Myerson, Richard D. Braatz

Abstract: Nucleic acids such as mRNA have emerged as a promising therapeutic modality with the capability of addressing a wide range of diseases. Lipid nanoparticles (LNPs) as a delivery platform for nucleic acids were used in the COVID-19 vaccines and have received much attention. While modern manufacturing processes which involve rapidly mixing an organic stream containing the lipids with an aqueous stream containing the nucleic acids are conceptually straightforward, detailed understanding of LNP formation and structure is still limited and scale-up can be challenging. Mathematical and computational methods are a promising avenue for deepening scientific understanding of the LNP formation process and facilitating improved process development and control. This article describes strategies for the mechanistic modeling of LNP formation, starting with strategies to estimate and predict important physicochemical properties of the various species such as diffusivities and solubilities. Subsequently, a framework is outlined for constructing mechanistic models of reactor- and particle-scale processes. Insights gained from the various models are mapped back to product quality attributes and process insights. Lastly, the use of the models to guide development of advanced process control and optimization strategies is discussed.

Submitted 16 August, 2024; originally announced August 2024.

Comments: 67 pages, 10 figures

arXiv:2408.08144 [pdf, other]

MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU

Authors: Yan Li, So-Eon Kim, Seong-Bae Park, Soyeon Caren Han

Abstract: Although Large Language Models (LLMs) can generate coherent and contextually relevant text, they often struggle to recognise the intent behind the human user's query. Natural Language Understanding (NLU) models, however, interpret the purpose and key information of the user's input to enable responsive interactions. Existing NLU models generally map individual utterances to a dual-level semantic frame, involving sentence-level intent and word-level slot labels. However, real-life conversations primarily consist of multi-turn conversations, involving the interpretation of complex and extended dialogues. Researchers encounter challenges addressing all facets of multi-turn dialogue conversations using a unified single NLU model. This paper introduces a novel approach, MIDAS, leveraging multi-level intent, domain, and slot knowledge distillation for multi-turn NLU. To achieve this, we construct distinct teachers for varying levels of conversation knowledge, namely, sentence-level intent detection, word-level slot filling, and conversation-level domain classification. These teachers are then fine-tuned to acquire specific knowledge of their designated levels. A multi-teacher loss is proposed to facilitate the combination of these multi-level teachers, guiding a student model in multi-turn dialogue tasks. The experimental results demonstrate the efficacy of our model in improving the overall multi-turn conversation understanding, showcasing the potential for advancements in NLU models through the incorporation of multi-level dialogue knowledge distillation techniques.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.08090 [pdf, other]

UV-Plane Beam Mapping for Non-Terrestrial Networks in 3GPP System-Level Simulations

Authors: Dong-Hyun Jung, Sucheol Kim, Miyeon Lee, Joon-Gyu Ryu, Junil Choi

Abstract: Due to the high altitudes and large beam sizes of satellites, the curvature of the Earth's surface can impact system-level performance. To consider this, 3GPP introduces the UV-plane beam mapping for system-level simulations of non-terrestrial networks (NTNs). This paper aims to provide a comprehensive understanding of how beams and user equipments (UEs) are placed on the UV-plane and subsequently mapped to the Earth's surface. We present a general process of projecting UEs on the UV-plane onto the Earth's surface. This process could offer a useful guideline for beam and UE deployment when evaluating the system-level performance of NTNs.

Submitted 15 August, 2024; originally announced August 2024.

Comments: 5 pages, 9 figures, 1 table

arXiv:2408.07947 [pdf, other]

Conditional Brownian Bridge Diffusion Model for VHR SAR to Optical Image Translation

Authors: Seon-Hoon Kim, Dae-won Chung

Abstract: Synthetic Aperture Radar (SAR) imaging technology provides the unique advantage of being able to collect data regardless of weather conditions and time. However, SAR images exhibit complex backscatter patterns and speckle noise, which necessitate expertise for interpretation. Research on translating SAR images into optical-like representations has been conducted to aid the interpretation of SAR data. Nevertheless, existing studies have predominantly utilized low-resolution satellite imagery datasets and have largely been based on Generative Adversarial Networks (GANs), which are known for their training instability and low fidelity. To overcome these limitations of low-resolution data usage and GAN-based approaches, this paper introduces a conditional image-to-image translation approach based on the Brownian Bridge Diffusion Model (BBDM). We conducted comprehensive experiments on the MSAW dataset, a collection of paired SAR and optical images at 0.5 m Very-High-Resolution (VHR). The experimental results indicate that our method surpasses both the Conditional Diffusion Models (CDMs) and the GAN-based models in diverse perceptual quality metrics.

Submitted 20 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

Comments: 5 pages, 2 figures, 1 table

arXiv:2408.07901 [pdf]

Coupling between electrons and charge density wave fluctuation and its possible role in superconductivity

Authors: Yeonghoon Lee, Yeahan Sur, Sunghun Kim, Jaehun Cha, Jounghoon Hyun, Chan-young Lim, Makoto Hashimoto, Donghui Lu, Younsik Kim, Soonsang Huh, Changyoung Kim, Shinichiro Ideta, Kiyohisa Tanaka, Kee Hoon Kim, Yeongkwan Kim

Abstract: In most charge density wave (CDW) systems of different material classes, ranging from traditional correlated systems in low dimensions to recent topological systems with Kagome lattice, superconductivity emerges when the system is driven toward the quantum critical point (QCP) of CDW via external parameters of doping and pressure. Despite this rather universal trend, the essential hinge between CDW and superconductivity has not been established yet. Here, the evidence of coupling between electron and CDW fluctuation is reported, based on a temperature- and intercalation-dependent kink in the angle-resolved photoemission spectra of 2H-Pd$_x$TaSe$_2$. Kinks are observed only when the system is in the CDW phase, regardless of whether a long- or short-range order is established. Notably, the coupling strength is enhanced upon long-range CDW suppression, although the coupling energy scale is reduced. Interestingly, estimation of the superconducting critical temperature by incorporating the observed coupling characteristics into McMillan's equation yields results closely resembling the known values of the superconducting dome. Our results thus highlight a compelling possibility that this new coupling mediates Cooper pairs, which provides new insights on the competing relationship not only for CDW, but also for other competing orders.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 20 pages, 4 figures for the main text. To be published in Advanced Science

arXiv:2408.07790 [pdf, other]

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Authors: Seung Hyun Lee, Junjie Ke, Yinxiao Li, Junfeng He, Steven Hickson, Katie Datsenko, Sangpil Kim, Ming-Hsuan Yang, Irfan Essa, Feng Yang

Abstract: The goal of image cropping is to identify visually appealing crops within an image. Conventional methods rely on specialized architectures trained on specific datasets, which struggle to be adapted to new requirements. Recent breakthroughs in large vision-language models (VLMs) have enabled visual in-context learning without explicit training. However, effective strategies for vision downstream tasks with VLMs remain largely unclear and underexplored. In this paper, we propose an effective approach to leverage VLMs for better image cropping. First, we propose an efficient prompt retrieval mechanism for image cropping to automate the selection of in-context examples. Second, we introduce an iterative refinement strategy to iteratively enhance the predicted crops. The proposed framework, named Cropper, is applicable to a wide range of cropping tasks, including free-form cropping, subject-aware cropping, and aspect ratio-aware cropping. Extensive experiments and a user study demonstrate that Cropper significantly outperforms state-of-the-art methods across several benchmarks.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.07775 [pdf, ps, other]

Sharp quantitative stability estimates for critical points of fractional Sobolev inequalities

Authors: Haixia Chen, Seunghyeok Kim, Juncheng Wei

Abstract: By developing a unified approach based on integral representations, we establish sharp quantitative stability estimates for critical points of the fractional Sobolev inequalities induced by the embedding $\dot{H}^s({\mathbb R}^n) \hookrightarrow L^{2n \over n-2s}({\mathbb R}^n)$ in the whole range of $s \in (0,\frac{n}{2})$.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 36 pages; comments welcome

arXiv:2408.07648 [pdf, other]

See It All: Contextualized Late Aggregation for 3D Dense Captioning

Authors: Minjung Kim, Hyung Suk Lim, Seung Hwan Kim, Soonyoung Lee, Bumsoo Kim, Gunhee Kim

Abstract: 3D dense captioning is a task to localize objects in a 3D scene and generate descriptive sentences for each object. Recent approaches in 3D dense captioning have adopted transformer encoder-decoder frameworks from object detection to build an end-to-end pipeline without hand-crafted components. However, these approaches struggle with contradicting objectives where a single query attention has to simultaneously view both the tightly localized object regions and contextual environment. To overcome this challenge, we introduce SIA (See-It-All), a transformer pipeline that engages in 3D dense captioning with a novel paradigm called late aggregation. SIA simultaneously decodes two sets of queries: context queries and instance queries. The instance query focuses on localization and object attribute descriptions, while the context query versatilely captures the region-of-interest of relationships between multiple objects or with the global scene; the two are then aggregated afterwards (i.e., late aggregation) via simple distance-based measures. To further enhance the quality of contextualized caption generation, we design a novel aggregator to generate a fully informed caption based on the surrounding context, the global environment, and object instances. Extensive experiments on two of the most widely-used 3D dense captioning datasets demonstrate that our proposed method achieves a significant improvement over prior methods.

Submitted 14 August, 2024; originally announced August 2024.

Comments: Accepted to ACL 2024 Findings

arXiv:2408.07506 [pdf, other]

Correlators for pseudo Hermitian systems

Authors: Yao Bai, Ting-Long Feng, Suro Kim, Cheng-Yang Lee, Lei-Hua Liu, Wangping Zhao, Siyi Zhou

Abstract: Pseudo-Hermitian systems are a class of non-Hermitian systems whose Hamiltonian satisfies the condition $\eta^{-1}H^\dagger\eta=H$. We develop the in-in and Schwinger-Keldysh formalism to calculate cosmological correlators for pseudo-Hermitian systems. We study a model consisting of massive symplectic fermions coupled to the primordial curvature perturbation. The three-point function for the primordial curvature perturbation is computed up to one-loop and compared to earlier work where the loop correction comes from a massive scalar boson. The two results differ by a minus sign. Therefore, the one-loop correction to the three-point function cannot be used to distinguish scalar bosons and symplectic fermions. To conclude, we discuss possibilities where the scalar bosons and symplectic fermions may be distinguished.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 19 pages, 2 figures

arXiv:2408.07416 [pdf, other]

Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space

Authors: Hyunjee Lee, Youngsik Yun, Jeongmin Bae, Seoha Kim, Youngjung Uh

Abstract: Understanding the 3D semantics of a scene is a fundamental problem for various scenarios such as embodied agents. While NeRFs and 3DGS excel at novel-view synthesis, previous methods for understanding their semantics have been limited to incomplete 3D understanding: their segmentation results are 2D masks and their supervision is anchored at 2D pixels. This paper revisits the problem set to pursue a better 3D understanding of a scene modeled by NeRFs and 3DGS as follows. 1) We directly supervise the 3D points to train the language embedding field. It achieves state-of-the-art accuracy without relying on multi-scale language embeddings. 2) We transfer the pre-trained language field to 3DGS, achieving the first real-time rendering speed without sacrificing training time or accuracy. 3) We introduce a 3D querying and evaluation protocol for assessing the reconstructed geometry and semantics together. Code, checkpoints, and annotations will be available online. Project page: https://hyunji12.github.io/Open3DRF

Submitted 18 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

Comments: Project page: https://hyunji12.github.io/Open3DRF

arXiv:2408.07372 [pdf, ps, other]

An Adaptive Importance Sampling for Locally Stable Point Processes

Authors: Hee-Geon Kang, Sunggon Kim

Abstract: The problem of finding the expected value of a statistic of a locally stable point process in a bounded region is addressed. We propose an adaptive importance sampling for solving the problem. In our proposal, we restrict the importance point process to the family of homogeneous Poisson point processes, which enables us to quickly generate independent samples of the importance point process. The optimal intensity of the importance point process is found by applying the cross-entropy minimization method. In the proposed scheme, the expected value of the function and the optimal intensity are iteratively estimated in an adaptive manner. We show that the proposed estimator converges to the target value almost surely, and prove its asymptotic normality. We explain how to apply the proposed scheme to the estimation of the intensity of a stationary pairwise interaction point process. The performance of the proposed scheme is compared numerically with the Markov chain Monte Carlo simulation and the perfect sampling.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.06673 [pdf]

Pragmatic inference of scalar implicature by LLMs

Authors: Ye-eun Cho, Seong mook Kim

Abstract: This study investigates how Large Language Models (LLMs), particularly BERT (Devlin et al., 2019) and GPT-2 (Radford et al., 2019), engage in pragmatic inference of scalar implicature, such as "some". Two sets of experiments were conducted using cosine similarity and next sentence/token prediction as experimental methods. The results in experiment 1 showed that both models interpret "some" as the pragmatic implicature "not all" in the absence of context, aligning with human language processing. In experiment 2, in which Question Under Discussion (QUD) was presented as a contextual cue, BERT showed consistent performance regardless of types of QUDs, while GPT-2 encountered processing difficulties since a certain type of QUD required pragmatic inference for implicature. The findings revealed that, in terms of theoretical approaches, BERT inherently incorporates the pragmatic implicature "not all" within the term "some", adhering to the Default model (Levinson, 2000). In contrast, GPT-2 seems to encounter processing difficulties in inferring pragmatic implicature within context, consistent with the Context-driven model (Sperber and Wilson, 2002).

Submitted 13 August, 2024; originally announced August 2024.

Comments: This research was presented at the Association for Computational Linguistics conference, held on August 11-16

arXiv:2408.06287 [pdf, other]

Infant Type Ia Supernovae from the KMTNet I. Multi-Color Evolution and Populations

Authors: Yuan Qi Ni, Dae-Sik Moon, Maria R. Drout, Youngdae Lee, Patrick Sandoval, Jeehye Shin, Hong Soo Park, Sang Chul Kim, Kyuseok Oh

Abstract: We conduct a systematic analysis of the early multi-band light curves and colors of 19 Type Ia Supernovae (SNe) from the Korea Microlensing Telescope Network SN Program, including 16 previously unpublished events. Seven are detected $\lesssim$ 1 day since the estimated epoch of first light and the rest within $\lesssim$ 3 days. Some show excess emission within $<$ 0.5 days to $\sim$ 2 days, but most show pure power-law rises. The colors are initially diverse before $\sim$ 5 days, but converge to a similar color at $\sim$ 10 days. We identify at least three populations based on 2--5-day color evolution: (1) "early-blues" exhibit slowly-evolving colors consistent with a $\sim$ 17,000 K blackbody; (2) "early-reds" have initially blue $B-V$ and red $V-i$ colors that cannot simultaneously be fit with a blackbody -- likely due to suppression of $B$- and $i$-band flux by Fe II/III and Ca II -- and evolve more rapidly; and (3) "early-yellows" evolve blueward, consistent with thermal heating from $\sim$ 8,000 to 13,000 K. The distributions of early-blue and early-red colors are compatible with them being either distinct populations -- with early-reds comprising (60 $\pm$ 15)% of them -- or extreme ends of one continuous population; whereas the early-yellow population identified here is clearly distinct. Compared to the other populations, early-blues in our sample differ by exhibiting excess emission within 1--2 days, nearly constant peak brightness regardless of $\Delta M_{15}(B)$ after standardization, and shallower Si II features. Early-blues also prefer star-forming host environments, while early-yellows and, to a lesser extent, early-reds prefer quiescent ones. These preferences appear to indicate at least two Type Ia SN production channels based on stellar population age, while early-reds and early-blues may still share a common origin.

Submitted 12 August, 2024; originally announced August 2024.

Comments: Submitted for publication in ApJ. 48 pages, 29 figures, 7 tables

arXiv:2408.06127 [pdf, ps, other]

On the Completely Positive Approximation Property for Non-Unital Operator Systems and the Boundary Condition for the Zero Map

Authors: Se-Jin Kim

Abstract: The purpose of this paper is two-fold: firstly, we give a characterization on the level of non-unital operator systems for when the zero map is a boundary representation. As a consequence, we show that a non-unital operator system arising from the direct limit of C*-algebras under positive maps is a C*-algebra if and only if its unitization is a C*-algebra. Secondly, we show that the completely positive approximation property and the completely contractive approximation property of a non-unital operator system are equivalent to its bidual being an injective von Neumann algebra. This implies in particular that all non-unital operator systems with the completely contractive approximation property must necessarily admit an abundance of positive elements.

Submitted 12 August, 2024; originally announced August 2024.

arXiv:2408.06043 [pdf, other]

Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning

Authors: Wonjun Lee, San Kim, Gary Geunbae Lee

Abstract: Recent dialogue systems rely on turn-based spoken interactions, requiring accurate Automatic Speech Recognition (ASR). Errors in ASR can significantly impact downstream dialogue tasks. To address this, using dialogue context from user and agent interactions for transcribing subsequent utterances has been proposed. This method incorporates the transcription of the user's speech and the agent's response as model input, using the accumulated context generated by each turn. However, this context is susceptible to ASR errors because it is generated by the ASR model in an auto-regressive fashion. Such noisy context can further degrade the benefits of context input, resulting in suboptimal ASR performance. In this paper, we introduce Context Noise Representation Learning (CNRL) to enhance robustness against noisy context, ultimately improving dialogue speech recognition accuracy. To maximize the advantage of context awareness, our approach includes decoder pre-training using text-based dialogue data and noise representation learning for a context encoder. Based on the evaluation of speech dialogues, our method shows superior results compared to baselines. Furthermore, the strength of our approach is highlighted in noisy environments where user speech is barely audible due to real-world noise, relying on contextual information to transcribe the input accurately.

Submitted 12 August, 2024; originally announced August 2024.

Comments: 11 pages, 2 figures, Accepted to SIGDIAL2024

arXiv:2408.05784 [pdf, other]

Quantum Support Vector Machine-Based Classification of GPS Signal Reception Conditions

Authors: Suhui Jeong, Sanghyun Kim, Jiwon Seo

Abstract: Global Positioning System (GPS) plays a critical role in navigation by utilizing satellite signals, but its accuracy in urban environments is often compromised by signal obstructions. Previous research has categorized GPS reception conditions into line-of-sight (LOS), non-line-of-sight (NLOS), and LOS+NLOS scenarios to enhance accuracy. This paper introduces a novel approach using quantum support vector machines (QSVM) with a ZZ feature map and fidelity quantum kernel to classify urban GPS signal reception conditions, comparing its performance against classical SVM methods. While classical SVM has been previously explored for this purpose, our study is the first to apply QSVM to this classification task. We conducted experiments using datasets from two distinct urban locations to train and evaluate SVM and QSVM models. Our results demonstrate that QSVM achieves superior classification accuracy compared to classical SVM for urban GPS signal datasets. Additionally, we emphasize the importance of appropriately scaling raw data when utilizing QSVM.

Submitted 11 August, 2024; originally announced August 2024.

Comments: Submitted to IEEE QCE 2024

arXiv:2408.05749 [pdf, other]

Efficient and Versatile Robust Fine-Tuning of Zero-shot Models

Authors: Sungyeon Kim, Boseung Jeong, Donghyun Kim, Suha Kwak

Abstract: Large-scale image-text pre-trained models enable zero-shot classification and provide consistent accuracy across various data distributions. Nonetheless, optimizing these models in downstream tasks typically requires fine-tuning, which reduces generalization to out-of-distribution (OOD) data and demands extensive computational resources. We introduce Robust Adapter (R-Adapter), a novel method for fine-tuning zero-shot models to downstream tasks while simultaneously addressing both these issues. Our method integrates lightweight modules into the pre-trained model and employs novel self-ensemble techniques to boost OOD robustness and reduce storage expenses substantially. Furthermore, we propose MPM-NCE loss designed for fine-tuning on vision-language downstream tasks. It ensures precise alignment of multiple image-text pairs and discriminative feature learning. By extending the benchmark for robust fine-tuning beyond classification to include diverse tasks such as cross-modal retrieval and open vocabulary segmentation, we demonstrate the broad applicability of R-Adapter. Our extensive experiments demonstrate that R-Adapter achieves state-of-the-art performance across a diverse set of tasks, tuning only 13% of the parameters of the CLIP encoders.

Submitted 11 August, 2024; originally announced August 2024.

Comments: Accepted to ECCV 2024

arXiv:2408.05616 [pdf, other]

Emergence of Meron Kekulé lattices in twisted Néel antiferromagnets

Authors: Kyoung-Min Kim, Se Kwon Kim

Abstract: A Kekulé lattice is an exotic, distorted lattice structure distinguished by alternating bond lengths in contrast to naturally formed atomic crystals. While this structure has been explored through atomic crystals and metamaterials, the possibility of forming a Kekulé lattice from topological solitons in magnetic systems has remained elusive. Here, we propose twisted bilayer easy-plane Néel antiferromagnets as a promising platform for achieving a "Meron Kekulé lattice" -- a distorted topological soliton lattice comprised of antiferromagnetic merons as its lattice elements. Using atomistic spin simulations on these magnets, we demonstrate that due to the moiré-induced intricate pattern of interlayer exchange coupling, the cores of these merons are stabilized into the Kekulé-O pattern with different intracell and intercell bond lengths across moiré supercells, hence forming a Meron Kekulé lattice. Furthermore, we showcase that the two bond lengths of the Meron Kekulé lattice can be fine-tuned by adjusting the twist angle and specifics of the interlayer exchange coupling, suggesting extensive control over the meron lattice configuration in contrast to conventional magnetic systems. These discoveries pave the way for exploring topological solitons with distinctive Kekulé attributes, offering intriguing opportunities at the intersection of topological solitons and Kekulé physics.

Submitted 10 August, 2024; originally announced August 2024.

Comments: Supplementary Information is included in the published version

arXiv:2408.05531 [pdf, other]

Electroweak Primordial Magnetic Blackhole: Cosmic Production and Physical Implication

Authors: Y. M. Cho, Sang-Woo Kim, Seung Hun Oh

Abstract: The electroweak monopole, when coupled to gravity, turns into a Reissner-Nordstrom type primordial magnetic blackhole whose mass is bounded below, with the lower bound $M_P \sqrt{\alpha}$. This changes the overall picture of the monopole production mechanism in the early universe drastically and has deep implications in cosmology. In particular, this enhances the possibility that the electroweak monopoles turned into the primordial magnetic blackholes could become the seed of stellar objects and galaxies, and account for the dark matter of the universe. Moreover, this tells us that we have a new type of primordial blackhole different from the popular primordial blackhole in cosmology, the electroweak primordial magnetic blackhole based on a totally different production mechanism. We discuss the physical implications of the electroweak primordial magnetic blackhole.

Submitted 14 August, 2024; v1 submitted 10 August, 2024; originally announced August 2024.

Showing 1–50 of 7,618 results for author: Kim, S