Search | arXiv e-print repository

Shock-driven amorphization and melt in Fe$_2$O$_3$

Authors: Céline Crépisson, Alexis Amouretti, Marion Harmand, Chrystèle Sanloup, Patrick Heighway, Sam Azadi, David McGonegle, Thomas Campbell, David Alexander Chin, Ethan Smith, Linda Hansen, Alessandro Forte, Thomas Gawne, Hae Ja Lee, Bob Nagler, YuanFeng Shi, Guillaume Fiquet, François Guyot, Mikako Makita, Alessandra Benuzzi-Mounaix, Tommaso Vinci, Kohei Miyanishi, Norimasa Ozaki, Tatiana Pikuz, Hirotaka Nakamura, et al. (6 additional authors not shown)

Abstract: We present measurements on Fe$_2$O$_3$ amorphization and melt under laser-driven shock compression up to 209(10) GPa via time-resolved in situ x-ray diffraction. At 122(3) GPa, a diffuse signal is observed indicating the presence of a non-crystalline phase. Structure factors have been extracted up to 182(6) GPa showing the presence of two well-defined peaks. A rapid change in the intensity ratio of the two peaks is identified between 145(10) and 151(10) GPa, indicative of a phase change. Present DFT+$U$ calculations of temperatures along Fe$_2$O$_3$ Hugoniot are in agreement with SESAME 7440 and indicate relatively low temperatures, below 2000 K, up to 150 GPa. The non-crystalline diffuse scattering is thus consistent with the - as yet unreported - shock amorphization of Fe$_2$O$_3$ between 122(3) and 145(10) GPa, followed by an amorphous-to-liquid transition above 151(10) GPa. Upon release, a non-crystalline phase is observed alongside crystalline $α$-Fe$_2$O$_3$. The extracted structure factor and pair distribution function of this release phase resemble those reported for Fe$_2$O$_3$ melt at ambient pressure.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 11 pages, 4 figures, under review

arXiv:2408.16793 [pdf, other]

Separating Super-Puffs vs. Hot Jupiters Among Young Puffy Planets

Authors: Amalia Karalis, Eve J. Lee, Daniel P. Thorngren

Abstract: Discoveries of close-in young puffy (R$_p \gtrsim$ 6 R$_\oplus$) planets raise the question of whether they are bona fide hot Jupiters or puffed-up Neptunes, potentially placing constraints on the formation location and timescale of hot Jupiters. Obtaining mass measurements for these planets is challenging due to stellar activity and noisy spectra. Therefore, we aim to provide independent theoretical constraints on the masses of these young planets based on their radii, incident fluxes, and ages, benchmarking to the planets of age $<$1 Gyr detected by Kepler, K2 and TESS. Through a combination of interior structure models, considerations of photoevaporative mass loss, and empirical mass-metallicity trends, we present the range of possible masses for 24 planets of age $\sim$10-900 Myr and radii $\sim$6-16 R$_\oplus$. We generally find that our mass estimates are in agreement with the measured masses and upper limits where applicable. There exist some outliers including super-puffs Kepler-51 b, c and V1298 Tau d, b, e, for which we outline their likely formation conditions. Our analyses demonstrate that most of the youngest planets ($\lesssim$ 100 Myr) tend to be puffed-up, Neptune-mass planets, while the true hot Jupiters are typically found around stars aged at least a few hundred Myr, suggesting the dominant origin of hot Jupiters to be late-stage high eccentricity migration.

Submitted 28 August, 2024; originally announced August 2024.

Comments: Submitted to AAS journals. Comments welcome

arXiv:2408.16749 [pdf]

Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge

Authors: Beidi Dong, Jin R. Lee, Ziwei Zhu, Balassubramanian Srinivasan

Abstract: The United States has experienced a significant increase in violent extremism, prompting the need for automated tools to detect and limit the spread of extremist ideology online. This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing "far-right" and "far-left" ideological keywords and manually labeled them as extremist or non-extremist. Extremist posts were further classified into one or more of five contributing elements of extremism based on a working definitional framework. The BERT model's performance was evaluated based on training data size and knowledge transfer between categories. We also compared the performance of GPT 3.5 and GPT 4 models using different prompts: naïve, layperson-definition, role-playing, and professional-definition. Results showed that the best performing GPT models outperformed the best performing BERT models, with more detailed prompts generally yielding better results. However, overly complex prompts may impair performance. Different versions of GPT have unique sensitivities to what they consider extremist. GPT 3.5 performed better at classifying far-left extremist posts, while GPT 4 performed better at classifying far-right extremist posts. Large language models, represented by GPT models, hold significant potential for online extremism classification tasks, surpassing traditional BERT models in a zero-shot setting. Future research should explore human-computer interactions in optimizing GPT models for extremist detection and classification tasks to develop more efficient (e.g., quicker, less effort) and effective (e.g., fewer errors or mistakes) methods for identifying extremist content.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16729 [pdf, other]

Prediction-Feedback DETR for Temporal Action Detection

Authors: Jihwan Kim, Miso Lee, Cheol-Ho Cho, Jihyun Lee, Jae-Pil Heo

Abstract: Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications. Leveraging the unique benefits of transformers, various DETR-based approaches have been adopted in TAD. However, it has recently been identified that the attention collapse in self-attention causes the performance degradation of DETR for TAD. Building upon previous research, this paper newly addresses the attention collapse problem in cross-attention within DETR-based TAD methods. Moreover, our findings reveal that cross-attention exhibits patterns distinct from predictions, indicating a short-cut phenomenon. To resolve this, we propose a new framework, Prediction-Feedback DETR (Pred-DETR), which utilizes predictions to restore the collapse and align the cross- and self-attention with predictions. Specifically, we devise novel prediction-feedback objectives using guidance from the relations of the predictions. As a result, Pred-DETR significantly alleviates the collapse and achieves state-of-the-art performance among DETR-based methods on various challenging benchmarks including THUMOS14, ActivityNet-v1.3, HACS, and FineAction.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16598 [pdf, other]

Signatures of Amorphous Shiba State in FeTe$_{0.55}$Se$_{0.45}$

Authors: Jinwon Lee, Sanghun Lee, Andreas Kreisel, Jens Paaske, Brian M. Andersen, Koen M. Bastiaans, Damianos Chatzopoulos, Genda Gu, Doohee Cho, Milan P. Allan

Abstract: The iron-based superconductor FeTe$_{0.55}$Se$_{0.45}$ is a peculiar material: it hosts a surface state with a Dirac dispersion, is a putative topological superconductor hosting Majorana modes in vortices, and has an unusually low Fermi energy. The superconducting state is generally thought to be characterized by three gaps in different bands, with the usual homogenous, spatially extended Bogoliubov excitations -- in this work, we uncover evidence that it is instead of a very different nature. Our scanning tunneling spectroscopy data shows several peaks in the density of states above a full gap, and by analyzing the spatial and junction-resistance dependence of the peaks, we conclude that the peaks above the first one are not coherence peaks from different bands. Instead, comparisons with our simulations indicate that they originate from generalized Shiba states that are spatially overlapping. This can lead to an amorphous state of Bogoliubov quasiparticles, reminiscent of impurity bands in semiconductors. We discuss the origin and implications of this new state.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 6 pages, 4 figures

arXiv:2408.16493 [pdf, other]

Learning from Negative Samples in Generative Biomedical Entity Linking

Authors: Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang

Abstract: Generative models have become widely used in biomedical entity linking (BioEL) due to their excellent performance and efficient memory usage. However, these models are usually trained only with positive samples--entities that match the input mention's identifier--and do not explicitly learn from hard negative samples, which are entities that look similar but have different meanings. To address this limitation, we introduce ANGEL (Learning from Negative Samples in Generative Biomedical Entity Linking), the first framework that trains generative BioEL models using negative samples. Specifically, a generative model is initially trained to generate positive samples from the knowledge base for given input entities. Subsequently, both correct and incorrect outputs are gathered from the model's top-k predictions. The model is then updated to prioritize the correct predictions through direct preference optimization. Our models fine-tuned with ANGEL outperform the previous best baseline models by up to an average top-1 accuracy of 1.4% on five benchmarks. When incorporating our framework into pre-training, the performance improvement further increases to 1.7%, demonstrating its effectiveness in both the pre-training and fine-tuning stages. Our code is available at https://github.com/dmis-lab/ANGEL.

Submitted 29 August, 2024; originally announced August 2024.
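
Illustrative note on the entry above: the abstract says the model is updated to prioritize correct predictions through direct preference optimization (DPO). As a rough sketch only, the standard per-pair DPO loss can be written in a few lines of Python. The function name `dpo_loss`, its argument names, and the default `beta` are assumptions made for illustration; none of them is taken from the ANGEL code base, and the paper's actual training setup may differ.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single (chosen, rejected) output pair.

    All four inputs are sequence log-probabilities: two from the model
    being trained (policy) and two from a frozen reference model.
    `beta` is a hypothetical default, not a value from the paper.
    """
    # How much more the policy favours each output than the reference does.
    chosen_gain = policy_logp_chosen - ref_logp_chosen
    rejected_gain = policy_logp_rejected - ref_logp_rejected
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

With identical policy and reference log-probabilities the loss equals log 2; it shrinks as the policy assigns relatively more probability to the correct output than the reference does.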

arXiv:2408.15811 [pdf, ps, other]

Identifying Influential and Vulnerable Nodes in Interaction Networks through Estimation of Transfer Entropy Between Univariate and Multivariate Time Series

Authors: Julian Lee

Abstract: Transfer entropy (TE) is a powerful tool for measuring causal relationships within interaction networks. Traditionally, TE and its conditional variants are applied pairwise between dynamic variables to infer these causal relationships. However, identifying the most influential or vulnerable node in a system requires measuring the causal influence of each component on the entire system and vice versa. In this paper, I propose using outgoing and incoming transfer entropy, where outgoing TE quantifies the influence of a node on the rest of the system, and incoming TE measures the influence of the rest of the system on the node. The node with the highest outgoing TE is identified as the most influential, or "hub", while the node with the highest incoming TE is the most vulnerable, or "anti-hub". Since these measures involve transfer entropy between univariate and multivariate time series, naive estimation methods can result in significant errors, particularly when the number of variables is comparable to or exceeds the number of samples. To address this, I introduce a novel estimation scheme that computes outgoing and incoming TE only between significantly interacting partners. The feasibility of this approach is demonstrated by using synthetic data, and by applying it to real oral microbiota data. The method successfully identifies the bacterial species known to be key players in the bacterial community, demonstrating the power of the new method.

Submitted 28 August, 2024; originally announced August 2024.

Comments: 35 pages, 18 figures
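
Illustrative note on the entry above: the abstract builds on pairwise transfer entropy before extending it to univariate-versus-multivariate series. As a minimal sketch of the traditional pairwise quantity only, the following Python function computes a naive plug-in estimate for two discrete series with a history length of one. The function name and the history length are assumptions for illustration; this is not the paper's estimation scheme, which is designed precisely to avoid the errors such naive estimators make when variables are many and samples are few.

```python
from collections import Counter
import math


def transfer_entropy(source, target):
    """Naive plug-in estimate of TE(source -> target) in bits.

    Uses discrete symbols and a history length of 1 (an assumption):
    TE = sum p(y', y, x) * log2( p(y' | y, x) / p(y' | y) ),
    where y' is the next target value, y the current target value,
    and x the current source value.
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(target) - 1  # number of observed transitions
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))
    pairs_yy = Counter(zip(target[1:], target[:-1]))
    singles_y = Counter(target[:-1])
    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        p_next_given_both = count / pairs_yx[(y_prev, x_prev)]
        p_next_given_self = pairs_yy[(y_next, y_prev)] / singles_y[y_prev]
        te += p_joint * math.log2(p_next_given_both / p_next_given_self)
    return te
```

If the target simply copies the source with a one-step delay, the estimate is positive; if the source is constant, it carries no information and the estimate is zero.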

arXiv:2408.15620 [pdf, other]

CAPER: Enhancing Career Trajectory Prediction using Temporal Knowledge Graph and Ternary Relationship

Authors: Yeon-Chang Lee, JaeHyun Lee, Michiharu Yamashita, Dongwon Lee, Sang-Wook Kim

Abstract: The problem of career trajectory prediction (CTP) aims to predict one's future employer or job position. While several CTP methods have been developed for this problem, we posit that none of these methods (1) jointly considers the mutual ternary dependency between three key units (i.e., user, position, and company) of a career and (2) captures the characteristic shifts of key units in a career over time, leading to an inaccurate understanding of the job movement patterns in the labor market. To address the above challenges, we propose a novel solution, named CAPER, that solves the challenges via sophisticated temporal knowledge graph (TKG) modeling. It enables the utilization of a graph-structured knowledge base with rich expressiveness, effectively preserving the changes in job movement patterns. Furthermore, we devise an extrapolated career reasoning task on TKG for a realistic evaluation. The experiments on a real-world career trajectory dataset demonstrate that CAPER consistently and significantly outperforms four baselines, two recent TKG reasoning methods, and five state-of-the-art CTP methods in predicting one's future companies and positions, i.e., on average, yielding 6.80% and 34.58% more accurate predictions, respectively.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.15522 [pdf]

Digital cytometry: extraction of forward and side scattering signals from holotomography

Authors: Jaepil Jo, Herve Hugonnet, Mahn Jae Lee, YongKeun Park

Abstract: Flow cytometry is a cornerstone technique in medical and biological research, providing crucial information about cell size and granularity through forward scatter (FSC) and side scatter (SSC) signals. Despite its widespread use, the precise relationship between these scatter signals and corresponding microscopic images remains underexplored. Here, we investigate this intrinsic relationship by utilizing scattering theory and holotomography, a three-dimensional quantitative phase imaging (QPI) technique. We demonstrate the extraction of FSC and SSC signals from individual, unlabeled cells by analyzing their three-dimensional refractive index distributions obtained through holotomography. Additionally, we introduce a method for digitally windowing SSC signals to facilitate effective segmentation and morphology-based cell type classification. Our approach bridges the gap between flow cytometry and microscopic imaging, offering a new perspective on analyzing cellular characteristics with high accuracy and without the need for labeling.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.14855 [pdf, other]

Enhancing Analogical Reasoning in the Abstraction and Reasoning Corpus via Model-Based RL

Authors: Jihwan Lee, Woochang Sim, Sejin Kim, Sundong Kim

Abstract: This paper demonstrates that model-based reinforcement learning (model-based RL) is a suitable approach for the task of analogical reasoning. We hypothesize that model-based RL can solve analogical reasoning tasks more efficiently through the creation of internal models. To test this, we compared DreamerV3, a model-based RL method, with Proximal Policy Optimization, a model-free RL method, on the Abstraction and Reasoning Corpus (ARC) tasks. Our results indicate that model-based RL not only outperforms model-free RL in learning and generalizing from single tasks but also shows significant advantages in reasoning across similar tasks.

Submitted 27 August, 2024; originally announced August 2024.

Comments: Accepted to IJCAI 2024 IARML Workshop

arXiv:2408.14688 [pdf, other]

Lowering threshold of NaI(Tl) scintillator to 0.7 keV in the COSINE-100 experiment

Authors: G. H. Yu, N. Carlin, J. Y. Cho, J. J. Choi, S. Choi, A. C. Ezeribe, L. E. França, C. Ha, I. S. Hahn, S. J. Hollick, E. J. Jeon, H. W. Joo, W. G. Kang, M. Kauer, B. H. Kim, H. J. Kim, J. Kim, K. W. Kim, S. H. Kim, S. K. Kim, W. K. Kim, Y. D. Kim, Y. H. Kim, Y. J. Ko, D. H. Lee, et al. (34 additional authors not shown)

Abstract: COSINE-100 is a direct dark matter search experiment, with the primary goal of testing the annual modulation signal observed by DAMA/LIBRA, using the same target material, NaI(Tl). In previous analyses, we achieved the same 1 keV energy threshold used in the DAMA/LIBRA analysis that reported an annual modulation signal with 11.6$σ$ significance. In this article, we report an improved analysis that lowered the threshold to 0.7 keV, thanks to the application of a Multi-Layer Perceptron network and a new likelihood parameter with waveforms in the frequency domain. The lower threshold would enable a better comparison of COSINE-100 with new DAMA results with a 0.75 keV threshold and account for differences in quenching factors. Furthermore, the lower threshold can enhance COSINE-100's sensitivity to sub-GeV dark matter searches.

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.14608 [pdf, other]

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold

Authors: Lazar Atanackovic, Xi Zhang, Brandon Amos, Mathieu Blanchette, Leo J. Lee, Yoshua Bengio, Alexander Tong, Kirill Neklyudov

Abstract: Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g. the dynamics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level - they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depends on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrating along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions unlike previously proposed methods. We demonstrate the ability of MFM to improve prediction of individual treatment responses on a large scale multi-patient single-cell drug screen dataset.

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.14423 [pdf, other]

DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance

Authors: Jinhyeok Yang, Junhyeok Lee, Hyeong-Seok Choi, Seunghun Ji, Hyeongju Kim, Juheon Lee

Abstract: Text-to-Speech (TTS) models have advanced significantly, aiming to accurately replicate human speech's diversity, including unique speaker identities and linguistic nuances. Despite these advancements, achieving an optimal balance between speaker-fidelity and text-intelligibility remains a challenge, particularly when diverse control demands are considered. Addressing this, we introduce DualSpeech, a TTS model that integrates phoneme-level latent diffusion with dual classifier-free guidance. This approach enables exceptional control over speaker-fidelity and text-intelligibility. Experimental results demonstrate that by utilizing the sophisticated control, DualSpeech surpasses existing state-of-the-art TTS models in performance. Demos are available at https://bit.ly/48Ewoib.

Submitted 27 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: Accepted to INTERSPEECH 2024

arXiv:2408.12799 [pdf, other]

Less for More: Enhancing Preference Learning in Generative Language Models with Automated Self-Curation of Training Corpora

Authors: JoonHo Lee, JuYoun Son, Juree Seok, Wooseok Jang, Yeong-Dae Kwon

Abstract: Ambiguity in language presents challenges in developing more enhanced language models, particularly in preference learning, where variability among annotators results in inconsistently annotated datasets used for model alignment. To address this issue, we introduce a self-curation method that preprocesses annotated datasets by leveraging proxy models trained directly on these datasets. Our method enhances preference learning by automatically detecting and removing ambiguous annotations within the dataset. The proposed approach is validated through extensive experiments, demonstrating a marked improvement in performance across various instruction-following tasks. Our work provides a straightforward and reliable method to overcome annotation inconsistencies, serving as an initial step towards the development of more advanced preference learning techniques.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12727 [pdf, other]

BankTweak: Adversarial Attack against Multi-Object Trackers by Manipulating Feature Banks

Authors: Woojin Shin, Donghwa Kang, Daejin Choi, Brent Kang, Jinkyu Lee, Hyeongboo Baek

Abstract: Multi-object tracking (MOT) aims to construct moving trajectories for objects, and modern multi-object trackers mainly utilize the tracking-by-detection methodology. Initial approaches to MOT attacks primarily aimed to degrade the detection quality of the frames under attack, thereby reducing accuracy only in those specific frames, highlighting a lack of \textit{efficiency}. To improve efficiency, recent advancements manipulate object positions to cause persistent identity (ID) switches during the association phase, even after the attack ends within a few frames. However, these position-manipulating attacks have inherent limitations, as they can be easily counteracted by adjusting distance-related parameters in the association phase, revealing a lack of \textit{robustness}. In this paper, we present \textsf{BankTweak}, a novel adversarial attack designed for MOT trackers, which features efficiency and robustness. \textsf{BankTweak} focuses on the feature extractor in the association phase and reveals vulnerability in the Hungarian matching method used by feature-based MOT systems. Exploiting the vulnerability, \textsf{BankTweak} induces persistent ID switches (addressing \textit{efficiency}) even after the attack ends by strategically injecting altered features into the feature banks without modifying object positions (addressing \textit{robustness}). To demonstrate the applicability, we apply \textsf{BankTweak} to three multi-object trackers (DeepSORT, StrongSORT, and MOTDT) with one-stage, two-stage, anchor-free, and transformer detectors. Extensive experiments on the MOT17 and MOT20 datasets show that our method substantially surpasses existing attacks, exposing the vulnerability of the tracking-by-detection framework to \textsf{BankTweak}.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12726 [pdf, other]

Macro-Queries: An Exploration into Guided Chart Generation from High Level Prompts

Authors: Christopher J. Lee, Giorgio Tran, Roderick Tabalba, Jason Leigh, Ryan Longman

Abstract: This paper explores the intersection of data visualization and Large Language Models (LLMs). Driven by the need to make a broader range of data visualization types accessible for novice users, we present a guided LLM-based pipeline designed to transform data, guided by high-level user questions (referred to as macro-queries), into a diverse set of useful visualizations. This approach leverages various prompting techniques, fine-tuning inspired by Abela's Chart Taxonomy, and integrated SQL tool usage.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12708 [pdf, other]

Revisiting Cross-Domain Problem for LiDAR-based 3D Object Detection

Authors: Ruixiao Zhang, Juheon Lee, Xiaohao Cai, Adam Prugel-Bennett

Abstract: Deep learning models such as convolutional neural networks and transformers have been widely applied to solve 3D object detection problems in the domain of autonomous driving. While existing models have achieved outstanding performance on most open benchmarks, the generalization ability of these deep networks is still in doubt. To adapt models to other domains including different cities, countries, and weather, retraining with the target domain data is currently necessary, which hinders the wide application of autonomous driving. In this paper, we deeply analyze the cross-domain performance of the state-of-the-art models. We observe that most models will overfit the training domains and it is challenging to adapt them to other domains directly. Existing domain adaptation methods for 3D object detection problems are actually shifting the models' knowledge domain instead of improving their generalization ability. We then propose additional evaluation metrics -- the side-view and front-view AP -- to better analyze the core issues of the methods' heavy drops in accuracy levels. By using the proposed metrics and further evaluating the cross-domain performance in each dimension, we conclude that the overfitting problem happens more obviously on the front-view surface and the width dimension which usually faces the sensor and has more 3D points surrounding it. Meanwhile, our experiments indicate that the density of the point cloud data also significantly influences the models' cross-domain performance.

Submitted 22 August, 2024; originally announced August 2024.

Comments: Accepted by the ICONIP 2024

arXiv:2408.12612 [pdf, ps, other]

Stringy Scaling

Authors: Sheng-Hong Lai, Jen-Chi Lee, Yi Yang

Abstract: We discover a general stringy scaling behavior for 1. All n-point hard string scattering amplitudes (HSSA) and 2. A class of n-point Regge string scattering amplitudes (RSSA) to all string loop orders. The number of independent kinematics variables is found to be reduced by dim M.

Submitted 8 August, 2024; originally announced August 2024.

Comments: 11 pages, 1 figure. Talk given by Jen-Chi Lee at Komaba, Tokyo university, Jan.29/2024

arXiv:2408.12293 [pdf, other]

AT-SNN: Adaptive Tokens for Vision Transformer on Spiking Neural Network

Authors: Donghwa Kang, Youngmoon Lee, Eun-Kyu Lee, Brent Kang, Jinkyu Lee, Hyeongboo Baek

Abstract: In the training and inference of spiking neural networks (SNNs), direct training and lightweight computation methods have been orthogonally developed, aimed at reducing power consumption. However, only a limited number of approaches have applied these two mechanisms simultaneously and failed to fully leverage the advantages of SNN-based vision transformers (ViTs) since they were originally designed for convolutional neural networks (CNNs). In this paper, we propose AT-SNN designed to dynamically adjust the number of tokens processed during inference in SNN-based ViTs with direct training, wherein power consumption is proportional to the number of tokens. We first demonstrate the applicability of adaptive computation time (ACT), previously limited to RNNs and ViTs, to SNN-based ViTs, enhancing it to discard less informative spatial tokens selectively. Also, we propose a new token-merge mechanism that relies on the similarity of tokens, which further reduces the number of tokens while enhancing accuracy. We implement AT-SNN to Spikformer and show the effectiveness of AT-SNN in achieving high energy efficiency and accuracy compared to state-of-the-art approaches on the image classification tasks, CIFAR10, CIFAR-100, and TinyImageNet. For example, our approach uses up to 42.4% fewer tokens than the existing best-performing method on CIFAR-100, while conserving higher accuracy.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 8 pages

arXiv:2408.12150 [pdf, other]

DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding

Authors: Jooyoung Lee, Se Yoon Jeong, Munchurl Kim

Abstract: Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress various qualities of images into a single bitstream, increasing the versatility of bitstream utilization and providing high compression efficiency compared to simulcast compression. Research on neural network (NN)-based PIC is in its early stages, mainly focusing on applying varying quantization step sizes to the transformed latent representations in a hierarchical manner. These approaches are designed to compress only the progressively added information as the quality improves, considering that a wider quantization interval for lower-quality compression includes multiple narrower sub-intervals for higher-quality compression. However, the existing methods are based on handcrafted quantization hierarchies, resulting in sub-optimal compression efficiency. In this paper, we propose an NN-based progressive coding method that firstly utilizes learned quantization step sizes via learning for each quantization layer. We also incorporate selective compression with which only the essential representation components are compressed for each quantization layer. We demonstrate that our method achieves significantly higher coding efficiency than the existing approaches with decreased decoding time and reduced model size.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12080 [pdf, other]

Exploring the Feasibility of Automated Data Standardization using Large Language Models for Seamless Positioning

Authors: Max J. L. Lee, Ju Lin, Li-Ta Hsu

Abstract: We propose a feasibility study for real-time automated data standardization leveraging Large Language Models (LLMs) to enhance seamless positioning systems in IoT environments. By integrating and standardizing heterogeneous sensor data from smartphones, IoT devices, and dedicated systems such as Ultra-Wideband (UWB), our study ensures data compatibility and improves positioning accuracy using the Extended Kalman Filter (EKF). The core components include the Intelligent Data Standardization Module (IDSM), which employs a fine-tuned LLM to convert varied sensor data into a standardized format, and the Transformation Rule Generation Module (TRGM), which automates the creation of transformation rules and scripts for ongoing data standardization. Evaluated in real-time environments, our study demonstrates adaptability and scalability, enhancing operational efficiency and accuracy in seamless navigation. This study underscores the potential of advanced LLMs in overcoming sensor data integration complexities, paving the way for more scalable and precise IoT navigation solutions.