-
Shock-driven amorphization and melt in Fe$_2$O$_3$
Authors:
Céline Crépisson,
Alexis Amouretti,
Marion Harmand,
Chrystèle Sanloup,
Patrick Heighway,
Sam Azadi,
David McGonegle,
Thomas Campbell,
David Alexander Chin,
Ethan Smith,
Linda Hansen,
Alessandro Forte,
Thomas Gawne,
Hae Ja Lee,
Bob Nagler,
YuanFeng Shi,
Guillaume Fiquet,
François Guyot,
Mikako Makita,
Alessandra Benuzzi-Mounaix,
Tommaso Vinci,
Kohei Miyanishi,
Norimasa Ozaki,
Tatiana Pikuz,
Hirotaka Nakamura,
et al. (6 additional authors not shown)
Abstract:
We present measurements on Fe$_2$O$_3$ amorphization and melt under laser-driven shock compression up to 209(10) GPa via time-resolved in situ x-ray diffraction. At 122(3) GPa, a diffuse signal is observed indicating the presence of a non-crystalline phase. Structure factors have been extracted up to 182(6) GPa showing the presence of two well-defined peaks. A rapid change in the intensity ratio of the two peaks is identified between 145(10) and 151(10) GPa, indicative of a phase change. Present DFT+$U$ calculations of temperatures along the Fe$_2$O$_3$ Hugoniot are in agreement with SESAME 7440 and indicate relatively low temperatures, below 2000 K, up to 150 GPa. The non-crystalline diffuse scattering is thus consistent with the as yet unreported shock amorphization of Fe$_2$O$_3$ between 122(3) and 145(10) GPa, followed by an amorphous-to-liquid transition above 151(10) GPa. Upon release, a non-crystalline phase is observed alongside crystalline $α$-Fe$_2$O$_3$. The extracted structure factor and pair distribution function of this release phase resemble those reported for Fe$_2$O$_3$ melt at ambient pressure.
Submitted 30 August, 2024;
originally announced August 2024.
-
Separating Super-Puffs vs. Hot Jupiters Among Young Puffy Planets
Authors:
Amalia Karalis,
Eve J. Lee,
Daniel P. Thorngren
Abstract:
Discoveries of close-in young puffy (R$_p \gtrsim$ 6 R$_\oplus$) planets raise the question of whether they are bona fide hot Jupiters or puffed-up Neptunes, potentially placing constraints on the formation location and timescale of hot Jupiters. Obtaining mass measurements for these planets is challenging due to stellar activity and noisy spectra. Therefore, we aim to provide independent theoretical constraints on the masses of these young planets based on their radii, incident fluxes, and ages, benchmarking to the planets of age $<$1 Gyr detected by Kepler, K2 and TESS. Through a combination of interior structure models, considerations of photoevaporative mass loss, and empirical mass-metallicity trends, we present the range of possible masses for 24 planets of age $\sim$10-900 Myr and radii $\sim$6-16 R$_\oplus$. We generally find that our mass estimates are in agreement with the measured masses and upper limits where applicable. There exist some outliers including super-puffs Kepler-51 b, c and V1298 Tau d, b, e, for which we outline their likely formation conditions. Our analyses demonstrate that most of the youngest planets ($\lesssim$ 100 Myr) tend to be puffed-up, Neptune-mass planets, while the true hot Jupiters are typically found around stars aged at least a few hundred Myr, suggesting the dominant origin of hot Jupiters to be late-stage high eccentricity migration.
Submitted 28 August, 2024;
originally announced August 2024.
-
Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge
Authors:
Beidi Dong,
Jin R. Lee,
Ziwei Zhu,
Balassubramanian Srinivasan
Abstract:
The United States has experienced a significant increase in violent extremism, prompting the need for automated tools to detect and limit the spread of extremist ideology online. This study evaluates the performance of Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformers (GPT) in detecting and classifying online domestic extremist posts. We collected social media posts containing "far-right" and "far-left" ideological keywords and manually labeled them as extremist or non-extremist. Extremist posts were further classified into one or more of five contributing elements of extremism based on a working definitional framework. The BERT model's performance was evaluated based on training data size and knowledge transfer between categories. We also compared the performance of GPT 3.5 and GPT 4 models using different prompts: naïve, layperson-definition, role-playing, and professional-definition. Results showed that the best performing GPT models outperformed the best performing BERT models, with more detailed prompts generally yielding better results. However, overly complex prompts may impair performance. Different versions of GPT have unique sensitivities to what they consider extremist. GPT 3.5 performed better at classifying far-left extremist posts, while GPT 4 performed better at classifying far-right extremist posts. Large language models, represented by GPT models, hold significant potential for online extremism classification tasks, surpassing traditional BERT models in a zero-shot setting. Future research should explore human-computer interactions in optimizing GPT models for extremist detection and classification tasks to develop more efficient (e.g., quicker, less effort) and effective (e.g., fewer errors or mistakes) methods for identifying extremist content.
Submitted 29 August, 2024;
originally announced August 2024.
-
Prediction-Feedback DETR for Temporal Action Detection
Authors:
Jihwan Kim,
Miso Lee,
Cheol-Ho Cho,
Jihyun Lee,
Jae-Pil Heo
Abstract:
Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications. Leveraging the unique benefits of transformers, various DETR-based approaches have been adopted in TAD. However, it has recently been identified that the attention collapse in self-attention causes the performance degradation of DETR for TAD. Building upon previous research, this paper newly addresses the attention collapse problem in cross-attention within DETR-based TAD methods. Moreover, our findings reveal that cross-attention exhibits patterns distinct from predictions, indicating a short-cut phenomenon. To resolve this, we propose a new framework, Prediction-Feedback DETR (Pred-DETR), which utilizes predictions to restore the collapse and align the cross- and self-attention with predictions. Specifically, we devise novel prediction-feedback objectives using guidance from the relations of the predictions. As a result, Pred-DETR significantly alleviates the collapse and achieves state-of-the-art performance among DETR-based methods on various challenging benchmarks including THUMOS14, ActivityNet-v1.3, HACS, and FineAction.
Submitted 29 August, 2024;
originally announced August 2024.
-
Signatures of Amorphous Shiba State in FeTe$_{0.55}$Se$_{0.45}$
Authors:
Jinwon Lee,
Sanghun Lee,
Andreas Kreisel,
Jens Paaske,
Brian M. Andersen,
Koen M. Bastiaans,
Damianos Chatzopoulos,
Genda Gu,
Doohee Cho,
Milan P. Allan
Abstract:
The iron-based superconductor FeTe$_{0.55}$Se$_{0.45}$ is a peculiar material: it hosts a surface state with a Dirac dispersion, is a putative topological superconductor hosting Majorana modes in vortices, and has an unusually low Fermi energy. The superconducting state is generally thought to be characterized by three gaps in different bands, with the usual homogeneous, spatially extended Bogoliubov excitations. In this work, we uncover evidence that it is instead of a very different nature. Our scanning tunneling spectroscopy data show several peaks in the density of states above a full gap, and by analyzing the spatial and junction-resistance dependence of the peaks, we conclude that the peaks above the first one are not coherence peaks from different bands. Instead, comparisons with our simulations indicate that they originate from generalized Shiba states that are spatially overlapping. This can lead to an amorphous state of Bogoliubov quasiparticles, reminiscent of impurity bands in semiconductors. We discuss the origin and implications of this new state.
Submitted 29 August, 2024;
originally announced August 2024.
-
Learning from Negative Samples in Generative Biomedical Entity Linking
Authors:
Chanhwi Kim,
Hyunjae Kim,
Sihyeon Park,
Jiwoo Lee,
Mujeen Sung,
Jaewoo Kang
Abstract:
Generative models have become widely used in biomedical entity linking (BioEL) due to their excellent performance and efficient memory usage. However, these models are usually trained only with positive samples--entities that match the input mention's identifier--and do not explicitly learn from hard negative samples, which are entities that look similar but have different meanings. To address this limitation, we introduce ANGEL (Learning from Negative Samples in Generative Biomedical Entity Linking), the first framework that trains generative BioEL models using negative samples. Specifically, a generative model is initially trained to generate positive samples from the knowledge base for given input entities. Subsequently, both correct and incorrect outputs are gathered from the model's top-k predictions. The model is then updated to prioritize the correct predictions through direct preference optimization. Our models fine-tuned with ANGEL outperform the previous best baseline models by up to an average top-1 accuracy of 1.4% on five benchmarks. When incorporating our framework into pre-training, the performance improvement further increases to 1.7%, demonstrating its effectiveness in both the pre-training and fine-tuning stages. Our code is available at https://github.com/dmis-lab/ANGEL.
Submitted 29 August, 2024;
originally announced August 2024.
-
Identifying Influential and Vulnerable Nodes in Interaction Networks through Estimation of Transfer Entropy Between Univariate and Multivariate Time Series
Authors:
Julian Lee
Abstract:
Transfer entropy (TE) is a powerful tool for measuring causal relationships within interaction networks. Traditionally, TE and its conditional variants are applied pairwise between dynamic variables to infer these causal relationships. However, identifying the most influential or vulnerable node in a system requires measuring the causal influence of each component on the entire system and vice versa. In this paper, I propose using outgoing and incoming transfer entropy, where outgoing TE quantifies the influence of a node on the rest of the system and incoming TE measures the influence of the rest of the system on the node. The node with the highest outgoing TE is identified as the most influential, or "hub", while the node with the highest incoming TE is the most vulnerable, or "anti-hub". Since these measures involve transfer entropy between univariate and multivariate time series, naive estimation methods can result in significant errors, particularly when the number of variables is comparable to or exceeds the number of samples. To address this, I introduce a novel estimation scheme that computes outgoing and incoming TE only between significantly interacting partners. The feasibility of this approach is demonstrated using synthetic data and by applying it to a real dataset of oral microbiota. The method successfully identifies the bacterial species known to be key players in the bacterial community, demonstrating the power of the new method.
Submitted 28 August, 2024;
originally announced August 2024.
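For readers unfamiliar with the quantity named in the abstract above, the following is a minimal sketch of the standard pairwise transfer entropy between two discrete time series, estimated by simple plug-in counting with history length 1. It is not the paper's estimation scheme for univariate-to-multivariate TE; the function name `transfer_entropy`, the history length, and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Illustrative only: the standard pairwise definition for discrete
    series, not the estimator proposed in the paper.
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    n = len(target) - 1          # number of (t, t+1) transitions
    if n < 1:
        raise ValueError("need at least two samples")

    triples = Counter()          # counts of (y_next, y_now, x_now)
    pairs_yx = Counter()         # counts of (y_now, x_now)
    pairs_yy = Counter()         # counts of (y_next, y_now)
    singles = Counter()          # counts of y_now
    for t in range(n):
        y_next, y_now, x_now = target[t + 1], target[t], source[t]
        triples[(y_next, y_now, x_now)] += 1
        pairs_yx[(y_now, x_now)] += 1
        pairs_yy[(y_next, y_now)] += 1
        singles[y_now] += 1

    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        # p(y_next | y_now, x_now): prediction using both histories
        p_full = count / pairs_yx[(y_now, x_now)]
        # p(y_next | y_now): prediction using the target's own history
        p_self = pairs_yy[(y_next, y_now)] / singles[y_now]
        te += p_joint * log2(p_full / p_self)
    return te


if __name__ == "__main__":
    # Toy data (hypothetical): y copies x with a one-step delay,
    # so information flows from x to y but not from y to x.
    rng = random.Random(0)
    x = [rng.randint(0, 1) for _ in range(2000)]
    y = [0] + x[:-1]
    print("TE(x -> y):", transfer_entropy(x, y))
    print("TE(y -> x):", transfer_entropy(y, x))
```

In this toy setup the estimate from `x` to `y` should be close to one bit and the reverse direction close to zero. A plug-in estimator of this kind degrades quickly as the number of conditioning variables grows, which is the regime the abstract says naive methods struggle with.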
-
CAPER: Enhancing Career Trajectory Prediction using Temporal Knowledge Graph and Ternary Relationship
Authors:
Yeon-Chang Lee,
JaeHyun Lee,
Michiharu Yamashita,
Dongwon Lee,
Sang-Wook Kim
Abstract:
The problem of career trajectory prediction (CTP) aims to predict one's future employer or job position. While several CTP methods have been developed for this problem, we posit that none of these methods (1) jointly considers the mutual ternary dependency between three key units (i.e., user, position, and company) of a career and (2) captures the characteristic shifts of key units in a career over time, leading to an inaccurate understanding of the job movement patterns in the labor market. To address the above challenges, we propose a novel solution, named CAPER, that solves the challenges via sophisticated temporal knowledge graph (TKG) modeling. It enables the utilization of a graph-structured knowledge base with rich expressiveness, effectively preserving the changes in job movement patterns. Furthermore, we devise an extrapolated career reasoning task on TKG for a realistic evaluation. The experiments on a real-world career trajectory dataset demonstrate that CAPER consistently and significantly outperforms four baselines, two recent TKG reasoning methods, and five state-of-the-art CTP methods in predicting one's future companies and positions, yielding on average 6.80% and 34.58% more accurate predictions, respectively.
Submitted 28 August, 2024;
originally announced August 2024.
-
Digital cytometry: extraction of forward and side scattering signals from holotomography
Authors:
Jaepil Jo,
Herve Hugonnet,
Mahn Jae Lee,
YongKeun Park
Abstract:
Flow cytometry is a cornerstone technique in medical and biological research, providing crucial information about cell size and granularity through forward scatter (FSC) and side scatter (SSC) signals. Despite its widespread use, the precise relationship between these scatter signals and corresponding microscopic images remains underexplored. Here, we investigate this intrinsic relationship by utilizing scattering theory and holotomography, a three-dimensional quantitative phase imaging (QPI) technique. We demonstrate the extraction of FSC and SSC signals from individual, unlabeled cells by analyzing their three-dimensional refractive index distributions obtained through holotomography. Additionally, we introduce a method for digitally windowing SSC signals to facilitate effective segmentation and morphology-based cell type classification. Our approach bridges the gap between flow cytometry and microscopic imaging, offering a new perspective on analyzing cellular characteristics with high accuracy and without the need for labeling.
Submitted 28 August, 2024;
originally announced August 2024.
-
Enhancing Analogical Reasoning in the Abstraction and Reasoning Corpus via Model-Based RL
Authors:
Jihwan Lee,
Woochang Sim,
Sejin Kim,
Sundong Kim
Abstract:
This paper demonstrates that model-based reinforcement learning (model-based RL) is a suitable approach for the task of analogical reasoning. We hypothesize that model-based RL can solve analogical reasoning tasks more efficiently through the creation of internal models. To test this, we compared DreamerV3, a model-based RL method, with Proximal Policy Optimization, a model-free RL method, on the Abstraction and Reasoning Corpus (ARC) tasks. Our results indicate that model-based RL not only outperforms model-free RL in learning and generalizing from single tasks but also shows significant advantages in reasoning across similar tasks.
Submitted 27 August, 2024;
originally announced August 2024.
-
Lowering threshold of NaI(Tl) scintillator to 0.7 keV in the COSINE-100 experiment
Authors:
G. H. Yu,
N. Carlin,
J. Y. Cho,
J. J. Choi,
S. Choi,
A. C. Ezeribe,
L. E. França,
C. Ha,
I. S. Hahn,
S. J. Hollick,
E. J. Jeon,
H. W. Joo,
W. G. Kang,
M. Kauer,
B. H. Kim,
H. J. Kim,
J. Kim,
K. W. Kim,
S. H. Kim,
S. K. Kim,
W. K. Kim,
Y. D. Kim,
Y. H. Kim,
Y. J. Ko,
D. H. Lee,
et al. (34 additional authors not shown)
Abstract:
COSINE-100 is a direct dark matter search experiment, with the primary goal of testing the annual modulation signal observed by DAMA/LIBRA, using the same target material, NaI(Tl). In previous analyses, we achieved the same 1 keV energy threshold used in the DAMA/LIBRA analysis that reported an annual modulation signal with 11.6$σ$ significance. In this article, we report an improved analysis that lowered the threshold to 0.7 keV, thanks to the application of a Multi-Layer Perceptron network and a new likelihood parameter with waveforms in the frequency domain. The lower threshold would enable a better comparison of COSINE-100 with new DAMA results with a 0.75 keV threshold and account for differences in quenching factors. Furthermore, the lower threshold can enhance COSINE-100's sensitivity to sub-GeV dark matter searches.
Submitted 26 August, 2024;
originally announced August 2024.
-
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
Authors:
Lazar Atanackovic,
Xi Zhang,
Brandon Amos,
Mathieu Blanchette,
Leo J. Lee,
Yoshua Bengio,
Alexander Tong,
Kirill Neklyudov
Abstract:
Numerous biological and physical processes can be modeled as systems of interacting entities evolving continuously over time, e.g. the dynamics of communicating cells or physical particles. Learning the dynamics of such systems is essential for predicting the temporal evolution of populations across novel samples and unseen environments. Flow-based models allow for learning these dynamics at the population level - they model the evolution of the entire distribution of samples. However, current flow-based models are limited to a single initial population and a set of predefined conditions which describe different dynamics. We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. That is, the change of the population at any moment in time depends on the population itself due to the interactions between samples. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depends on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrating along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations. Namely, we embed the population of samples using a Graph Neural Network (GNN) and use these embeddings to train a Flow Matching model. This gives MFM the ability to generalize over the initial distributions unlike previously proposed methods. We demonstrate the ability of MFM to improve prediction of individual treatment responses on a large scale multi-patient single-cell drug screen dataset.
Submitted 26 August, 2024;
originally announced August 2024.
-
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance
Authors:
Jinhyeok Yang,
Junhyeok Lee,
Hyeong-Seok Choi,
Seunghun Ji,
Hyeongju Kim,
Juheon Lee
Abstract:
Text-to-Speech (TTS) models have advanced significantly, aiming to accurately replicate human speech's diversity, including unique speaker identities and linguistic nuances. Despite these advancements, achieving an optimal balance between speaker-fidelity and text-intelligibility remains a challenge, particularly when diverse control demands are considered. Addressing this, we introduce DualSpeech, a TTS model that integrates phoneme-level latent diffusion with dual classifier-free guidance. This approach enables exceptional control over speaker-fidelity and text-intelligibility. Experimental results demonstrate that by utilizing the sophisticated control, DualSpeech surpasses existing state-of-the-art TTS models in performance. Demos are available at https://bit.ly/48Ewoib.
Submitted 27 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
Less for More: Enhancing Preference Learning in Generative Language Models with Automated Self-Curation of Training Corpora
Authors:
JoonHo Lee,
JuYoun Son,
Juree Seok,
Wooseok Jang,
Yeong-Dae Kwon
Abstract:
Ambiguity in language presents challenges in developing more enhanced language models, particularly in preference learning, where variability among annotators results in inconsistently annotated datasets used for model alignment. To address this issue, we introduce a self-curation method that preprocesses annotated datasets by leveraging proxy models trained directly on these datasets. Our method enhances preference learning by automatically detecting and removing ambiguous annotations within the dataset. The proposed approach is validated through extensive experiments, demonstrating a marked improvement in performance across various instruction-following tasks. Our work provides a straightforward and reliable method to overcome annotation inconsistencies, serving as an initial step towards the development of more advanced preference learning techniques.
Submitted 22 August, 2024;
originally announced August 2024.
-
BankTweak: Adversarial Attack against Multi-Object Trackers by Manipulating Feature Banks
Authors:
Woojin Shin,
Donghwa Kang,
Daejin Choi,
Brent Kang,
Jinkyu Lee,
Hyeongboo Baek
Abstract:
Multi-object tracking (MOT) aims to construct moving trajectories for objects, and modern multi-object trackers mainly utilize the tracking-by-detection methodology. Initial approaches to MOT attacks primarily aimed to degrade the detection quality of the frames under attack, thereby reducing accuracy only in those specific frames, highlighting a lack of \textit{efficiency}. To improve efficiency, recent advancements manipulate object positions to cause persistent identity (ID) switches during the association phase, even after the attack ends within a few frames. However, these position-manipulating attacks have inherent limitations, as they can be easily counteracted by adjusting distance-related parameters in the association phase, revealing a lack of \textit{robustness}. In this paper, we present \textsf{BankTweak}, a novel adversarial attack designed for MOT trackers, which features efficiency and robustness. \textsf{BankTweak} focuses on the feature extractor in the association phase and reveals vulnerability in the Hungarian matching method used by feature-based MOT systems. Exploiting the vulnerability, \textsf{BankTweak} induces persistent ID switches (addressing \textit{efficiency}) even after the attack ends by strategically injecting altered features into the feature banks without modifying object positions (addressing \textit{robustness}). To demonstrate the applicability, we apply \textsf{BankTweak} to three multi-object trackers (DeepSORT, StrongSORT, and MOTDT) with one-stage, two-stage, anchor-free, and transformer detectors. Extensive experiments on the MOT17 and MOT20 datasets show that our method substantially surpasses existing attacks, exposing the vulnerability of the tracking-by-detection framework to \textsf{BankTweak}.
Submitted 22 August, 2024;
originally announced August 2024.
-
Macro-Queries: An Exploration into Guided Chart Generation from High Level Prompts
Authors:
Christopher J. Lee,
Giorgio Tran,
Roderick Tabalba,
Jason Leigh,
Ryan Longman
Abstract:
This paper explores the intersection of data visualization and Large Language Models (LLMs). Driven by the need to make a broader range of data visualization types accessible for novice users, we present a guided LLM-based pipeline designed to transform data, guided by high-level user questions (referred to as macro-queries), into a diverse set of useful visualizations. This approach leverages various prompting techniques, fine-tuning inspired by Abela's Chart Taxonomy, and integrated SQL tool usage.
Submitted 22 August, 2024;
originally announced August 2024.
-
Revisiting Cross-Domain Problem for LiDAR-based 3D Object Detection
Authors:
Ruixiao Zhang,
Juheon Lee,
Xiaohao Cai,
Adam Prugel-Bennett
Abstract:
Deep learning models such as convolutional neural networks and transformers have been widely applied to solve 3D object detection problems in the domain of autonomous driving. While existing models have achieved outstanding performance on most open benchmarks, the generalization ability of these deep networks is still in doubt. To adapt models to other domains including different cities, countries, and weather, retraining with the target domain data is currently necessary, which hinders the wide application of autonomous driving. In this paper, we deeply analyze the cross-domain performance of the state-of-the-art models. We observe that most models will overfit the training domains and it is challenging to adapt them to other domains directly. Existing domain adaptation methods for 3D object detection problems are actually shifting the models' knowledge domain instead of improving their generalization ability. We then propose additional evaluation metrics -- the side-view and front-view AP -- to better analyze the core issues of the methods' heavy drops in accuracy levels. By using the proposed metrics and further evaluating the cross-domain performance in each dimension, we conclude that the overfitting problem happens more obviously on the front-view surface and the width dimension which usually faces the sensor and has more 3D points surrounding it. Meanwhile, our experiments indicate that the density of the point cloud data also significantly influences the models' cross-domain performance.
Submitted 22 August, 2024;
originally announced August 2024.
-
Stringy Scaling
Authors:
Sheng-Hong Lai,
Jen-Chi Lee,
Yi Yang
Abstract:
We discover a general stringy scaling behavior for 1. All n-point hard string scattering amplitudes (HSSA) and 2. A class of n-point Regge string scattering amplitudes (RSSA) to all string loop orders. The number of independent kinematics variables is found to be reduced by dim M.
Submitted 8 August, 2024;
originally announced August 2024.
-
AT-SNN: Adaptive Tokens for Vision Transformer on Spiking Neural Network
Authors:
Donghwa Kang,
Youngmoon Lee,
Eun-Kyu Lee,
Brent Kang,
Jinkyu Lee,
Hyeongboo Baek
Abstract:
In the training and inference of spiking neural networks (SNNs), direct training and lightweight computation methods have been orthogonally developed, aimed at reducing power consumption. However, only a limited number of approaches have applied these two mechanisms simultaneously, and they have failed to fully leverage the advantages of SNN-based vision transformers (ViTs) since they were originally designed for convolutional neural networks (CNNs). In this paper, we propose AT-SNN, designed to dynamically adjust the number of tokens processed during inference in SNN-based ViTs with direct training, wherein power consumption is proportional to the number of tokens. We first demonstrate the applicability of adaptive computation time (ACT), previously limited to RNNs and ViTs, to SNN-based ViTs, enhancing it to discard less informative spatial tokens selectively. Also, we propose a new token-merge mechanism that relies on the similarity of tokens, which further reduces the number of tokens while enhancing accuracy. We implement AT-SNN on Spikformer and show the effectiveness of AT-SNN in achieving high energy efficiency and accuracy compared to state-of-the-art approaches on the image classification tasks CIFAR-10, CIFAR-100, and TinyImageNet. For example, our approach uses up to 42.4% fewer tokens than the existing best-performing method on CIFAR-100, while maintaining higher accuracy.
Submitted 22 August, 2024;
originally announced August 2024.
-
DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding
Authors:
Jooyoung Lee,
Se Yoon Jeong,
Munchurl Kim
Abstract:
Unlike fixed- or variable-rate image coding, progressive image coding (PIC) aims to compress various qualities of images into a single bitstream, increasing the versatility of bitstream utilization and providing high compression efficiency compared to simulcast compression. Research on neural network (NN)-based PIC is in its early stages, mainly focusing on applying varying quantization step sizes to the transformed latent representations in a hierarchical manner. These approaches are designed to compress only the progressively added information as the quality improves, considering that a wider quantization interval for lower-quality compression includes multiple narrower sub-intervals for higher-quality compression. However, the existing methods are based on handcrafted quantization hierarchies, resulting in sub-optimal compression efficiency. In this paper, we propose an NN-based progressive coding method that firstly utilizes learned quantization step sizes via learning for each quantization layer. We also incorporate selective compression with which only the essential representation components are compressed for each quantization layer. We demonstrate that our method achieves significantly higher coding efficiency than the existing approaches with decreased decoding time and reduced model size.
Submitted 22 August, 2024;
originally announced August 2024.
-
Exploring the Feasibility of Automated Data Standardization using Large Language Models for Seamless Positioning
Authors:
Max J. L. Lee,
Ju Lin,
Li-Ta Hsu
Abstract:
We propose a feasibility study for real-time automated data standardization leveraging Large Language Models (LLMs) to enhance seamless positioning systems in IoT environments. By integrating and standardizing heterogeneous sensor data from smartphones, IoT devices, and dedicated systems such as Ultra-Wideband (UWB), our study ensures data compatibility and improves positioning accuracy using the Extended Kalman Filter (EKF). The core components include the Intelligent Data Standardization Module (IDSM), which employs a fine-tuned LLM to convert varied sensor data into a standardized format, and the Transformation Rule Generation Module (TRGM), which automates the creation of transformation rules and scripts for ongoing data standardization. Evaluated in real-time environments, our study demonstrates adaptability and scalability, enhancing operational efficiency and accuracy in seamless navigation. This study underscores the potential of advanced LLMs in overcoming sensor data integration complexities, paving the way for more scalable and precise IoT navigation solutions.
Submitted 21 August, 2024;
originally announced August 2024.
-
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound
Authors:
Junwon Lee,
Jaekwon Im,
Dabin Kim,
Juhan Nam
Abstract:
Foley sound synthesis is crucial for multimedia production, enhancing user experience by synchronizing audio and video both temporally and semantically. Recent studies on automating this labor-intensive process through video-to-sound generation face significant challenges. Systems lacking explicit temporal features suffer from poor controllability and alignment, while timestamp-based models require costly and subjective human annotation. We propose Video-Foley, a video-to-sound system using Root Mean Square (RMS) as a temporal event condition with semantic timbre prompts (audio or text). RMS, a frame-level intensity envelope feature closely related to audio semantics, ensures high controllability and synchronization. The annotation-free self-supervised learning framework consists of two stages, Video2RMS and RMS2Sound, incorporating novel ideas including RMS discretization and RMS-ControlNet with a pretrained text-to-audio model. Our extensive evaluation shows that Video-Foley achieves state-of-the-art performance in audio-visual alignment and controllability for sound timing, intensity, timbre, and nuance. Code, model weights, and demonstrations are available on the accompanying website. (https://jnwnlee.github.io/video-foley-demo)
Submitted 21 August, 2024;
originally announced August 2024.
-
Spin-orbit-splitting-driven nonlinear Hall effect in NbIrTe4
Authors:
Ji-Eun Lee,
Aifeng Wang,
Shuzhang Chen,
Minseong Kwon,
Jinwoong Hwang,
Minhyun Cho,
Ki-Hoon Son,
Dong-Soo Han,
Jun Woo Choi,
Young Duck Kim,
Sung-Kwan Mo,
Cedomir Petrovic,
Choongyu Hwang,
Se Young Park,
Chaun Jang,
Hyejin Ryu
Abstract:
The Berry curvature dipole (BCD) serves as one of the fundamental contributors to the emergence of the nonlinear Hall effect (NLHE). Despite intense interest due to its potential for new technologies reaching beyond the quantum efficiency limit, the interplay between BCD and NLHE remains poorly understood in the absence of a systematic study of the electronic band structure. Here, we report NLHE realized in NbIrTe4 that persists above room temperature coupled with a sign change in the Hall conductivity at 150 K. First-principles calculations combined with angle-resolved photoemission spectroscopy (ARPES) measurements show that BCD tuned by the partial occupancy of spin-orbit split bands via temperature is responsible for the temperature-dependent NLHE. Our findings highlight the correlation between BCD and the electronic band structure, providing a viable route to create and engineer the non-trivial Hall effect by tuning the geometric properties of quasiparticles in transition-metal chalcogen compounds.
Submitted 21 August, 2024;
originally announced August 2024.
-
Losses resistant verification of quantum non-Gaussian photon statistics
Authors:
Riccardo Checchinato,
Jan-Heinrich Littmann,
Lukáš Lachman,
Jaewon Lee,
Sven Höfling,
Christian Schneider,
Radim Filip,
Ana Predojević
Abstract:
Quantum non-Gaussian states of light have fundamental properties that are essential for a multitude of applications in quantum technology. However, many of these features are difficult to detect using standard criteria due to optical losses and detector inefficiency. As the statistics of light are unknown, the loss correction on the data is unreliable, despite the fact that the losses can be precisely measured. To address this issue, we employ a loss-mitigated verification technique utilising quantum non-Gaussian witnesses, which incorporate the known optical losses and detector inefficiency into their derivation. This approach allows us to address the considerable challenge of experimentally demonstrating unheralded quantum non-Gaussian states of single photons and photon pairs.
Submitted 21 August, 2024;
originally announced August 2024.
-
Video Diffusion Models are Strong Video Inpainter
Authors:
Minhyeok Lee,
Suhwan Cho,
Chajin Shin,
Jungho Lee,
Sunghun Yang,
Sangyoun Lee
Abstract:
Propagation-based video inpainting using optical flow at the pixel or feature level has recently garnered significant attention. However, it has limitations such as the inaccuracy of optical flow prediction and the propagation of noise over time. These issues result in non-uniform noise and time consistency problems throughout the video, which are particularly pronounced when the removed area is large and involves substantial movement. To address these issues, we propose a novel First Frame Filling Video Diffusion Inpainting model (FFF-VDI). We design FFF-VDI inspired by the capabilities of pre-trained image-to-video diffusion models that can transform the first frame image into a highly natural video. To apply this to the video inpainting task, we propagate the noise latent information of future frames to fill the masked areas of the first frame's noise latent code. Next, we fine-tune the pre-trained image-to-video diffusion model to generate the inpainted video. The proposed model addresses the limitations of existing methods that rely on optical flow quality, producing much more natural and temporally consistent videos. This proposed approach is the first to effectively integrate image-to-video diffusion models into video inpainting tasks. Through various comparative experiments, we demonstrate that the proposed model can robustly handle diverse inpainting types with high quality.
Submitted 21 August, 2024;
originally announced August 2024.
-
GeoReasoner: Reasoning On Geospatially Grounded Context For Natural Language Understanding
Authors:
Yibo Yan,
Joey Lee
Abstract:
In human reading and communication, individuals tend to engage in geospatial reasoning, which involves recognizing geographic entities and making informed inferences about their interrelationships. To mimic such a cognitive process, current methods either utilize conventional natural language understanding toolkits, or directly apply models pretrained on geo-related natural language corpora. However, these methods face two significant challenges: i) they do not generalize well to unseen geospatial scenarios, and ii) they overlook the importance of integrating geospatial context from geographical databases with linguistic information from the Internet. To handle these challenges, we propose GeoReasoner, a language model capable of reasoning on geospatially grounded natural language. Specifically, it first leverages Large Language Models (LLMs) to generate a comprehensive location description based on linguistic and geospatial information. It also encodes direction and distance information into a spatial embedding by treating them as pseudo-sentences. Consequently, the model is trained on both anchor-level and neighbor-level inputs to learn geo-entity representation. Extensive experimental results demonstrate GeoReasoner's superiority in three tasks: toponym recognition, toponym linking, and geo-entity typing, compared to the state-of-the-art baselines.
Submitted 21 August, 2024;
originally announced August 2024.
-
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models
Authors:
Hyeongmin Lee,
Jin-Young Kim,
Kyungjune Baek,
Jihwan Kim,
Hyojun Go,
Seongsu Ha,
Seokjin Han,
Jiho Jang,
Raehyuk Jung,
Daewoo Kim,
GeunOh Kim,
JongMok Kim,
Jongseok Kim,
Junwan Kim,
Soonwoo Kwon,
Jangwon Lee,
Seungjoon Park,
Minjoon Seo,
Jay Suh,
Jaehyuk Yi,
Aiden Lee
Abstract:
In this work, we discuss evaluating video foundation models in a fair and robust manner. Unlike language or image foundation models, many video foundation models are evaluated with differing parameters (such as sampling rate, number of frames, pretraining steps, etc.), making fair and robust comparisons challenging. Therefore, we present a carefully designed evaluation framework for measuring two core capabilities of video comprehension: appearance and motion understanding. Our findings reveal that existing video foundation models, whether text-supervised like UMT or InternVideo2, or self-supervised like V-JEPA, exhibit limitations in at least one of these capabilities. As an alternative, we introduce TWLV-I, a new video foundation model that constructs robust visual representations for both motion- and appearance-based videos. Based on the average top-1 accuracy of linear probing on five action recognition benchmarks, pretrained only on publicly accessible datasets, our model shows a 4.6%p improvement compared to V-JEPA (ViT-L) and a 7.7%p improvement compared to UMT (ViT-L). Even when compared to much larger models, our model demonstrates a 7.2%p improvement compared to DFN (ViT-H), a 2.7%p improvement compared to V-JEPA (ViT-H) and a 2.8%p improvement compared to InternVideo2 (ViT-g). We provide embedding vectors obtained by TWLV-I from videos of several commonly used video benchmarks, along with evaluation source code that can directly utilize these embeddings. The code is available at https://github.com/twelvelabs-io/video-embeddings-evaluation-framework.
Submitted 22 August, 2024; v1 submitted 20 August, 2024;
originally announced August 2024.
-
Motor-driven microtubule diffusion in a photobleached dynamical coordinate system
Authors:
Soichi Hirokawa,
Heun Jin Lee,
Rachel A Banks,
Ana I Duarte,
Bibi Najma,
Matt Thomson,
Rob Phillips
Abstract:
Motor-driven cytoskeletal remodeling in cellular systems can often be accompanied by a diffusive-like effect at local scales, but distinguishing the contributions of the ordering process, such as active contraction of a network, from this active diffusion is difficult to achieve. Using light-dimerizable kinesin motors to spatially control the formation and contraction of a microtubule network, we deliberately photobleach a grid pattern onto the filament network serving as a transient and dynamic coordinate system to observe the deformation and translation of the remaining fluorescent squares of microtubules. We find that the network contracts at a rate set by motor speed but is accompanied by a diffusive-like spread throughout the bulk of the contracting network with effective diffusion constant two orders of magnitude lower than that for a freely-diffusing microtubule. We further find that on micron scales, the diffusive timescale is only a factor of approximately 3 slower than that of advection regardless of conditions, showing that the global contraction and long-time relaxation from this diffusive behavior are both motor-driven but exhibit local competition within the network bulk.
Submitted 20 August, 2024;
originally announced August 2024.
-
Measurement of inclusive jet cross section and substructure in $p$$+$$p$ collisions at $\sqrt{s_{_{NN}}}=200$ GeV
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
J. Alexander,
M. Alfred,
V. Andrieux,
S. Antsupov,
K. Aoki,
N. Apadula,
H. Asano,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
X. Bai,
N. S. Bandara,
B. Bannier,
E. Bannikov,
K. N. Barish,
S. Bathe
, et al. (422 additional authors not shown)
Abstract:
The jet cross-section and jet-substructure observables in $p$$+$$p$ collisions at $\sqrt{s}=200$ GeV were measured by the PHENIX Collaboration at the Relativistic Heavy Ion Collider (RHIC). Jets are reconstructed from charged-particle tracks and electromagnetic-calorimeter clusters using the anti-$k_{t}$ algorithm with a jet radius $R=0.3$ for jets with transverse momentum within $8.0<p_T<40.0$ GeV/$c$ and pseudorapidity $|η|<0.15$. Measurements include the jet cross section, as well as distributions of SoftDrop-groomed momentum fraction ($z_g$), charged-particle transverse momentum with respect to jet axis ($j_T$), and radial distributions of charged particles within jets ($r$). Also measured was the distribution of $ξ=-\ln(z)$, where $z$ is the fraction of the jet momentum carried by the charged particle. The measurements are compared to theoretical next-to- and next-to-next-to-leading-order calculations, the PYTHIA event generator, and other existing experimental results. These measurements indicate a lower particle multiplicity in jets at RHIC energies when compared to models. Also noted are implications for future jet measurements with sPHENIX at RHIC as well as at the future Electron-Ion Collider.
Submitted 20 August, 2024;
originally announced August 2024.
-
An Efficient Sign Language Translation Using Spatial Configuration and Motion Dynamics with LLMs
Authors:
Eui Jun Hwang,
Sukmin Cho,
Junmyeong Lee,
Jong C. Park
Abstract:
Gloss-free Sign Language Translation (SLT) converts sign videos directly into spoken language sentences without relying on glosses. Recently, Large Language Models (LLMs) have shown remarkable translation performance in gloss-free methods by harnessing their powerful natural language generation capabilities. However, these methods often rely on domain-specific fine-tuning of visual encoders to achieve optimal results. By contrast, this paper emphasizes the importance of capturing the spatial configurations and motion dynamics inherent in sign language. With this in mind, we introduce Spatial and Motion-based Sign Language Translation (SpaMo), a novel LLM-based SLT framework. The core idea of SpaMo is simple yet effective. We first extract spatial and motion features using off-the-shelf visual encoders and then input these features into an LLM with a language prompt. Additionally, we employ a visual-text alignment process as a warm-up before the SLT supervision. Our experiments demonstrate that SpaMo achieves state-of-the-art performance on two popular datasets, PHOENIX14T and How2Sign.
Submitted 20 August, 2024;
originally announced August 2024.
-
AI-Driven Review Systems: Evaluating LLMs in Scalable and Bias-Aware Academic Reviews
Authors:
Keith Tyser,
Ben Segev,
Gaston Longhitano,
Xin-Yu Zhang,
Zachary Meeks,
Jason Lee,
Uday Garg,
Nicholas Belsten,
Avi Shporer,
Madeleine Udell,
Dov Te'eni,
Iddo Drori
Abstract:
Automatic reviewing helps handle a large volume of papers, provides early feedback and quality control, reduces bias, and allows the analysis of trends. We evaluate the alignment of automatic paper reviews with human reviews using an arena of human preferences by pairwise comparisons. Gathering human preference may be time-consuming; therefore, we also use an LLM to automatically evaluate reviews to increase sample efficiency while reducing bias. In addition to evaluating human and LLM preferences among LLM reviews, we fine-tune an LLM to predict human preferences, predicting which reviews humans will prefer in a head-to-head battle between LLMs. We artificially introduce errors into papers and analyze the LLM's responses to identify limitations, use adaptive review questions, meta prompting, role-playing, integrate visual and textual analysis, use venue-specific reviewing materials, and predict human preferences, improving upon the limitations of the traditional review processes. We make the reviews of publicly available arXiv and open-access Nature journal papers available online, along with a free service which helps authors review and revise their research papers and improve their quality. This work develops proof-of-concept LLM reviewing systems that quickly deliver consistent, high-quality reviews and evaluate their quality. We mitigate the risks of misuse, inflated review scores, overconfident ratings, and skewed score distributions by augmenting the LLM with multiple documents, including the review form, reviewer guide, code of ethics and conduct, area chair guidelines, and previous year statistics, by finding which errors and shortcomings of the paper may be detected by automated reviews, and evaluating pairwise reviewer preferences. This work identifies and addresses the limitations of using LLMs as reviewers and evaluators and enhances the quality of the reviewing process.
Submitted 19 August, 2024;
originally announced August 2024.
-
Implementing OpenMP for Zig to enable its use in HPC context
Authors:
David Kacs,
Nick Brown,
Joseph Lee
Abstract:
This extended abstract explores supporting OpenMP in the Zig programming language. Whilst C and Fortran are currently the main languages used to implement HPC applications, Zig provides a similar level of performance complemented with several modern language features, such as enforcing memory safety. However, Zig lacks support for OpenMP, which is the de facto threaded programming technology.
Leveraging Zig's LLVM compiler tooling, we have added partial support for OpenMP to the Zig compiler and demonstrated that the performance attained by using Zig with OpenMP is comparable to, and in some cases exceeds, that of conventional HPC languages. Consequently, we demonstrate that Zig is a viable and important programming technology to use for HPC, and this work paves the way for more HPC features to be added to Zig, ultimately providing HPC developers with the option of using a safer, more modern language for creating high performance applications.
Submitted 19 August, 2024;
originally announced August 2024.
-
Improved background modeling for dark matter search with COSINE-100
Authors:
G. H. Yu,
N. Carlin,
J. Y. Cho,
J. J. Choi,
S. Choi,
A. C. Ezeribe,
L. E. Franca,
C. Ha,
I. S. Hahn,
S. J. Hollick,
E. J. Jeon,
H. W. Joo,
W. G. Kang,
M. Kauer,
B. H. Kim,
H. J. Kim,
J. Kim,
K. W. Kim,
S. H. Kim,
S. K. Kim,
W. K. Kim,
Y. D. Kim,
Y. H. Kim,
Y. J. Ko,
D. H. Lee
, et al. (33 additional authors not shown)
Abstract:
COSINE-100 aims to conclusively test the claimed dark matter annual modulation signal detected by the DAMA/LIBRA collaboration. DAMA/LIBRA has released updated analysis results by lowering the energy threshold to 0.75 keV through various upgrades. They have consistently claimed to have observed the annual modulation. In COSINE-100, it is crucial to lower the energy threshold for a direct comparison with DAMA/LIBRA, which also enhances the sensitivity of the search for low-mass dark matter, enabling COSINE-100 to explore this area. Therefore, it is essential to have a precise and quantitative understanding of the background spectrum across all energy ranges. This study expands the background modeling from 0.7 to 4000 keV using 2.82 years of COSINE-100 data. The modeling has been improved to describe the background spectrum across all energy ranges accurately. Assessments of the background spectrum are presented, considering the nonproportionality of NaI(Tl) crystals at both low and high energies and the characteristic X-rays produced by the interaction of external backgrounds with materials such as copper. Additionally, constraints on the fit parameters obtained from the alpha spectrum modeling fit are integrated into this model. These improvements are detailed in the paper.
Submitted 19 August, 2024;
originally announced August 2024.
-
Hear Your Face: Face-based voice conversion with F0 estimation
Authors:
Jaejun Lee,
Yoori Oh,
Injune Hwang,
Kyogu Lee
Abstract:
This paper delves into the emerging field of face-based voice conversion, leveraging the unique relationship between an individual's facial features and their vocal characteristics. We present a novel face-based voice conversion framework that particularly utilizes the average fundamental frequency of the target speaker, derived solely from their facial images. Through extensive analysis, our framework demonstrates superior speech generation quality and the ability to align facial features with voice characteristics, including tracking of the target speaker's fundamental frequency.
Submitted 19 August, 2024;
originally announced August 2024.
-
Volatile MoS${_2}$ Memristors with Lateral Silver Ion Migration for Artificial Neuron Applications
Authors:
Sofia Cruces,
Mohit D. Ganeriwala,
Jimin Lee,
Lukas Völkel,
Dennis Braun,
Annika Grundmann,
Ke Ran,
Enrique G. Marín,
Holger Kalisch,
Michael Heuken,
Andrei Vescan,
Joachim Mayer,
Andrés Godoy,
Alwin Daus,
Max C. Lemme
Abstract:
Layered two-dimensional (2D) semiconductors have shown enhanced ion migration capabilities along their van der Waals (vdW) gaps and on their surfaces. This effect can be employed for resistive switching (RS) in devices for emerging memories, selectors, and neuromorphic computing. To date, all lateral molybdenum disulfide (MoS${_2}$)-based volatile RS devices with silver (Ag) ion migration have been demonstrated using exfoliated, single-crystal MoS${_2}$ flakes requiring a forming step to enable RS. Here, we present volatile RS with multilayer MoS${_2}$ grown by metal-organic chemical vapor deposition (MOCVD) with repeatable forming-free operation. The devices show highly reproducible volatile RS with low operating voltages of approximately 2 V and fast switching times down to 130 ns considering their micrometer scale dimensions. We investigate the switching mechanism based on Ag ion surface migration through transmission electron microscopy, electronic transport modeling, and density functional theory. Finally, we develop a physics-based compact model and explore the implementation of our volatile memristors as artificial neurons in neuromorphic systems.
Submitted 19 August, 2024;
originally announced August 2024.
-
Partial-Multivariate Model for Forecasting
Authors:
Jaehoon Lee,
Hankook Lee,
Sungik Choi,
Sungjun Cho,
Moontae Lee
Abstract:
When solving forecasting problems including multiple time-series features, existing approaches often fall into two extreme categories, depending on whether to utilize inter-feature information: univariate and complete-multivariate models. Unlike univariate cases which ignore the information, complete-multivariate models compute relationships among a complete set of features. However, despite the potential advantage of leveraging the additional information, complete-multivariate models sometimes underperform univariate ones. Therefore, our research aims to explore a middle ground between these two by introducing what we term Partial-Multivariate models where a neural network captures only partial relationships, that is, dependencies within subsets of all features. To this end, we propose PMformer, a Transformer-based partial-multivariate model, with its training algorithm. We demonstrate that PMformer outperforms various univariate and complete-multivariate models, providing a theoretical rationale and empirical analysis for its superiority. Additionally, by proposing an inference technique for PMformer, the forecasting accuracy is further enhanced. Finally, we highlight other advantages of PMformer: efficiency and robustness under missing features.
Submitted 19 August, 2024;
originally announced August 2024.
-
CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control
Authors:
Se Hwan Jeon,
Seungwoo Hong,
Ho Jae Lee,
Charles Khazoom,
Sangbae Kim
Abstract:
The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains challenging due to the complexity of formulating and solving optimization problems across thousands of instances. In this work, we present CusADi, an extension of the CasADi symbolic framework to support the parallelization of arbitrary closed-form expressions on GPUs with CUDA. We also formulate a closed-form approximation for solving general optimal control problems, enabling large-scale parallelization and evaluation of MPC controllers. Our results show a ten-fold speedup relative to a similar MPC implementation on the CPU, and we demonstrate the use of CusADi for various applications, including parallel simulation, parameter sweeps, and policy training.
Submitted 18 August, 2024;
originally announced August 2024.
-
On the Formation of Planets in the Milky Way's Thick Disk
Authors:
Tim Hallatt,
Eve J. Lee
Abstract:
Exoplanet demographic surveys have revealed that close-in (${\lesssim}$1 au) small planets orbiting stars in the Milky Way's thick disk are ${\sim}50\%$ less abundant than those orbiting stars in the Galactic thin disk. One key difference between the two stellar populations is the time at which they emerged: thick disk stars are the likely product of cosmic noon (redshift $z {\sim}2$), an era characterized by high star formation rate, massive and dense molecular clouds, and strong supersonic turbulence. Solving for the background radiation field in these early star-forming regions, we demonstrate that protoplanetary disks at cosmic noon experienced radiation fields up to ${\sim}7$ orders of magnitude more intense than in solar neighborhood conditions. Coupling the radiation field to a one-dimensional protoplanetary disk evolution model, we find that external UV photoevaporation destroys protoplanetary disks in just ${\sim}$0.2--0.5 Myr, limiting the timescale over which planets can assemble. Disk temperatures exceed the sublimation temperatures of common volatile species for ${\gtrsim}$Myr timescales, predicting more spatial homogeneity in gas chemical composition. Our calculations imply that the deficit in planet occurrence around thick disk stars should be even more pronounced for giant planets, particularly those at wide orbital separations, predicting a higher rocky-to-giant planet ratio in the Galactic thick disk vs. thin disk.
Submitted 17 August, 2024;
originally announced August 2024.
-
Narrowing the Focus: Learned Optimizers for Pretrained Models
Authors:
Gus Kristiansen,
Mark Sandler,
Andrey Zhmoginov,
Nolan Miller,
Anirudh Goyal,
Jihwan Lee,
Max Vladymyrov
Abstract:
In modern deep learning, models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics. Optimizers are often hand-designed, and tuning their hyperparameters is a big part of the training process. Learned optimizers have shown some initial promise, but are generally unsuccessful as a general optimization mechanism applicable to every problem. In this work we explore a different direction: instead of learning general optimizers, we specialize them to a specific training environment. We propose a novel optimizer technique that learns a layer-specific linear combination of update directions provided by a set of base optimizers, effectively adapting its strategy to the specific model and dataset. When evaluated on image classification tasks, this specialized optimizer significantly outperforms both traditional off-the-shelf methods such as Adam and existing general learned optimizers. Moreover, it demonstrates robust generalization with respect to model initialization, evaluation on unseen datasets, and training durations beyond its meta-training horizon.
Submitted 21 August, 2024; v1 submitted 17 August, 2024;
originally announced August 2024.
-
Learning to Explore for Stochastic Gradient MCMC
Authors:
SeungHyun Kim,
Seohyeon Jung,
Seonghyeon Kim,
Juho Lee
Abstract:
Bayesian Neural Networks (BNNs) with high-dimensional parameters pose a challenge for posterior inference due to the multi-modality of the posterior distributions. Stochastic Gradient MCMC (SGMCMC) with cyclical learning rate scheduling is a promising solution, but it requires a large number of sampling steps to explore high-dimensional multi-modal posteriors, making it computationally expensive. In this paper, we propose a meta-learning strategy to build SGMCMC which can efficiently explore the multi-modal target distributions. Our algorithm allows the learned SGMCMC to quickly explore the high-density region of the posterior landscape. Also, we show that this exploration property is transferable to various tasks, even for the ones unseen during a meta-training stage. Using popular image classification benchmarks and a variety of downstream tasks, we demonstrate that our method significantly improves the sampling efficiency, achieving better performance than vanilla SGMCMC without incurring significant computational overhead.
Submitted 17 August, 2024;
originally announced August 2024.
-
Cosmological Quasiparticles and the Cosmological Collider
Authors:
Jay Hubisz,
Seung J. Lee,
He Li,
Bharath Sambasivam
Abstract:
The interplay between cosmology and strongly coupled dynamics can yield transient spectral features that vanish at late times, but which may leave behind phenomenological signatures in the spectrum of primordial fluctuations. Of particular interest are strongly coupled extensions of the standard model featuring approximate conformal invariance. In flat space, the spectral density for a scalar operator in a conformal field theory is characterized by a continuum with scaling law governed by the dimension of the operator, and is otherwise featureless. AdS/CFT arguments suggest that for large $N$, in an inflationary background with Hubble rate $H$, this continuum is gapped. We demonstrate that there can be additional peak structures that become sharp and particle-like at phenomenologically interesting regions in parameter space, and we estimate their contribution to cosmological observables. We find phenomena that are potentially observable in future experiments that are unique to these models, including displaced oscillatory features in the squeezed limit of the bi-spectrum. These particles can be either fundamental, and localized to a UV brane, or composite at the Hubble scale, $H$, and bound to a horizon in the bulk of the 5D geometry. We comment on how stabilization of conformal symmetry breaking vacua can be correlated with these spectral features and their phenomenology.
Submitted 16 August, 2024;
originally announced August 2024.
-
Microwave Andreev bound state spectroscopy in a semiconductor-based Planar Josephson junction
Authors:
Bassel Heiba Elfeky,
Krishna Dindial,
David S. Brandão,
Barış Pekerten,
Jaewoo Lee,
William M. Strickland,
Patrick J. Strohbeen,
Alisa Danilenko,
Lukas Baker,
Melissa Mikalsen,
William Schiela,
Zixuan Liang,
Jacob Issokson,
Ido Levy,
Igor Zutic,
Javad Shabani
Abstract:
By coupling a semiconductor-based planar Josephson junction to a superconducting resonator, we investigate the Andreev bound states in the junction using dispersive readout techniques. Using electrostatic gating to create a narrow constriction in the junction, our measurements unveil a strong coupling interaction between the resonator and the Andreev bound states. This enables the mapping of isolated tunable Andreev bound states, with an observed transparency of up to 99.94\% along with an average induced superconducting gap of $\sim 150$ μeV. Exploring the gate parameter space further elucidates a non-monotonic evolution of multiple Andreev bound states with varying gate voltage. Complementary tight-binding calculations of an Al-InAs planar Josephson junction with strong Rashba spin-orbit coupling provide insight into possible mechanisms responsible for such behavior. Our findings highlight the subtleties of the Andreev spectrum of Josephson junctions fabricated on superconductor-semiconductor heterostructures and offer potential applications in probing topological states in these hybrid platforms.
Submitted 15 August, 2024;
originally announced August 2024.
-
Cluster Formations of Free and Congested Flows in Urban Road Networks
Authors:
Yongsung Kwon,
Minjin Lee,
Mi Jin Lee,
Seung-Woo Son
Abstract:
Understanding traffic behavior is crucial for enhancing the stable functioning and safety of transportation systems. Previous percolation-based transportation studies have analyzed transition behaviors from free-flow to traffic-jam states, with a focus on robustness and resilience during congestion. However, relatively less attention is paid to the percolation analysis of the free-flow states, specifically how free-flow clusters form and grow. In this study, we investigate the percolation patterns of two opposing traffic scenarios -- traffic jam state and free-flow state -- within the same road network using Chengdu taxi data and compare their percolating behaviors. Our analysis reveals differences between the two scenarios in the growth patterns of the giant connected component (GCC), which is captured by a persistent gap between the GCC size curves, particularly during peak hours. We attribute these disparities to a long-range spatial correlation of traffic speed within a road network. Empirically, we find distinct long-range spatial correlations in traffic, using rescaled taxi speeds on roads, and we examine their relationship with each percolation pattern. Our analysis provides an integrated view of traffic dynamics and uncovers intrinsic traffic correlations within urban areas that drive these intriguing percolation patterns. Our findings also offer valuable metrics for effective traffic management and accident prevention strategies, aligning with urban transportation safety and reliability goals. These insights are beneficial for assessing and designing resilient urban road networks that maintain functionality under stress, ultimately improving the reliability of traffic assessments and reducing accidents.
Submitted 15 August, 2024;
originally announced August 2024.
-
Network analysis reveals news press landscape and asymmetric user polarization
Authors:
Byunghwee Lee,
Hyo-sun Ryu,
Jae Kook Lee,
Hawoong Jeong,
Beom Jun Kim
Abstract:
Unlike traditional media, online news platforms allow users to consume content that suits their tastes and facilitate interactions with other people. However, as personalized consumption of information and interaction with like-minded users increase, ideological bias can inadvertently increase and contribute to the formation of echo chambers, reinforcing the polarization of opinions. Although the structural characteristics of polarization among different ideological groups in online spaces have been extensively studied, how these groups emotionally interact with each other has not been as thoroughly explored. From this perspective, we investigate both structural and affective polarization between news media user groups on Naver News, South Korea's largest online news portal, during the period of the 2022 Korean presidential election. Using a dataset comprising 333,014 articles and over 36 million user comments, we uncover two distinct groups of users characterized by opposing political leanings and reveal significant bias and polarization among them. Additionally, we reveal the existence of echo chambers within co-commenting networks and investigate the asymmetric affective interaction patterns between the two polarized groups. A classification task on news media articles based on the distinct comment response patterns supports the notion that different political groups may employ distinct communication strategies. Our approach, based on network analysis of a large-scale comment dataset, offers novel insights into the characteristics of user polarization on online news platforms and the nuanced nature of interactions between user groups.
Submitted 14 August, 2024;
originally announced August 2024.
-
Certifiable Deep Learning for Reachability Using a New Lipschitz Continuous Value Function
Authors:
Jingqi Li,
Donggun Lee,
Jaewon Lee,
Kris Shengjun Dong,
Somayeh Sojoudi,
Claire Tomlin
Abstract:
We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional Crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
Submitted 19 August, 2024; v1 submitted 14 August, 2024;
originally announced August 2024.
-
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
Authors:
Jiri Hron,
Laura Culp,
Gamaleldin Elsayed,
Rosanne Liu,
Ben Adlam,
Maxwell Bileschi,
Bernd Bohnet,
JD Co-Reyes,
Noah Fiedel,
C. Daniel Freeman,
Izzeddin Gur,
Kathleen Kenealy,
Jaehoon Lee,
Peter J. Liu,
Gaurav Mishra,
Igor Mordatch,
Azade Nova,
Roman Novak,
Aaron Parisi,
Jeffrey Pennington,
Alex Rizkowsky,
Isabelle Simpson,
Hanie Sedghi,
Jascha Sohl-dickstein,
Kevin Swersky
, et al. (6 additional authors not shown)
Abstract:
While many capabilities of language models (LMs) improve with increased training budget, the influence of scale on hallucinations is not yet fully understood. Hallucinations come in many forms, and there is no universally accepted definition. We thus focus on studying only those hallucinations where a correct answer appears verbatim in the training set. To fully control the training data content, we construct a knowledge graph (KG)-based dataset, and use it to train a set of increasingly large LMs. We find that for a fixed dataset, larger and longer-trained LMs hallucinate less. However, hallucinating on $\leq5$% of the training data requires an order of magnitude larger model, and thus an order of magnitude more compute, than Hoffmann et al. (2022) reported was optimal. Given this costliness, we study how hallucination detectors depend on scale. While we see that detector size improves performance on a fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.
Submitted 14 August, 2024;
originally announced August 2024.
-
LPU: A Latency-Optimized and Highly Scalable Processor for Large Language Model Inference
Authors:
Seungjae Moon,
Jung-Hoon Kim,
Junsoo Kim,
Seongmin Hong,
Junseo Cha,
Minsu Kim,
Sukbin Lim,
Gyubin Choi,
Dongjin Seo,
Jongho Kim,
Hunjong Lee,
Hyunjun Park,
Ryeowook Ko,
Soongyu Choi,
Jongse Park,
Jinwon Lee,
Joo-Young Kim
Abstract:
The explosive arrival of OpenAI's ChatGPT has fueled the globalization of large language models (LLMs), which consist of billions of pretrained parameters that embody aspects of syntax and semantics. HyperAccel introduces the latency processing unit (LPU), a latency-optimized and highly scalable processor architecture for the acceleration of LLM inference. LPU perfectly balances the memory bandwidth and compute logic with streamlined dataflow to maximize performance and efficiency. LPU is equipped with an expandable synchronization link (ESL) that hides data synchronization latency between multiple LPUs. HyperDex complements LPU as an intuitive software framework to run LLM applications. LPU achieves 1.25 ms/token and 20.9 ms/token for 1.3B and 66B models, respectively, which is 2.09x and 1.37x faster than the GPU. LPU, synthesized using a Samsung 4nm process, has a total area of 0.824 mm$^2$ and power consumption of 284.31 mW. LPU-based servers achieve 1.33x and 1.32x energy efficiency over NVIDIA H100 and L4 servers, respectively.
Submitted 14 August, 2024;
originally announced August 2024.
-
Star-formation Properties of z ~ 1 Galaxy Clusters and Groups from Horizon Run 5
Authors:
Seong-Kook Lee,
Changbom Park,
Juhan Kim,
Jaehyun Lee,
Brad K. Gibson,
Yonghwi Kim,
C. Gareth Few
Abstract:
Quiescent galaxies are predominantly observed in local galaxy clusters. However, the fraction of quiescent galaxies in high-redshift clusters varies significantly among different clusters. In this study, we present the results of an analysis of the star formation (SF) properties of $z \sim 0.87$ clusters and groups from the cosmological hydrodynamical simulation Horizon Run 5. We investigate the correlation between the quiescent galaxy fraction (QF) of these model clusters/groups and their various internal or external properties. We find that halo mass is one of the most important characteristics, as higher mass clusters and groups have higher QFs. We also find that other properties such as stellar-mass ratio and Friends-of-Friends fraction, which measures the proportion of the area around a cluster occupied by dense structures, may mildly affect the QFs of clusters and groups. This may indicate that the evolutionary history as well as the large-scale environment of clusters and groups also play a certain role in determining the SF status of high-redshift galaxy clusters and groups.
Submitted 13 August, 2024;
originally announced August 2024.
-
Imagen 3
Authors:
Imagen-Team-Google,
Jason Baldridge,
Jakob Bauer,
Mukul Bhutani,
Nicole Brichtova,
Andrew Bunner,
Kelvin Chan,
Yichang Chen,
Sander Dieleman,
Yuqing Du,
Zach Eaton-Rosen,
Hongliang Fei,
Nando de Freitas,
Yilin Gao,
Evgeny Gladchenko,
Sergio Gómez Colmenarejo,
Mandy Guo,
Alex Haig,
Will Hawkins,
Hexiang Hu,
Huilian Huang,
Tobenna Peter Igwe,
Christos Kaplanis,
Siavash Khodadadeh
, et al. (227 additional authors not shown)
Abstract:
We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.
Submitted 13 August, 2024;
originally announced August 2024.
-
Intersection of orbits for polynomials in characteristic $p$
Authors:
Simone Coccia,
Dragos Ghioca,
Jungin Lee,
Gyeonghyeon Nam
Abstract:
In [GTZ08, GTZ12], the following result was established: given polynomials $f,g\in\mathbb{C}[x]$ of degrees larger than $1$, if there exist $α,β\in\mathbb{C}$ such that their corresponding orbits $\mathcal{O}_f(α)$ and $\mathcal{O}_g(β)$ (under the action of $f$, respectively of $g$) intersect in infinitely many points, then $f$ and $g$ must share a common iterate, i.e., $f^m=g^n$ for some $m,n\in\mathbb{N}$. If one replaces $\mathbb{C}$ with a field $K$ of characteristic $p$, then the conclusion fails; we provide numerous examples showing the complexity of the problem over a field of positive characteristic. We advance a modified conjecture regarding polynomials $f$ and $g$ which admit two orbits with infinite intersection over a field of characteristic $p$. Then we present various partial results, along with connections with another deep conjecture in the area, the dynamical Mordell-Lang conjecture.
Submitted 13 August, 2024;
originally announced August 2024.