Search | arXiv e-print repository

SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields

Authors: Yuze Wang, Junyi Wang, Chen Wang, Wantong Duan, Yongtang Bao, Yue Qi

Abstract: This paper introduces a novel continual learning framework for synthesising novel views of multiple scenes, learning multiple 3D scenes incrementally, and updating the network parameters only with the training data of the upcoming new scene. We build on Neural Radiance Fields (NeRF), which uses a multi-layer perceptron to model the density and radiance field of a scene as an implicit function. While NeRF and its extensions have shown a powerful capability of rendering photo-realistic novel views in a single 3D scene, managing these growing 3D NeRF assets efficiently is a new scientific problem. Very few works focus on the efficient representation or continuous learning capability of multiple scenes, which is crucial for the practical applications of NeRF. To achieve these goals, our key idea is to represent multiple scenes as the linear combination of a cross-scene weight matrix and a set of scene-specific weight matrices generated from a global parameter generator. Furthermore, we propose an uncertain surface knowledge distillation strategy to transfer the radiance field knowledge of previous scenes to the new model. Representing multiple 3D scenes with such weight matrices significantly reduces memory requirements. At the same time, the uncertain surface distillation strategy greatly overcomes the catastrophic forgetting problem and maintains the photo-realistic rendering quality of previous scenes. Experiments show that the proposed approach achieves state-of-the-art rendering quality of continual learning NeRF on NeRF-Synthetic, LLFF, and TanksAndTemples datasets while preserving extra low storage cost.

Submitted 5 September, 2024; originally announced September 2024.

arXiv:2409.01156 [pdf, other]

TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval

Authors: Leqi Shen, Tianxiang Hao, Sicheng Zhao, Yifeng Zhang, Pengzhang Liu, Yongjun Bao, Guiguang Ding

Abstract: Most text-video retrieval methods utilize the text-image pre-trained CLIP as a backbone, incorporating complex modules that result in high computational overhead. As a result, many studies focus on efficient fine-tuning. The primary challenge in efficient adaptation arises from the inherent differences between image and video modalities. Each sampled video frame must be processed by the image encoder independently, which increases complexity and complicates practical deployment. Although existing efficient methods fine-tune with small trainable parameters, they still incur high inference costs due to the large token number. In this work, we argue that temporal redundancy significantly contributes to the model's high complexity due to the repeated information in consecutive frames. Existing token compression methods for image models fail to solve the unique challenges, as they overlook temporal redundancy across frames. To tackle these problems, we propose Temporal Token Merging (TempMe) to reduce temporal redundancy. Specifically, we introduce a progressive multi-granularity framework. By gradually combining neighboring clips, we merge temporal tokens across different frames and learn video-level features, leading to lower complexity and better performance. Extensive experiments validate the superiority of our TempMe. Compared to previous efficient text-video retrieval methods, TempMe significantly reduces output tokens by 95% and GFLOPs by 51%, while achieving a 1.8X speedup and a 4.4% R-Sum improvement. Additionally, TempMe exhibits robust generalization capabilities by integrating effectively with both efficient and full fine-tuning methods. With full fine-tuning, TempMe achieves a significant 7.9% R-Sum improvement, trains 1.57X faster, and utilizes 75.2% GPU memory usage. Our code will be released.

Submitted 2 September, 2024; originally announced September 2024.

arXiv:2409.00256 [pdf, other]

Accurate, precise pressure sensing with tethered optomechanics

Authors: Olivia R. Green, Yiliang Bao, John R. Lawall, Jason J. Gorman, Daniel S. Barker

Abstract: We show that optomechanical systems can be primary pressure sensors with uncertainty as low as 1.1 % of reading via comparison with a pressure transfer standard. Our silicon nitride and silicon carbide sensors are short-term and long-term stable, displaying Allan deviations compatible with better than 1 % precision and baseline drift significantly lower than the transfer standard. We also investigate the performance of optomechanical devices as calibrated gauges, finding that they can achieve total uncertainty less than 1 %. The calibration procedure also yields the thin-film density of our sensors with state-of-the-art precision, aiding development of other calibration-free optomechanical sensors. Our results demonstrate that optomechanical pressure sensors can achieve accuracy, precision, and drift sufficient to replace high performance legacy gauges.

Submitted 30 August, 2024; originally announced September 2024.

Comments: 15 pages, 7 figures

arXiv:2408.09518 [pdf]

Metasurface-Based Full-Parameter Optical Multiplexing

Authors: Rui Wei, Hongsheng Shi, Boyou Wang, Baojun Li, Yanjun Bao

Abstract: Optical multiplexing is a key technique that enhances the capacity of optical systems by independently modulating various optical parameters to carry distinct information. Among these parameters, wavelength, polarization, and angle are the primary ones for multiplexing in plane waves with uniform cross-sectional distribution. While metasurfaces have recently emerged as a powerful platform for optical multiplexing, they are typically restricted to partial parameter multiplexing and exhibit a low number of multiplexing channels. In this work, we propose and experimentally demonstrate the full-parameter multiplexing of polarization, wavelength, and angle, achieving hundreds of distinct multiplexing channels, the largest reported to date. Our design utilizes a gradient-based optimization algorithm to enable high-efficiency performance and independent functionalities with minimal cross-talk among channels. This approach represents a significant advancement in metasurface design and optical multiplexing, with potential applications in complex and dynamic optical systems.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.09509 [pdf]

High-Capacity Metasurface at Limits of Polarization and Wavelength Multiplexing

Authors: Yanjun Bao, Hongsheng Shi, Rui Wei, Boyou Wang, Zhou Zhou, Cheng-Wei Qiu, Baojun Li

Abstract: Polarization and wavelength multiplexing are the two most widely employed techniques to improve the capacity of metasurfaces. Existing works have pushed each technique to its individual limits. For example, the polarization multiplexing channels working at a single wavelength have been significantly increased by using noise engineering. However, it is still challenging to achieve the multiplexing limits of wavelength and polarization simultaneously. Besides, such multiplexing methods suffer from computational inefficiencies, hindering their application in tasks like image recognition that require extensive training computation. In this work, we introduce a gradient-based optimization algorithm using a deep neural network (DNN) to achieve the limits of both polarization and wavelength multiplexing with high computational efficiency. We experimentally demonstrate this capability, achieving a record-breaking capacity of 15 holographic images across five wavelengths and the maximum of three independent polarization channels, as well as 18 holographic images across three wavelengths and six correlated polarization channels. Moreover, leveraging the high computational efficiency of our DNN-based method, which is well-suited for processing large datasets, we implement large-scale image recognition tasks across 36 classes encoded in a record of nine multiplexed channels (three wavelengths × three polarizations), achieving 96% classification accuracy in calculations and 91.5% in experiments. This work sets a new benchmark for high-capacity multiplexing with metasurfaces and demonstrates the power of gradient-based inverse design for realizing multi-functional optical elements.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.08948 [pdf, other]

Accidental Suppression of Wilson Coefficients in Higgs Coupling

Authors: Yunjia Bao, Jiayin Gu, Zhen Liu, Chi Shu, Lian-Tao Wang

Abstract: Higgs couplings are essential probes for physics beyond the Standard Model (BSM) since they can be modified by new physics, such as through the Higgs portal interaction $|H|^2\mathcal{O}$. These modifications influence Higgs interactions via dimension-6 operators of the form $\left(\partial |H|^2\right)^2$ and $|H|^6$, which are generally expected to be of comparable size. This paper discusses a phenomenon of accidental suppression, where the $|H|^6$ coupling is significantly smaller than $\left(\partial |H|^2\right)^2$. This suppression, arising from the truncation of the tree-level effective potential, lacks a clear symmetry explanation but persists in portal models. This paper aims to inspire further studies on additional instances of accidental suppression without symmetry explanations or a general framework to characterize such suppression. We also discuss constraints, at the HL-LHC and future colliders, on the Wilson coefficients of the two dimension-6 operators for various benchmark scenarios of the concrete model.

Submitted 16 August, 2024; originally announced August 2024.

Comments: 31 pages, 1 figure, 2 tables

Report number: UMN-TH-4327/24

arXiv:2408.08615 [pdf]

Single-Shot Simultaneous Intensity, Phase, and Polarization Imaging with Metasurface

Authors: Yanjun Bao, Baojun Li

Abstract: Optical imaging of the intensity, phase, and polarization distributions of an optical field is fundamental to numerous applications. Traditional methods rely on bulky optical components and require multiple measurements. Recently, metasurface-based (MS-based) imaging strategies have emerged as a promising solution to address these challenges. However, they have been primarily limited to capturing partial information of the three parameters, tailored to specific optical fields, which poses challenges when dealing with arbitrary field distributions and achieving three-parameter imaging. In this study, we introduce an MS-based approach for single-shot optical imaging that simultaneously captures all three parameters of optical fields with arbitrary intensity, phase, and polarization distributions. We experimentally validate the versatility of our method by conducting imaging of various types of optical fields with arbitrary well-defined distributions. The strategy presented in our work is expected to open up promising avenues for diverse applications, including imaging, optical communications, and beyond.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08306 [pdf, other]

Accelerated Image-Aware Generative Diffusion Modeling

Authors: Tanmay Asthana, Yufang Bao, Hamid Krim

Abstract: We propose in this paper an analytically new construct of a diffusion model whose drift and diffusion parameters yield an exponentially time-decaying Signal to Noise Ratio in the forward process. In reverse, the construct cleverly carries out the learning of the diffusion coefficients on the structure of clean images using an autoencoder. The proposed methodology significantly accelerates the diffusion process, reducing the required diffusion time steps from around 1000 seen in conventional models to 200-500 without compromising image quality in the reverse-time diffusion. In a departure from conventional models which typically use time-consuming multiple runs, we introduce a parallel data-driven model to generate a reverse-time diffusion trajectory in a single run of the model. The resulting collective block-sequential generative model eliminates the need for MCMC-based sub-sampling correction for safeguarding and improving image quality, to further improve the acceleration of image generation. Collectively, these advancements yield a generative model that is an order of magnitude faster than conventional approaches, while maintaining high fidelity and diversity in generated images, hence promising widespread applicability in rapid image synthesis tasks.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07125 [pdf, ps, other]

Emergent Gauge Fields and the "Choi-Spin Liquids" in Steady States

Authors: Kaixiang Su, Yimu Bao, Cenke Xu

Abstract: We demonstrate that the steady states of the evolution of a class of Lindbladians can be mapped to the "Gutzwiller projected" wave functions in the doubled Hilbert space, i.e. the representation of the density matrix through the Choi-Jamiolkowski isomorphism. A Gutzwiller projection is a standard approach of constructing spin liquid states. For example, if one starts with a gapless free fermion pure quantum state, the steady state of the Lindbladian evolution in the doubled Hilbert space is an analog of an algebraic spin liquid, which is dubbed the "Choi-spin liquid". The Choi-spin liquid can also be produced through strong measurement without post-selection. Predictions of the Choi-spin liquids can be made based on the understanding of spin liquids, and we will design the experimental protocol to test these predictions. If one starts with a Chern insulator, theory predicts that the steady state of the Lindbladian evolution has a spontaneous "strong-to-weak" U(1) symmetry breaking, which corresponds to a superconductor in the doubled Hilbert space.

Submitted 26 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

Comments: 10 pages

arXiv:2407.19298 [pdf, ps, other]

Operad and cohomology of associative algebras with generalized derivations

Authors: Jiang-Nan Xu, Yan-Hong Bao

Abstract: An associative algebra with a generalized derivation is called an AsGDer triple. We introduce the operad that encodes AsGDer triples, and prove it is a Koszul operad. Using its Koszul dual cooperad, we introduce the homotopy version of AsGDer triples. As an application, we construct the AsGDer cohomology theory for AsGDer triples, and show that the formal deformation of an AsGDer triple is controlled by the AsGDer cohomology.

Submitted 5 August, 2024; v1 submitted 27 July, 2024; originally announced July 2024.

MSC Class: 16R10; 16R99; 18M60

arXiv:2407.17525 [pdf, other]

Crescendo Beyond the Horizon: More Gravitational Waves from Domain Walls Bounded by Inflated Cosmic Strings

Authors: Yunjia Bao, Keisuke Harigaya, Lian-Tao Wang

Abstract: Gravitational-wave (GW) signals offer a unique window into the dynamics of the early universe. GWs may be generated by the topological defects produced in the early universe, which contain information on the symmetry of UV physics. We consider the case in which a two-step phase transition produces a network of domain walls bounded by cosmic strings. Specifically, we focus on the case in which there is a hierarchy in the symmetry-breaking scales, and a period of inflation pushes the cosmic string generated in the first phase transition outside the horizon before the second phase transition. We show that the GW signal from the evolution and collapse of this string-wall network has a unique spectrum, and the resulting signal strength can be sizeable. In particular, depending on the model parameters, the resulting signal can show up in a broad range of frequencies and can be discovered by a multitude of future probes, including the pulsar timing arrays and space- and ground-based GW observatories. As an example that naturally gives rise to this scenario, we present a model with the first phase transition followed by a brief period of thermal inflation driven by the field responsible for the second stage of symmetry breaking. The model can be embedded into a supersymmetric setup, which provides a natural realization of this scenario. In this case, the successful detection of the peak of the GW spectrum probes the soft supersymmetry breaking scale and the wall tension.

Submitted 22 July, 2024; originally announced July 2024.

Comments: 63 pages, 11 figures, 3 tables

arXiv:2407.17418 [pdf, other]

3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities

Authors: Yanqi Bao, Tianyu Ding, Jing Huo, Yaoli Liu, Yuxin Li, Wenbin Li, Yang Gao, Jiebo Luo

Abstract: 3D Gaussian Splatting (3DGS) has emerged as a prominent technique with the potential to become a mainstream method for 3D representations. It can effectively transform multi-view images into explicit 3D Gaussian representations through efficient training, and achieve real-time rendering of novel views. This survey aims to analyze existing 3DGS-related works from multiple intersecting perspectives, including related tasks, technologies, challenges, and opportunities. The primary objective is to provide newcomers with a rapid understanding of the field and to assist researchers in methodically organizing existing technologies and challenges. Specifically, we delve into the optimization, application, and extension of 3DGS, categorizing them based on their focuses or motivations. Additionally, we summarize and classify nine types of technical modules and corresponding improvements identified in existing works. Based on these analyses, we further examine the common challenges and technologies across various tasks, proposing potential research opportunities.

Submitted 24 July, 2024; originally announced July 2024.

arXiv:2407.16944 [pdf, ps, other]

Adaptive Gradient Regularization: A Faster and Generalizable Optimization Technique for Deep Neural Networks

Authors: Huixiu Jiang, Ling Yang, Yu Bao, Rutong Si, Sikun Yang

Abstract: Stochastic optimization plays a crucial role in the advancement of deep learning technologies. Over the decades, significant effort has been dedicated to improving the training efficiency and robustness of deep neural networks, via various strategies including gradient normalization (GN) and gradient centralization (GC). Nevertheless, to the best of our knowledge, no one has considered capturing the optimal gradient descent trajectory by adaptively controlling the gradient descent direction. To address this concern, this paper is the first attempt to study a new optimization technique for deep neural networks, using the sum normalization of a gradient vector as coefficients, to dynamically regularize gradients and thus to effectively control optimization direction. The proposed technique is hence named adaptive gradient regularization (AGR). It can be viewed as an adaptive gradient clipping method. The theoretical analysis reveals that AGR can effectively smooth the loss landscape, and hence can significantly improve the training efficiency and model generalization performance. We note that AGR can greatly improve the training efficiency of vanilla optimizers, including Adan and AdamW, by adding only three lines of code. The final experiments, conducted on image generation, image classification, and language representation, demonstrate that the AGR method can not only improve the training efficiency but also enhance the model generalization performance.

Submitted 19 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

Comments: 12 pages, 13 figures
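
As a rough illustration of the AGR abstract above (arXiv:2407.16944), the sketch below applies sum-normalized coefficients to a gradient vector. The abstract does not state the update rule, so the specific form used here (scaling each entry by one minus its share of the total absolute gradient), the function name `agr`, and the `eps` guard are assumptions made for illustration, not the authors' code.

```python
def agr(grad, eps=1e-12):
    """Hypothetical AGR-style step: damp each gradient entry by its
    sum-normalized magnitude. The exact rule is an assumption; the
    abstract only mentions 'sum normalization of a gradient vector'."""
    # Total absolute gradient; eps avoids division by zero for an all-zero gradient.
    total = sum(abs(g) for g in grad) + eps
    # Each coefficient |g| / total lies in [0, 1]; larger entries are damped more.
    return [(1.0 - abs(g) / total) * g for g in grad]


grad = [3.0, -1.0, 0.0, 4.0]
regularized = agr(grad)
print(regularized)
```

Under this assumed rule the entry with the largest magnitude is shrunk the most, which is one way to read the abstract's description of AGR as an adaptive gradient clipping method.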

arXiv:2407.16462 [pdf, other]

doi 10.1364/OL.527857

Source-independent quantum secret sharing with entangled photon pair networks

Authors: Yi-Ran Xiao, Zhao-Ying Jia, Yu-Chen Song, Yu Bao, Yao Fu, Hua-Lei Yin, Zeng-Bing Chen

Abstract: The large-scale deployment of quantum secret sharing (QSS) in quantum networks is currently challenging due to the requirements for the generation and distribution of multipartite entanglement states. Here we present an efficient source-independent QSS protocol utilizing entangled photon pairs in quantum networks. Through the post-matching method, which means the measurement events in the same basis are matched, the key rate is almost independent of the number of participants. In addition, the unconditional security of our QSS against internal and external eavesdroppers can be proved by introducing an equivalent virtual protocol. Our protocol has great performance and technical advantages in future quantum networks.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 13 pages, 9 figures

Journal ref: Optics Letters 49, 4210 (2024)

arXiv:2407.14492 [pdf, other]

Adaptive Uncertainty Quantification for Scenario-based Control Using Meta-learning of Bayesian Neural Networks

Authors: Yajie Bao, Javad Mohammadpour Velni

Abstract: Scenario-based optimization and control has proven to be an efficient approach to account for system uncertainty. In particular, the performance of scenario-based model predictive control (MPC) schemes depends on the accuracy of uncertainty quantification. However, current learning- and scenario-based MPC (sMPC) approaches employ a single time-invariant probabilistic model (learned offline), which may not accurately describe time-varying uncertainties. Instead, this paper presents a model-agnostic meta-learning (MAML) of Bayesian neural networks (BNN) for adaptive uncertainty quantification that would be subsequently used for adaptive-scenario-tree model predictive control design of nonlinear systems with unknown dynamics to enhance control performance. In particular, the proposed approach learns both a global BNN model and an updating law to refine the BNN model. At each time step, the updating law transforms the global BNN model into more precise local BNN models in real time. The adapted local model is then used to generate scenarios for sMPC design at each time step. A probabilistic safety certificate is incorporated in the scenario generation to ensure that the trajectories of the generated scenarios contain the real trajectory of the system and that all the scenarios adhere to the constraints with a high probability. Experiments using closed-loop simulations of a numerical example demonstrate that the proposed approach can improve the performance of scenario-based MPC compared to using only one BNN model learned offline for all time steps.

Submitted 19 July, 2024; originally announced July 2024.

Comments: Accepted by 2024 Modeling, Estimation and Control Conference. This work was done during the PhD period of the first author

arXiv:2407.14007 [pdf, other]

Multi-modal Relation Distillation for Unified 3D Representation Learning

Authors: Huiqun Wang, Yiping Bao, Panwang Pan, Zeming Li, Xiao Liu, Ruijie Yang, Di Huang

Abstract: Recent advancements in multi-modal pre-training for 3D point clouds have demonstrated promising results by aligning heterogeneous features across 3D shapes and their corresponding 2D images and language descriptions. However, current straightforward solutions often overlook intricate structural relations among samples, potentially limiting the full capabilities of multi-modal learning. To address this issue, we introduce Multi-modal Relation Distillation (MRD), a tri-modal pre-training framework, which is designed to effectively distill reputable large Vision-Language Models (VLM) into 3D backbones. MRD aims to capture both intra-relations within each modality as well as cross-relations between different modalities and produce more discriminative 3D shape representations. Notably, MRD achieves significant improvements in downstream zero-shot classification tasks and cross-modality retrieval tasks, delivering new state-of-the-art performance.

Submitted 18 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.13981 [pdf, other]

Decomposed Direct Preference Optimization for Structure-Based Drug Design

Authors: Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu

Abstract: Diffusion models have achieved promising results for Structure-Based Drug Design (SBDD). Nevertheless, high-quality protein subpocket and ligand data are relatively scarce, which hinders the models' generation capabilities. Recently, Direct Preference Optimization (DPO) has emerged as a pivotal tool for the alignment of generative models such as large language models and diffusion models, providing greater flexibility and accuracy by directly aligning model outputs with human preferences. Building on this advancement, we introduce DPO to SBDD in this paper. We tailor diffusion models to pharmaceutical needs by aligning them with elaborately designed chemical score functions. We propose a new structure-based molecular optimization method called DecompDPO, which decomposes the molecule into arms and scaffolds and performs preference optimization at both local substructure and global molecule levels, allowing for more precise control with fine-grained preferences. Notably, DecompDPO can be effectively used for two main purposes: (1) fine-tuning pretrained diffusion models for molecule generation across various protein families, and (2) molecular optimization given a specific protein subpocket after generation. Extensive experiments on the CrossDocked2020 benchmark show that DecompDPO significantly improves model performance in both molecule generation and optimization, with up to 100% Median High Affinity and a 54.9% Success Rate.

Submitted 18 July, 2024; originally announced July 2024.

arXiv:2407.07882 [pdf, other]

Information dynamics in decohered quantum memory with repeated syndrome measurements: a dual approach

Authors: Jacob Hauser, Yimu Bao, Shengqi Sang, Ali Lavasani, Utkarsh Agrawal, Matthew P. A. Fisher

Abstract: Measurements can detect errors in a decohered quantum memory allowing active error correction to increase the memory time. Previous understanding of this mechanism has focused on evaluating the performance of error correction algorithms based on measurement results. In this work, we instead intrinsically characterize the information dynamics in a quantum memory under repeated measurements, using coherent information and relative entropy. We consider the dynamics of a $d$-dimensional stabilizer code subject to Pauli errors and noisy stabilizer measurements and develop a $(d+1)$-dimensional statistical mechanics model for the information-theoretic diagnostics. Our model is dual to the model previously obtained for the optimal decoding algorithm, and the potential decoding transition in the quantum memory again manifests as a thermal phase transition in the statistical mechanics model. We explicitly derive the model and study the phase transition in information encoding in three examples: surface codes, repetition codes, and the XZZX code.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 27 pages, 9 figures

arXiv:2407.02829 [pdf, other]

Mirage Sources and Large TeV Halo-Pulsar Offsets: Exploring the Parameter Space

Authors: Yiwei Bao, Ruo-Yu Liu, Gwenael Giacinti, Hai-Ming Zhang, Yang Chen

Abstract: We investigate the asymmetric propagation of 100 TeV electrons (whose radiation mainly concentrates on 20--30 TeV) in turbulent magnetic fields around pulsars, using GPU-accelerated simulations to explore their trajectories and interactions within pulsar wind nebulae and the interstellar medium. Key results include the identification of ``mirage'' sources indicating significant offsets in high-energy emissions from their originating pulsars, challenging the results of traditional symmetric diffusion models. By varying parameters like source distance, magnetic field strength, and electron injection spectral index, the study delineates their effects on observable phenomena such as the probability that a source has at least one mirage around it, as well as the source separation. Our results offer insights into some puzzling sources observed recently by the Large High Altitude Air Shower Observatory (LHAASO), and shed light on the cosmic-ray transport mechanism in the interstellar medium.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02727 [pdf, other]

Long-lived magnetization in an atomic spin chain tuned to a diabolic point

Authors: R. J. G. Elbertse, D. Borodin, J. Oh, T. Ahn, J. Hwang, J. C. Rietveld, A. J. Heinrich, F. Delgado, S. Otte, Y. Bae

Abstract: Scaling magnets down to where quantum size effects become prominent triggers quantum tunneling of magnetization (QTM), profoundly influencing magnetization dynamics. Measuring magnetization switching in an Fe atomic chain under a carefully tuned transverse magnetic field, we observe a non-monotonic variation of magnetization lifetimes around a level crossing, known as the diabolic point (DP). Near DPs, local environment effects causing QTM are efficiently suppressed, enhancing lifetimes by three orders of magnitude. Adjusting interatomic interactions further facilitates multiple DPs. Our study provides a deeper understanding of quantum dynamics near DPs and enhances our ability to engineer a quantum magnet.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Main text and Supplementary

arXiv:2407.02478 [pdf, other]

Mirages and Large TeV Halo-Pulsar Offsets from Cosmic Ray Propagation

Authors: Yiwei Bao, Gwenael Giacinti, Ruo-Yu Liu, Hai-Ming Zhang, Yang Chen

Abstract: The study of extended $γ$-ray sources usually assumes symmetric diffusion of cosmic rays. However, recent observations of multiple sources near single pulsars and significant offsets between TeV halo centroids and their parent pulsars suggest that this assumption is overly simplistic. In this Letter, we demonstrate that asymmetric propagation of cosmic rays near their accelerators may create multiple TeV sources instead of a single symmetric source. This mechanism also explains the large offsets between TeV halo centroids and their pulsars. We demonstrate that several perplexing detected sources can be naturally explained without invoking additional invisible accelerators.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.17565 [pdf, other]

MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool

Authors: Cunchen Hu, Heyang Huang, Junhao Hu, Jiang Xu, Xusheng Chen, Tao Xie, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan

Abstract: Large language model (LLM) serving has transformed from stateless to stateful systems, utilizing techniques like context caching and disaggregated inference. These optimizations extend the lifespan and domain of the KV cache, necessitating a new architectural approach. We present MemServe, a unified system that integrates both inter-request and intra-request optimizations. MemServe introduces MemPool, an elastic memory pool managing distributed memory and KV caches across serving instances. Using MemPool APIs, MemServe combines context caching with disaggregated inference for the first time, supported by a global scheduler that enhances cache reuse through a global prompt tree-based locality-aware policy. Tests show that MemServe significantly improves job completion time and time-to-first-token.

Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17267 [pdf, other]

doi 10.1364/OE.527862

Efficient source-independent quantum conference key agreement

Authors: Yu Bao, Yi-Ran Xiao, Yu-Chen Song, Yao Fu, Xiao-Yu Cao, Hua-Lei Yin, Zeng-Bing Chen

Abstract: Quantum conference key agreement (QCKA) enables the unconditionally secure distribution of conference keys among multiple participants. Due to challenges in high-fidelity preparation and long-distance distribution of multi-photon entanglement, entanglement-based QCKA is facing severe limitations in both key rate and scalability. Here, we propose a source-independent QCKA scheme utilizing the post-matching method, feasible within the entangled photon pair distribution network. We introduce an equivalent distributing virtual multi-photon entanglement protocol for providing the unconditional security proof even in the case of coherent attacks. For the symmetric star network, compared with the previous $n$-photon entanglement protocol, the conference key rate is improved from $O(η^{n})$ to $O(η^{2})$, where $η$ is the transmittance from the entanglement source to one participant. Simulation results show that our protocol has an advantage of multiple orders of magnitude at intercity distances. We anticipate that our approach will demonstrate its potential in the implementation of quantum networks.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 10 pages, 6 figures

Journal ref: Optics Express 32, 24629 (2024)

arXiv:2406.12588 [pdf, other]

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao

Abstract: Vertical Federated Learning (VFL) facilitates collaborative machine learning without the need for participants to share raw private data. However, recent studies have revealed privacy risks where adversaries might reconstruct sensitive features through data leakage during the learning process. Although data reconstruction methods based on gradient or model information are somewhat effective, they reveal limitations in VFL application scenarios. This is because these traditional methods heavily rely on specific model structures and/or have strict limitations on application scenarios. To address this, our study introduces the Unified InverNet Framework into VFL, which yields a novel and flexible approach (dubbed UIFV) that leverages intermediate feature data to reconstruct original data, instead of relying on gradients or model details. The intermediate feature data is the feature exchanged by different participants during the inference phase of VFL. Experiments on four datasets demonstrate that our methods significantly outperform state-of-the-art techniques in attack precision. Our work exposes severe privacy vulnerabilities within VFL systems that pose real threats to practical VFL applications and thus confirms the necessity of further enhancing privacy protection in the VFL architecture.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.09643 [pdf, other]

Reinforced Decoder: Towards Training Recurrent Neural Networks for Time Series Forecasting

Authors: Qi Sima, Xinze Zhang, Yukun Bao, Siyue Yang, Liang Shen

Abstract: Recurrent neural network-based sequence-to-sequence models have been extensively applied for multi-step-ahead time series forecasting. These models typically involve a decoder trained using either its previous forecasts or the actual observed values as the decoder inputs. However, relying on self-generated predictions can lead to the rapid accumulation of errors over multiple steps, while using the actual observations introduces exposure bias as these values are unavailable during the extrapolation stage. In this regard, this study proposes a novel training approach called reinforced decoder, which introduces auxiliary models to generate alternative decoder inputs that remain accessible when extrapolating. Additionally, a reinforcement learning algorithm is utilized to dynamically select the optimal inputs to improve accuracy. Comprehensive experiments demonstrate that our approach outperforms representative training methods over several datasets. Furthermore, the proposed approach also exhibits promising performance when generalized to self-attention-based sequence-to-sequence forecasting models.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 12 pages, 8 figures

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we search for signals generated by ultra-heavy dark matter in the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background yet contain large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.06559 [pdf, other]

Harnessing Business and Media Insights with Large Language Models

Authors: Yujia Bao, Ankit Parag Shah, Neeru Narang, Jonathan Rivers, Rajeev Maksey, Lan Guan, Louise N. Barrere, Shelley Evenson, Rahul Basole, Connie Miao, Ankit Mehta, Fabien Boulay, Su Min Park, Natalie E. Pearson, Eldhose Joy, Tiger He, Sumiran Thakur, Koustav Ghosal, Josh On, Phoebe Morrison, Tim Major, Eva Siqi Wang, Gina Escobar, Jiaheng Wei, Tharindu Cyril Weerasooriya, et al. (8 additional authors not shown)

Abstract: This paper introduces Fortune Analytics Language Model (FALM). FALM empowers users with direct access to comprehensive business analysis, including market trends, company performance metrics, and expert insights. Unlike generic LLMs, FALM leverages a curated knowledge base built from professional journalism, enabling it to deliver precise and in-depth answers to intricate business questions. Users can further leverage natural language queries to directly visualize financial data, generating insightful charts and graphs to understand trends across diverse business sectors clearly. FALM fosters user trust and ensures output accuracy through three novel methods: 1) Time-aware reasoning guarantees accurate event registration and prioritizes recent updates. 2) Thematic trend analysis explicitly examines topic evolution over time, providing insights into emerging business landscapes. 3) Content referencing and task decomposition enhance answer fidelity and data visualization accuracy. We conduct both automated and human evaluations, demonstrating FALM's significant performance improvements over baseline methods while prioritizing responsible AI practices. These benchmarks establish FALM as a cutting-edge LLM in the business and media domains, with exceptional accuracy and trustworthiness.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.04888 [pdf, other]

Zero-Shot Video Editing through Adaptive Sliding Score Distillation

Authors: Lianghan Zhu, Yanqi Bao, Jing Huo, Jing Wu, Yu-Kun Lai, Wenbin Li, Yang Gao

Abstract: The rapidly evolving field of Text-to-Video generation (T2V) has catalyzed renewed interest in controllable video editing research. While the application of editing prompts to guide diffusion model denoising has gained prominence, mirroring advancements in image editing, this noise-based inference process inherently compromises the original video's integrity, resulting in unintended over-editing and temporal discontinuities. To address these challenges, this study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content. Specifically, distinguishing it from image-based score distillation, we propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors. Combined with our proposed Image-based Joint Guidance mechanism, it has the ability to mitigate the inherent instability of the T2V model and single-step sampling. Additionally, we design a Weighted Attention Fusion module to further preserve the key features of the original video and avoid over-editing. Extensive experiments demonstrate that these strategies effectively address existing challenges, achieving superior performance compared to current state-of-the-art methods.

Submitted 6 September, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.00396 [pdf, other]

Stochastic Restarting to Overcome Overfitting in Neural Networks with Noisy Labels

Authors: Youngkyoung Bae, Yeongwoo Song, Hawoong Jeong

Abstract: Despite its prevalence, giving up and starting over may seem wasteful in many situations such as searching for a target or training deep neural networks (DNNs). Our study, though, demonstrates that restarting from a checkpoint can significantly improve generalization performance when training DNNs with noisy labels. In the presence of noisy labels, DNNs initially learn the general patterns of the data but then gradually overfit to the noisy labels. To combat this overfitting phenomenon, we developed a method based on stochastic restarting, which has been actively explored in the statistical physics field for finding targets efficiently. By approximating the dynamics of stochastic gradient descent into Langevin dynamics, we theoretically show that restarting can provide great improvements as the batch size and the proportion of corrupted data increase. We then empirically validate our theory, confirming the significant improvements achieved by restarting. An important aspect of our method is its ease of implementation and compatibility with other methods, while still yielding notably improved performance. We envision it as a valuable tool that can complement existing methods for handling noisy labels.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 21 pages, 10 figures

arXiv:2405.17315 [pdf, other]

All-day Depth Completion

Authors: Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

Abstract: We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera image. The crux of our method lies in the use of the abundantly available synthetic data to first approximate the 3D scene structure by learning a mapping from sparse to (coarse) dense depth maps along with their predictive uncertainty - we term this, SpaDe. In poorly illuminated regions where photometric intensities do not afford the inference of local shape, the coarse approximation of scene depth serves as a prior; the uncertainty map is then used with the image to guide refinement through an uncertainty-driven residual learning (URL) scheme. The resulting depth completion network leverages complementary strengths from both modalities - depth is sparse but insensitive to illumination and in metric scale, and image is dense but sensitive with scale ambiguity. SpaDe can be used in a plug-and-play fashion, which allows for 25% improvement when augmented onto existing methods to preprocess sparse depth. We demonstrate URL on the nuScenes dataset where we improve over all baselines by an average 11.65% in all-day scenarios, 11.23% when tested specifically for daytime, and 13.12% for nighttime scenes.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 8 pages, 4 figures

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.09193 [pdf, other]

Autonomous Cooperative Levels of Multiple-Heterogeneous Unmanned Vehicle Systems

Authors: Yoo-Bin Bae, Yeong-Ung Kim, Jun-Oh Park, Hyo-Sung Ahn

Abstract: As multiple and heterogeneous unmanned vehicle systems continue to play an increasingly important role in addressing complex missions in the real world, the need for effective cooperation among unmanned vehicles becomes paramount. The concept of autonomous cooperation, wherein unmanned vehicles cooperate without human intervention or human control, offers promising avenues for enhancing the efficiency and adaptability of intelligence of multiple-heterogeneous unmanned vehicle systems. Despite the growing interest in this domain, as far as the authors are aware, there exists a notable lack of comprehensive literature on defining an explicit concept and classifying levels of autonomous cooperation of multiple-heterogeneous unmanned vehicle systems. In this respect, this article aims to define the explicit concept of autonomous cooperation of multiple-heterogeneous unmanned vehicle systems. Furthermore, we provide a novel criterion to assess the technical maturity of the developed unmanned vehicle systems by classifying the autonomous cooperative levels of multiple-heterogeneous unmanned vehicle systems.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.08385 [pdf, ps, other]

doi 10.3847/1538-4357/ad3939

Regions of suppressed diffusion around supernova remnants?

Authors: Yiwei Bao, Pasquale Blasi, Yang Chen

Abstract: The recent discovery of the so-called TeV halos has attracted much attention. The morphology of the emission requires that the region is characterized by severe suppression of the diffusion coefficient. This finding raises many questions as to its origin: 1) is the suppressed diffusion to be attributed to instabilities induced by the same radiating particles? 2) or does it actually show that the diffusion coefficient is small throughout the disc of the Galaxy? In both cases, one would expect that the surroundings of supernova remnants (SNRs) should also show evidence of reduced diffusion coefficient, since most remnants are located in the disc and are expected to be sites of effective particle acceleration. Should we expect the existence of regions of extended $γ$-ray emission from these regions as well? Here we investigate the transport of cosmic rays (CRs) escaped from SNRs in order to assess the viability of the idea of having a cocoon of suppressed diffusion around them. A comparison of our results with the $γ$-ray emission from the regions around HB9 and W28 does not provide solid evidence of reduced diffusivity. However, if indeed the phenomenon of reduced diffusivity occurs around SNRs surrounded by molecular clouds, our calculations show that the effects on the grammage of Galactic CRs can be significant.

Submitted 18 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: published in ApJ

Journal ref: 2024 ApJ 966 224

arXiv:2405.07964 [pdf, other]

Early-phase simultaneous multiband observations of the Type II supernova SN 2024ggi with Mephisto

Authors: Xinlei Chen, Brajesh Kumar, Xinzhong Er, Helong Guo, Yuan-Pei Yang, Weikang Lin, Yuan Fang, Guowang Du, Chenxu Liu, Jiewei Zhao, Tianyu Zhang, Yuxi Bao, Xingzhu Zou, Yu Pan, Yu Wang, Xufeng Zhu, Kaushik Chatterjee, Xiangkun Liu, Dezi Liu, Edoardo P. Lagioia, Geeta Rangwal, Shiyan Zhong, Jinghua Zhang, Jianhui Lian, Yongzhi Cai, et al. (2 additional authors not shown)

Abstract: We present early-phase good-cadence (hour-to-day) simultaneous multiband ($ugi$ and $vrz$ bands) imaging of the nearby supernova SN~2024ggi, which exploded in the nearby galaxy, NGC 3621. A quick follow-up was conducted within less than a day after the explosion and continued $\sim$23 days. The $uvg$ band light curves display a rapid rise ($\sim$1.4 mag day$^{-1}$) to maximum in $\sim$4 days and absolute magnitude $M_{g}\sim$--17.75 mag. The post-peak decay rate in redder bands is $\sim$0.01 mag day$^{-1}$. Different colors (e.g., $u-g$ and $v-r$) of SN~2024ggi are slightly redder than SN 2023ixf. A significant rise ($\sim$12.5 kK) in black-body temperature (optical) was noticed within $\sim$2 days after the explosion, which successively decreased, indicating shock break out inside a dense circumstellar medium (CSM) surrounding the progenitor. Using semianalytical modeling, the ejecta mass and progenitor radius were estimated as 1.2 $M_\odot$ and $\sim$550 $R_\odot$. The archival deep images ($g,r,i and z$ bands) from the Dark Energy Camera Legacy Survey were examined, and a possible progenitor was detected in each band ($\sim$22--22.5 mag) and had a mass range of 14--17 $M_\odot$.

Submitted 2 August, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: Pages 12, Table 1, Figures 7

Journal ref: ApJL, 2024, 971:L2

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.06598 [pdf, other]

A Lightweight Transformer for Remote Sensing Image Change Captioning

Authors: Dongwei Sun, Yajie Bao, Xiangyong Cao

Abstract: Remote sensing image change captioning (RSICC) aims to automatically generate sentences that describe content differences in remote sensing bitemporal images. Recently, attention-based transformers have become a prevalent idea for capturing the features of global change. However, existing transformer-based RSICC methods face challenges, e.g., high parameters and high computational complexity caused by the self-attention operation in the transformer encoder component. To alleviate these issues, this paper proposes a Sparse Focus Transformer (SFT) for the RSICC task. Specifically, the SFT network consists of three main components, i.e. a high-level features extractor based on a convolutional neural network (CNN), a sparse focus attention mechanism-based transformer encoder network designed to locate and capture changing regions in dual-temporal images, and a description decoder that embeds images and words to generate sentences for captioning differences. The proposed SFT network can reduce the parameter number and computational complexity by incorporating a sparse attention mechanism within the transformer encoder network. Experimental results on various datasets demonstrate that even with a reduction of over 90\% in parameters and computational complexity for the transformer encoder, our proposed network can still obtain competitive performance compared to other state-of-the-art RSICC methods. The code can be available at

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05557 [pdf, ps, other]

Composition Rules for Strong Structural Controllability and Minimum Input Problem in Diffusively-Coupled Networks

Authors: Nam-Jin Park, Seong-Ho Kwon, Yoo-Bin Bae, Byeong-Yeon Kim, Kevin L. Moore, Hyo-Sung Ahn

Abstract: This paper presents new results and reinterpretation of existing conditions for strong structural controllability in a structured network determined by the zero/non-zero patterns of edges. For diffusively-coupled networks with self-loops, we first establish a necessary and sufficient condition for strong structural controllability, based on the concepts of dedicated and sharing nodes. Subsequently, we define several conditions for strong structural controllability across various graph types by decomposing them into disjoint path graphs. We further extend our findings by introducing a composition rule, facilitating the analysis of strong structural controllability in larger networks. This rule allows us to determine the strong structural controllability of connected graphs called pactus graphs (a generalization of the well-known cactus graph) by consideration of the strong structural controllability of its disjoint component graphs. In this process, we introduce the notion of a component input node, which is a state node that functions identically to an external input node. Based on this concept, we present an algorithm with approximate polynomial complexity to determine the minimum number of external input nodes required to maintain strong structural controllability in a diffusively-coupled network with self-loops.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures. arXiv admin note: substantial text overlap with arXiv:2205.05275

arXiv:2404.17837 [pdf, other]

Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs

Authors: Yiming Bao, Xu Zhao, Dahong Qian

Abstract: Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision due to the depth ambiguity of 2D-to-3D lifting. To improve accuracy and address occlusion issues, inertial sensors have been introduced to provide a complementary source of information. However, it remains challenging to integrate heterogeneous sensor data for producing physically rational 3D human poses. In this paper, we propose a novel framework, Real-time Optimization and Fusion (RTOF), to address this issue. We first incorporate sparse inertial orientations into a parametric human skeleton to refine 3D poses in kinematics. The poses are then optimized by energy functions built on both visual and inertial observations to reduce the temporal jitters. Our framework outputs smooth and biomechanically plausible human motion. Comprehensive experiments with ablation studies demonstrate its rationality and efficiency. On the Total Capture dataset, the pose estimation error is significantly decreased compared to the baseline method.

Submitted 27 April, 2024; originally announced April 2024.

Comments: 10 pages, 5 figures, Under Review

arXiv:2404.17582 [pdf, other]

Data Quality in Crowdsourcing and Spamming Behavior Detection

Authors: Yang Ba, Michelle V. Mancenido, Erin K. Chiou, Rong Pan

Abstract: As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credibility. Unlike the simple scenarios where Kappa coefficient and intraclass correlation coefficient usually can apply, online crowdsourcing requires dealing with more complex situations. We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition, and we classify spammers into three categories based on their different behavioral patterns. A spammer index is proposed to assess entire data consistency and two metrics are developed to measure crowd worker's credibility by utilizing the Markov chain and generalized random effects models. Furthermore, we showcase the practicality of our techniques and their advantages by applying them on a face verification task with both simulation and real-world data collected from two crowdsourcing platforms.

Submitted 3 April, 2024; originally announced April 2024.

Comments: Preprint paper, under review on Behavior Research Methods. 45 pages, 10 figures

arXiv:2404.16831 [pdf, other]

The Third Monocular Depth Estimation Challenge

Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set: 10 among them submitted a report describing their approach, highlighting a diffused use of foundational models such as Depth Anything at the core of their method. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.

Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: To appear in CVPRW2024

arXiv:2404.11929 [pdf, other]

A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease

Authors: Walid Abdullah Al, Il Dong Yun, Yun Jung Bae

Abstract: Dopamine transporter (DAT) imaging is commonly used for monitoring Parkinson's disease (PD), where striatal DAT uptake amount is computed to assess PD severity. However, DAT imaging has a high cost and the risk of radiance exposure and is not available in general clinics. Recently, MRI patch of the nigral region has been proposed as a safer and easier alternative. This paper proposes a symmetric regressor for predicting the DAT uptake amount from the nigral MRI patch. Acknowledging the symmetry between the right and left nigrae, the proposed regressor incorporates a paired input-output model that simultaneously predicts the DAT uptake amounts for both the right and left striata. Moreover, it employs a symmetric loss that imposes a constraint on the difference between right-to-left predictions, resembling the high correlation in DAT uptake amounts in the two lateral sides. Additionally, we propose a symmetric Monte-Carlo (MC) dropout method for providing a fruitful uncertainty estimate of the DAT uptake prediction, which utilizes the above symmetry. We evaluated the proposed approach on 734 nigral patches, which demonstrated significantly improved performance of the symmetric regressor compared with the standard regressors while giving better explainability and feature representation. The symmetric MC dropout also gave precise uncertainty ranges with a high probability of including the true DAT uptake amounts within the range.

Submitted 30 July, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.09482 [pdf, other]

Binary microlensing by high eccentric stellar-mass black hole binaries

Authors: Kyungmin Kim, Yeong-Bok Bae, Yoon-Hyun Ryu

Abstract: Microlensing is one of the most promising tools for discovering stellar-mass black holes (BHs) in the Milky Way because it allows us to probe dark or faint celestial compact objects. While the existence of stellar-mass BHs has been confirmed through observation of X-ray binaries within our galaxy and gravitational waves from extragalactic BH binaries, a conclusive observation of microlensing events caused by Galactic BH binaries has yet to be achieved. In this study, we focus on those with high eccentricity, including unbound orbits, which can dynamically form in star clusters and could potentially increase the observation rate. We demonstrate parameter estimation for simulated light curves supposing various orbital configurations of BH binary lenses. We employ a model-based fitting using the Nelder-Mead method and Bayesian inference based on the Markov chain Monte Carlo method for the demonstration. The results show that we can retrieve true values of the parameters of high eccentric BH binary lenses within the 1$σ$ uncertainty of inferred values. We conclude it is feasible to find high eccentric Galactic BH binaries from the observation of binary microlensing events.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 12 pages, 9 figures, 4 tables

arXiv:2404.08217 [pdf, other]

Avoid Arguments and Escape with Your Self: Expressive Subtyping and Decidable Bidirectional Checking for Reachability Types

Authors: Songlin Jia, Guannan Wei, Siyuan He, Yuyan Bao, Tiark Rompf

Abstract: Despite Rust's success in systems programming, its ``shared XOR mutable'' principle significantly restricts how mutable values can be used, precluding many useful functional programming idioms. Reachability types are a recent proposal to address the key limitations of Rust-style approaches by tracking, rather than prohibiting, shared, escaping, and mutable data, even in the presence of higher-order functions and polymorphic types. The key to enabling such expressiveness is the notion of self-references in reachability qualifiers. However, self-references present major challenges in designing expressive subtyping and decidable type checking algorithms, since self-references are neither fully covariant nor fully contravariant, yet still need to vary in certain circumstances. This lack of an effective type checking algorithm is a key impediment toward making reachability types truly practical, and leveraging them to bring the benefits of programming with lifetimes and sharing to practical higher-level languages. In this paper, we investigate the issues of subtyping and type checking of self-references for reachability types. We address key gaps in previous work by proposing a refined notion of subtyping, which more smoothly supports features such as Church-encoded datatypes, making the overall system more expressive. We also develop a sound and decidable bidirectional type checking algorithm, implemented and verified in Coq.

Submitted 15 July, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2403.20134 [pdf, other]

User Modeling Challenges in Interactive AI Assistant Systems

Authors: Megan Su, Yuwei Bao

Abstract: Interactive Artificial Intelligence (AI) assistant systems are designed to offer timely guidance to help human users complete a variety of tasks. One of the remaining challenges is to understand users' mental states during the task for more personalized guidance. In this work, we analyze users' mental states during task executions and investigate the capabilities and challenges for large language models to interpret user profiles for more personalized user guidance.

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.19821 [pdf]

Formation of Oriented Bilayer Motif -- Vanadyl Phthalocyanine on Ag(100)

Authors: William Koll, Corina Urdaniz, Kyungju Noh, Yujeong Bae, Christoph Wolf, Jay Gupta

Abstract: The adsorption and self-assembly of vanadyl phthalocyanine molecules on Ag(100) has been investigated using a combination of scanning tunneling microscopy and density functional theory. At sub-monolayer coverage, we observe two distinct adsorption configurations of isolated molecules, corresponding to the central O atom pointing toward (O-down) or away (O-up) from the substrate. Upon adsorption in the O-up orientation, the otherwise achiral molecules take on a windmill-like chiral appearance due to their interaction with the substrate. At monolayer coverage, we observe a self-assembled square lattice with a mixture of O-up and O-down molecules. At higher coverage we find a strong preference for bilayer formation with O-up and O-down molecules in alternating layers, suggesting stabilization by dipolar interactions. Close inspection of the multi-layer surface reveals grain boundaries separating domains of opposite organizational chirality, and long-range ordering.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 15 pages, 5 figures

arXiv:2403.17069 [pdf, other]

Tensor network formulation of symmetry protected topological phases in mixed states

Authors: Hanyu Xue, Jong Yeon Lee, Yimu Bao

Abstract: We define and classify symmetry-protected topological (SPT) phases in mixed states based on the tensor network formulation of the density matrix. In one dimension, we introduce strong injective matrix product density operators (MPDO), which describe a broad class of short-range correlated mixed states, including the locally decohered SPT states. We map strong injective MPDO to a pure state in the doubled Hilbert space and define the SPT phases according to the cohomology class of the symmetry group in the doubled state. Although the doubled state exhibits an enlarged symmetry, the possible SPT phases are also constrained by the Hermiticity and the semi-positivity of the density matrix. We here obtain a complete classification of SPT phases with a direct product of strong $G$ and weak $K$ unitary symmetry given by the cohomology group $H^2(G, \text{U}(1))\oplus H^1(K, H^1(G, \text{U}(1)))$. The SPT phases in our definition are preserved under symmetric local circuits consisting of non-degenerate channels. This motivates an alternative definition of SPT phases according to the equivalence class of mixed states under a ``one-way" connection using symmetric non-degenerate channels. In locally purifiable MPDO with strong symmetry, we prove that this alternative definition reproduces the cohomology classification. We further extend our results to two-dimensional mixed states described by strong semi-injective tensor network density operators and classify the possible SPT phases.

Submitted 15 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Appendix D is fixed

arXiv:2403.16541 [pdf, ps, other]

Effects of tensor spin polarization on the chiral restoration and deconfinement phase transitions

Authors: Yan-Ru Bao, Sheng-Qin Feng

Abstract: Effects of tensor spin polarization (TSP) on the chiral restoration and deconfinement phase transitions are studied in Polyakov loop extended Nambu-Jona-Lasinio (PNJL) model. For chiral phase transition, the higher the polarized degree of quark-antiquark pairs under the strong magnetic field, the higher the phase transition temperature. The TSP corrects the position of the critical end point. The small impact of TSP on the phase transition temperature is found for the deconfinement phase transition. On the other hand, we divide the phase space into three ranges based on the phase diagram obtained from the PNJL model: the confinement phase with chiral symmetry broken, the deconfinement phase with restored chiral symmetry, and the confinement phase with restored chiral symmetry (quarkyonic phase). It is found that TSP has only a very small effect on the anisotropic pressure in the deconfined phase with chiral symmetry restored and the quarkyonic phase, but it has a very strong effect on the anisotropic pressure in the confined phase with chiral symmetry broken. This is because TSP is closely related to chiral symmetry. The restoration of chiral symmetry means the dissociation of spin polarization condensate.

Submitted 23 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 20 pages, 7 figures

Journal ref: Physical Review D 109, 096033 (2024)

arXiv:2403.14874 [pdf, other]

WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and found that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first semantic segmentation dataset with accurate clear and adverse weather image pairs that share an underlying scene. Through this dataset, we analyze the error modes in existing models and found that they were sensitive to the highly complex combination of different weather effects induced on the image during capture. To improve robustness, we propose a way to use language as guidance by identifying contributions of adverse weather conditions and injecting that as "side information". Models trained using our language guidance exhibit performance gains by up to 10.2% in mIoU on WeatherProof, up to 8.44% in mIoU on the widely used ACDC dataset compared to standard training techniques, and up to 6.21% in mIoU on the ACDC dataset as compared to previous SOTA methods.

Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

arXiv:2403.14541 [pdf, other]

EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling

Authors: Shimao Zhang, Yu Bao, Shujian Huang

Abstract: Recently, Large Language Models (LLMs) have demonstrated outstanding performance across a wide range of downstream language tasks. Temperature sampling is a commonly used decoding strategy for LLMs' generation process. However, a fixed temperature parameter is used in most cases, which may not always be an optimal choice for balancing generation quality and diversity. In this paper, we propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method, to achieve a more balanced performance in terms of both generation quality and diversity by dynamically selecting the temperature parameter. Additionally, we also show model performance and comprehensive analyses for 4 different generation benchmarks. Our experiments show that EDT significantly outperforms the existing strategies across different tasks.

Submitted 3 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Showing 1–50 of 533 results for author: Bae, Y