Search | arXiv e-print repository

arXiv:2408.17300 [pdf, other]

Variational quantum simulation of ground states and thermal states for lattice gauge theory with multi-objective optimization

Authors: Lang-Xing Cheng, Dan-Bo Zhang

Abstract: Variational quantum algorithms provide feasible approaches for simulating quantum systems and are applied widely. For lattice gauge theory, however, variational quantum simulation faces a challenge as local gauge invariance enforces a constraint on the physical Hilbert space. In this paper, we incorporate multi-objective optimization for variational quantum simulation of lattice gauge theory at zero and finite temperatures. By setting the energy or free energy of the system and the penalty for enforcing local gauge invariance as two objectives, the multi-objective optimization can self-adjust the proper weighting for the two objectives and thus faithfully simulate the gauge theory in the physical Hilbert space. Specifically, we propose a variational quantum eigensolver and a variational quantum thermalizer for preparing the ground states and thermal states of lattice gauge theory, respectively. We demonstrate the quantum algorithms for a $Z_2$ lattice gauge theory with spinless fermions in one dimension. With numerical simulations, the multi-objective optimization shows that minimizing energy (free energy) and enforcing the local gauge invariance can be achieved simultaneously at zero temperature (finite temperature). The multi-objective optimization suggests a feasible ingredient for quantum simulation of complicated physical systems on near-term quantum devices.

Submitted 30 August, 2024; originally announced August 2024.

arXiv:2408.17071 [pdf, other]

Search for $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0h_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (653 additional authors not shown)

Abstract: Using $(2712.4 \pm 14.3) \times 10^6$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we search for the hadronic transition $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0 h_c$. No significant signal is observed. We set the most stringent upper limits to date on the branching fractions $\mathcal{B}(ψ(3686)\to π^0 h_c)\times\mathcal{B}(h_c\toπ^+π^-J/ψ)$ and $\mathcal{B}(h_c \to π^+π^-J/ψ)$ at the 90% confidence level, which are determined to be $6.7\times 10^{-7}$ and $9.4 \times10^{-4}$, respectively.

Submitted 30 August, 2024; originally announced August 2024.

arXiv:2408.16654 [pdf, other]

Measurement of the Decay $Ξ^{0}\toΛγ$ with Entangled $Ξ^{0}\barΞ^{0}$ Pairs

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

Abstract: In this Letter, a systematic study of the weak radiative hyperon decay $Ξ^{0}\toΛγ$ at an electron-positron collider using entangled $Ξ^{0}\barΞ^{0}$ pair events is presented. The absolute branching fraction for this decay has been measured for the first time, and is $\left(1.347 \pm 0.066_{\mathrm{stat.}}\pm0.054_{\mathrm{syst.}}\right)\times 10^{-3}$. The decay asymmetry parameter, which characterizes the effect of parity violation in the decay, is determined to be $-0.741 \pm 0.062_{\mathrm{stat.}}\pm 0.019_{\mathrm{syst.}}$. The obtained results are consistent with the world average values within the uncertainties, offering valuable insights into the underlying mechanism governing the weak radiative hyperon decays. The charge conjugation parity ($CP$) symmetries of branching fraction and decay asymmetry parameter in the decay are also studied. No statistically significant violation of charge conjugation parity symmetry is observed.

Submitted 29 August, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

Comments: 10 pages, 3 figures

arXiv:2408.16646 [pdf, other]

Study of the rare decay $J/ψ\to μ^+μ^-μ^+μ^-$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1096 additional authors not shown)

Abstract: The rare electromagnetic $J/ψ\to μ^+μ^-μ^+μ^-$ decay is observed with a significance greatly exceeding the discovery threshold, using proton-proton collision data collected by the LHCb experiment during 2016-2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of $5.4\,\text{fb}^{-1}$. The rate of this decay is measured relative to that of the $J/ψ\to μ^+μ^-$ mode. Using the QED model for the four-muon decay in the efficiency estimation, its branching fraction is determined to be \begin{equation*} {\mathcal{B}}(J/ψ\to μ^+μ^-μ^+μ^-) = (1.13\pm0.10\pm0.05\pm0.01)\times 10^{-6}, \end{equation*} where the uncertainties are statistical, systematic and due to the uncertainty on the branching fraction of the $J/ψ\to μ^+μ^-$ decay.

Submitted 29 August, 2024; originally announced August 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3453 (LHCb public pages)

Report number: LHCb-PAPER-2024-016, CERN-EP-2024-201

arXiv:2408.16579 [pdf, other]

A Compaction Function Analysis of CMB $μ$ distortion Constraints on Primordial Black Holes

Authors: Junyue Yang, Xiaoding Wang, Xiao-Han Ma, Dongdong Zhang, Sheng-Feng Yan, Amara Ilyas, Yi-Fu Cai

Abstract: Primordial black holes (PBHs) are considered viable candidates for dark matter and the seeds of supermassive black holes (SMBHs), with their fruitful physical influences providing significant insights into the conditions of the early Universe. Cosmic microwave background (CMB) $μ$ distortions have recently placed tight constraints on the abundance of PBHs in the mass range of $10^4 \sim 10^{11} M_{\odot}$, limiting their potential to serve as seeds for the SMBHs observed. Given that $μ$ distortions directly constrain the primordial power spectrum, it is crucial to employ more precise methods in computing PBH abundance to strengthen the reliability of these constraints. Using a Press-Schechter (PS) type method utilizing the compaction function, we find that the abundance of PBHs could be higher than previously estimated constraints from $μ$ distortion observations. Furthermore, our analysis shows that variations in the shape of the power spectrum have a negligible impact on our conclusions within the mass ranges under consideration. This conclusion provides a perspective for further research on the constraints on PBHs from $μ$ distortions.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 9 pages, 3 figures

arXiv:2408.16326 [pdf, other]

Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic

Authors: Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun

Abstract: Self-critic has become an important mechanism for enhancing the reasoning performance of LLMs. However, current approaches mainly involve basic prompts without further training, which tend to be over-simplified, leading to limited accuracy. Moreover, there is a lack of in-depth investigation of the relationship between an LLM's ability to critique and its task-solving performance. To address these issues, we propose Critic-CoT, a novel framework that pushes LLMs toward System-2-like critic capability, via step-wise CoT reasoning format and distant-supervision data construction, without the need for human annotation. Experiments on GSM8K and MATH show that via filtering out invalid solutions or iterative refinement, our enhanced model boosts task-solving performance, which demonstrates the effectiveness of our method. Further, we find that training on critique and refinement alone improves the generation. We hope our work could shed light on future research on improving the reasoning and critic ability of LLMs.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16303 [pdf, other]

Enhanced Control for Diffusion Bridge in Image Restoration

Authors: Conghan Yue, Zhengwei Peng, Junlong Ma, Dongyu Zhang

Abstract: Image restoration refers to the process of restoring a damaged low-quality image back to its corresponding high-quality image. Typically, we use convolutional neural networks to directly learn the mapping from low-quality images to high-quality images, achieving image restoration. Recently, a special type of diffusion bridge model has achieved more advanced results in image restoration. It can transform the direct mapping from low-quality to high-quality images into a diffusion process, restoring low-quality images through a reverse process. However, the current diffusion bridge restoration models do not emphasize the idea of conditional control, which may affect performance. This paper introduces the ECDB model, enhancing the control of the diffusion bridge with low-quality images as conditions. Moreover, in response to the characteristic of diffusion models having low denoising level at larger values of $\bm{t}$, we also propose a Conditional Fusion Schedule, which more effectively handles the conditional feature information of various modules. Experimental results prove that the ECDB model has achieved state-of-the-art results in many image restoration tasks, including deraining, inpainting and super-resolution. Code is available at https://github.com/Hammour-steak/ECDB.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16279 [pdf, ps, other]

Model-independent determination of the strong-phase difference between $D^0$ and $\bar{D}^0 \to π^+π^-π^+π^-$ decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (647 additional authors not shown)

Abstract: Measurements of the strong-phase difference between $D^0$ and $\bar{D}^0\toπ^+π^-π^+π^-$ are performed in bins of phase space. The study exploits a sample of quantum-correlated $D\bar{D}$ mesons collected by the BESIII experiment in $e^+e^-$ collisions at a center-of-mass energy of 3.773 GeV, corresponding to an integrated luminosity of 2.93 fb$^{-1}$. Here, $D$ denotes a neutral charm meson in a superposition of flavor eigenstates. The reported results are valuable for measurements of the $C\!P$-violating phase $γ$ (also denoted $φ_3$) in $B^\pm \to DK^\pm$, $D \to π^+π^-π^+π^-$ decays, and the binning schemes are designed to provide good statistical sensitivity to this parameter. The expected uncertainty on $γ$ arising from the precision of the strong-phase measurements, when applied to very large samples of $B$-meson decays, is around $1.5^\circ$ or $2^\circ$, depending on the binning scheme. The binned strong-phase parameters are combined to give a value of $F_+^{4π} = 0.746 \pm 0.010 \pm 0.004$ for the $C\!P$-even fraction of $D^0 \to π^+π^-π^+π^-$ decays, which is around 30% more precise than the previous best measurement of this quantity.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16260 [pdf, other]

A General Framework for Optimizing and Learning Nash Equilibrium

Authors: Di Zhang, Wei Gu, Qing Jin

Abstract: One key in real-life Nash equilibrium applications is to calibrate players' cost functions. To leverage the approximation ability of neural networks, we propose a general framework for optimizing and learning Nash equilibrium using neural networks to estimate players' cost functions. Depending on the availability of data, we propose two approaches: (a) the two-stage approach: we need the data pair of players' strategy and relevant function value to first learn the players' cost functions by monotonic neural networks or graph neural networks, and then solve the Nash equilibrium with the learned neural networks; (b) the joint approach: we use the data of partial true observation of the equilibrium and contextual information (e.g., weather) to optimize and learn Nash equilibrium simultaneously. The problem is formulated as an optimization problem with equilibrium constraints and solved using a modified Backpropagation Algorithm. The proposed methods are validated in numerical experiments.

Submitted 29 August, 2024; originally announced August 2024.

arXiv:2408.16173 [pdf, other]

LLM-assisted Labeling Function Generation for Semantic Type Detection

Authors: Chenjie Li, Dan Zhang, Jin Wang

Abstract: Detecting semantic types of columns in data lake tables is an important application. A key bottleneck in semantic type detection is the availability of human annotation due to the inherent complexity of data lakes. In this paper, we propose using programmatic weak supervision to assist in annotating the training data for semantic type detection by leveraging labeling functions. One challenge in this process is the difficulty of manually writing labeling functions due to the large volume and low quality of the data lake table datasets. To address this issue, we explore employing Large Language Models (LLMs) for labeling function generation and introduce several prompt engineering strategies for this purpose. We conduct experiments on real-world web table datasets. Based on the initial results, we perform extensive analysis and provide empirical insights and future directions for researchers in this field.

Submitted 28 August, 2024; originally announced August 2024.

Comments: VLDB'24-DATAI

arXiv:2408.15971 [pdf, other]

BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems

Authors: Wei Wang, Dan Zhang, Tao Feng, Boyan Wang, Jie Tang

Abstract: Large Language Models (LLMs) are becoming increasingly powerful and capable of handling complex tasks, e.g., building single agents and multi-agent systems. Compared to single agents, multi-agent systems have higher requirements for the collaboration capabilities of language models. Many benchmarks have been proposed to evaluate their collaborative abilities. However, these benchmarks lack fine-grained evaluations of LLM collaborative capabilities. Additionally, multi-agent collaborative and competitive scenarios are ignored in existing works. To address these two problems, we propose a benchmark, called BattleAgentBench, which defines seven sub-stages of three varying difficulty levels and conducts a fine-grained evaluation of language models in terms of single-agent scenario navigation capabilities, paired-agent task execution abilities, and multi-agent collaboration and competition capabilities. We conducted extensive evaluations on four leading closed-source and seven open-source models. Experimental results indicate that API-based models perform excellently on simple tasks but open-source small models struggle with simple tasks. Regarding difficult tasks that require collaborative and competitive abilities, although API-based models have demonstrated some collaborative capabilities, there is still enormous room for improvement.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.15781 [pdf, other]

Determining non-Hermitian parent Hamiltonian from a single eigenstate

Authors: Xu-Dan Xie, Zheng-Yuan Xue, Dan-Bo Zhang

Abstract: For a quantum state to be an eigenstate of some local Hamiltonian, it must satisfy the constraint of zero energy variance; this constraint is strong enough that a single eigenstate may uniquely determine the Hamiltonian. For non-Hermitian systems, it is natural to expect that determining the Hamiltonian requires a pair of both left and right eigenstates. Here, we observe that it can be sufficient to determine a non-Hermitian Hamiltonian from a single right or left eigenstate. Our approach is based on the quantum covariance matrix, where the solution of the Hamiltonian corresponds to the complex null vector. Our scheme favours non-Hermitian Hamiltonian learning on experimental quantum systems, as only the right eigenstates there can be accessed. Furthermore, we use numerical simulations to examine the effects of measurement errors and show the stability of our scheme.

Submitted 28 August, 2024; originally announced August 2024.

arXiv:2408.15227 [pdf, other]

Axion Dark Matter eXperiment around 3.3 μeV with Dine-Fischler-Srednicki-Zhitnitsky Discovery Ability

Authors: C. Bartram, C. Boutan, T. Braine, J. H. Buckley, T. J. Caligiure, G. Carosi, A. S. Chou, C. Cisneros, John Clarke, E. J. Daw, N. Du, L. D. Duffy, T. A. Dyson, C. Gaikwad, J. R. Gleason, C. Goodman, M. Goryachev, M. Guzzetti, C. Hanretty, E. Hartman, A. T. Hipp, J. Hoffman, M. Hollister, R. Khatiwada, S. Knirck, et al. (24 additional authors not shown)

Abstract: We report the results of a QCD axion dark matter search with discovery ability for Dine-Fischler-Srednicki-Zhitnitsky (DFSZ) axions using an axion haloscope. Sub-Kelvin noise temperatures are reached with an ultra low-noise Josephson parametric amplifier cooled by a dilution refrigerator. This work excludes (with a 90% confidence level) DFSZ axions with masses between 3.27 and 3.34 μeV, assuming a standard halo model with a local energy density of 0.45 GeV/cc made up 100% of axions.

Submitted 27 August, 2024; originally announced August 2024.

arXiv:2408.14380 [pdf, other]

Probing Causality Manipulation of Large Language Models

Authors: Chenyang Zhang, Haibo Tong, Bin Zhang, Dongyu Zhang

Abstract: Large language models (LLMs) have shown various abilities in natural language processing, including on problems about causality. It is not intuitive for LLMs to command causality, since pretrained models usually work on statistical associations and do not focus on causes and effects in sentences. Probing the internal manipulation of causality in LLMs is therefore necessary. This paper proposes a novel approach to probe causality manipulation hierarchically, by providing different shortcuts to models and observing their behaviors. We exploit retrieval augmented generation (RAG) and in-context learning (ICL) for models on a designed causality classification task. We conduct experiments on mainstream LLMs, including GPT-4 and some smaller and domain-specific models. Our results suggest that LLMs can detect entities related to causality and recognize direct causal relationships. However, LLMs lack specialized cognition for causality, merely treating it as part of the global semantics of the sentence.

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.13988 [pdf, other]

Automatic Medical Report Generation: Methods and Applications

Authors: Li Guo, Anas M. Tahir, Dong Zhang, Z. Jane Wang, Rabab K. Ward

Abstract: The increasing demand for medical imaging has surpassed the capacity of available radiologists, leading to diagnostic delays and potential misdiagnoses. Artificial intelligence (AI) techniques, particularly in automatic medical report generation (AMRG), offer a promising solution to this dilemma. This review comprehensively examines AMRG methods from 2021 to 2024. It (i) presents solutions to primary challenges in this field, (ii) explores AMRG applications across various imaging modalities, (iii) introduces publicly available datasets, (iv) outlines evaluation metrics, (v) identifies techniques that significantly enhance model performance, and (vi) discusses unresolved issues and potential future research directions. This paper aims to provide a comprehensive understanding of the existing literature and inspire valuable future research.

Submitted 25 August, 2024; originally announced August 2024.

Comments: 42 pages and 9 figures

arXiv:2408.13789 [pdf]

Multi-watt long-wavelength infrared femtosecond lasers and resonant enamel ablation

Authors: Xuemei Yang, Dunxiang Zhang, Weizhe Wang, Kan Tian, Linzhen He, Jinmiao Guo, Bo Hu, Tao Pu, Wenlong Li, Shiran Sun, Chunmei Ding, Han Wu, Kenkai Li, Yujie Peng, Jianshu Li, Yuxin Leng, Houkun Liang

Abstract: High-power broadband tunable long-wavelength infrared (LWIR) femtosecond lasers operating at fingerprint wavelengths of 7-14 μm hold significant promise across a range of applications, including molecular hyperspectral imaging, strong-field light-matter interaction, and resonant tissue ablation. Here we present a 6-12 μm broadband tunable parametric amplifier based on LiGaS2 or BaGa4S7, generating a new record output power of 2.4 W at 7.5 μm, and 1.5 W at 9.5 μm, pumped by a simple and effective thin-square-rod Yb:YAG amplifier producing 110 W 274 fs output pulses. As a proof of concept, we showcase efficient resonant ablation and microstructure fabrication on enamel at the hydroxyapatite resonant wavelength of 9.5 μm, with a laser intensity two orders of magnitude lower than that required by non-resonant femtosecond lasers, which could foster more precise surgical applications with superior biosafety.

Submitted 25 August, 2024; originally announced August 2024.

arXiv:2408.13548 [pdf, ps, other]

Admissible weak factorization systems on extriangulated categories

Authors: Yajun Ma, Hanyang You, Dongdong Zhang, Panyue Zhou

Abstract: Extriangulated categories, introduced by Nakaoka and Palu, serve as a simultaneous generalization of exact and triangulated categories. In this paper, we first introduce the concept of admissible weak factorization systems and establish a bijection between cotorsion pairs and admissible weak factorization systems in extriangulated categories. Consequently, we give the equivalences between hereditary cotorsion pairs and compatible cotorsion pairs via admissible weak factorization systems under certain conditions in extriangulated categories, thereby generalizing a result by Di, Li, and Liang.

Submitted 27 August, 2024; v1 submitted 24 August, 2024; originally announced August 2024.

Comments: 13 pages

arXiv:2408.13459 [pdf, other]

Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model

Authors: Chen Rao, Guangyuan Li, Zehua Lan, Jiakai Sun, Junsheng Luan, Wei Xing, Lei Zhao, Huaizhong Lin, Jianfeng Dong, Dalong Zhang

Abstract: Current video deblurring methods have limitations in recovering high-frequency information since the regression losses are conservative with high-frequency details. Since Diffusion Models (DMs) have strong capabilities in generating high-frequency details, we consider introducing DMs into the video deblurring task. However, we found that directly applying DMs to the video deblurring task has the following problems: (1) DMs require many iteration steps to generate videos from Gaussian noise, which consumes many computational resources. (2) DMs are easily misled by the blurry artifacts in the video, resulting in irrational content and distortion of the deblurred video. To address the above issues, we propose a novel video deblurring framework VD-Diff that integrates the diffusion model into the Wavelet-Aware Dynamic Transformer (WADT). Specifically, we perform the diffusion model in a highly compact latent space to generate prior features containing high-frequency information that conforms to the ground truth distribution. We design the WADT to preserve and recover the low-frequency information in the video while utilizing the high-frequency information generated by the diffusion model. Extensive experiments show that our proposed VD-Diff outperforms SOTA methods on GoPro, DVD, BSD, and Real-World Video datasets.

Submitted 24 August, 2024; originally announced August 2024.

Comments: accepted by ECCV2024

ACM Class: I.4.4

arXiv:2408.13161 [pdf, other]

Say No to Freeloader: Protecting Intellectual Property of Your Deep Model

Authors: Lianyu Wang, Meng Wang, Huazhu Fu, Daoqiang Zhang

Abstract: Model intellectual property (IP) protection has attracted growing attention as science and technology advancements stem from human intellectual labor and computational expenses. Ensuring IP safety for trainers and owners is of utmost importance, particularly in domains where ownership verification and applicability authorization are required. A notable approach to safeguarding model IP involves proactively preventing the use of well-trained models of authorized domains from unauthorized domains. In this paper, we introduce a novel Compact Un-transferable Pyramid Isolation Domain (CUPI-Domain) which serves as a barrier against illegal transfers from authorized to unauthorized domains. Drawing inspiration from human transitive inference and learning abilities, the CUPI-Domain is designed to obstruct cross-domain transfers by emphasizing the distinctive style features of the authorized domain. This emphasis leads to failure in recognizing irrelevant private style features on unauthorized domains. To this end, we propose novel CUPI-Domain generators, which select features from both authorized and CUPI-Domain as anchors. Then, we fuse the style features and semantic features of these anchors to generate labeled and style-rich CUPI-Domain. Additionally, we design external Domain-Information Memory Banks (DIMB) for storing and updating labeled pyramid features to obtain stable domain class features and domain class-wise style features. Based on the proposed whole method, the novel style and discriminative loss functions are designed to effectively enhance the distinction in style and discriminative features between authorized and unauthorized domains, respectively. Moreover, we provide two solutions for utilizing CUPI-Domain based on whether the unauthorized domain is known: target-specified CUPI-Domain and target-free CUPI-Domain.

Submitted 27 August, 2024; v1 submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.12528 [pdf, other]

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Authors: Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou

Abstract: We present a unified transformer, i.e., Show-o, that unifies multimodal understanding and generation. Unlike fully autoregressive models, Show-o unifies autoregressive and (discrete) diffusion modeling to adaptively handle inputs and outputs of various and mixed modalities. The unified model flexibly supports a wide range of vision-language tasks including visual question-answering, text-to-image generation, text-guided inpainting/extrapolation, and mixed-modality generation. Across various benchmarks, it demonstrates comparable or superior performance to existing individual models with an equivalent or larger number of parameters tailored for understanding or generation. This significantly highlights its potential as a next-generation foundation model. Code and models are released at https://github.com/showlab/Show-o.

Submitted 25 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

Comments: Technical Report

arXiv:2408.11813 [pdf, other]

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Authors: Yuanyang Yin, Yaqi Zhao, Yajie Zhang, Ke Lin, Jiahao Wang, Xin Tao, Pengfei Wan, Di Zhang, Baoqun Yin, Wentao Zhang

Abstract: Multimodal Large Language Models (MLLMs) have recently demonstrated remarkable perceptual and reasoning abilities, typically comprising a Vision Encoder, an Adapter, and a Large Language Model (LLM). The adapter serves as the critical bridge between the visual and language components. However, training adapters with image-level supervision often results in significant misalignment, undermining the LLMs' capabilities and limiting the potential of Multimodal LLMs. To address this, we introduce Supervised Embedding Alignment (SEA), a token-level alignment method that leverages vision-language pre-trained models, such as CLIP, to align visual tokens with the LLM's embedding space through contrastive learning. This approach ensures a more coherent integration of visual and language representations, enhancing the performance and interpretability of multimodal LLMs while preserving their inherent capabilities. Extensive experiments show that SEA effectively improves MLLMs, particularly for smaller models, without adding extra data or inference computation. SEA also lays the groundwork for developing more general and adaptable solutions to enhance multimodal systems.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11240 [pdf, ps, other]

Asymmetric Graph Error Control with Low Complexity in Causal Bandits

Authors: Chen Peng, Di Zhang, Urbashi Mitra

Abstract: In this paper, the causal bandit problem is investigated, in which the objective is to select an optimal sequence of interventions on nodes in a causal graph. It is assumed that the graph is governed by linear structural equations; it is further assumed that both the causal topology and the distribution of interventions are unknown. By exploiting the causal relationships between the nodes whose signals contribute to the reward, interventions are optimized. First, based on the difference between the two types of graph identification errors (false positives and negatives), a causal graph learning method is proposed, which strongly reduces sample complexity relative to the prior art by learning sub-graphs. Under the assumption of Gaussian exogenous inputs and minimum-mean squared error weight estimation, a new uncertainty bound tailored to the causal bandit problem is derived. This uncertainty bound drives an upper confidence bound based intervention selection to optimize the reward. To cope with non-stationary bandits, a sub-graph change detection mechanism is proposed, with high sample efficiency. Numerical results compare the new methodology to existing schemes and show a substantial performance improvement in both stationary and non-stationary settings. Compared to existing approaches, the proposed scheme takes 67% fewer samples to learn the causal structure and achieves an average reward gain of 85%.

Submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.10313 [pdf, other]

$\text{AdS}_4$ Holography and the Hilbert Scheme

Authors: Samuel Crew, Daniel Zhang, Ziruo Zhang

Abstract: We elucidate a holographic relationship between the enumerative geometry of the Hilbert scheme of $N$ points in the plane $\mathbb{C}^2$, with $N$ large, and the entropy of certain magnetically charged black holes with $\text{AdS}_4$ asymptotics. Specifically, we demonstrate how the entropy functional arises from the asymptotics of 't Hooft and Wilson line operators in a 3d $\mathcal{N}= 4$ gauge theory. The gauge-Bethe correspondence allows us to interpret this calculation in terms of the enumerative geometry of the Hilbert scheme and thereby conjecture that the entropy is saturated by expectation values of certain natural operators in the quantum $K$-theory ring acting on the localised $K$-theory of the Hilbert scheme. We give numerical evidence that the large $N$ limit is saturated by contributions from a certain vacuum/fixed point on the Hilbert scheme, associated to a particular triangular-shaped Young diagram, by evolving solutions to the Bethe equations numerically at finite (but large) $N$ towards the classical limit. We thus conjecture a concrete geometric holographic dual of the so-called gravitational/Cardy block.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 46 pages, 11 figures

arXiv:2408.10115 [pdf, other]

GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization

Authors: Ran Liu, Ming Liu, Min Yu, Jianguo Jiang, Gang Li, Dan Zhang, Jingyuan Li, Xiang Meng, Weiqing Huang

Abstract: Pre-trained language models are increasingly being used in multi-document summarization tasks. However, these models need large-scale corpora for pre-training and are domain-dependent. Other non-neural unsupervised summarization approaches mostly rely on key sentence extraction, which can lead to information loss. To address these challenges, we propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach. It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts, thereby improving intra-cluster correlation and the fluency of generated sentences. Finally, it summarizes clusters into natural sentences. Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches. Furthermore, it surpasses state-of-the-art pre-trained multi-document summarization models (e.g. PEGASUS and PRIMERA) under zero-shot settings in terms of ROUGE scores. Additionally, human evaluations indicate that summaries generated by GLIMMER achieve high readability and informativeness scores. Our code is available at https://github.com/Oswald1997/GLIMMER.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 19 pages, 7 figures. Accepted by ECAI 2024

arXiv:2408.09949 [pdf, other]

C${^2}$RL: Content and Context Representation Learning for Gloss-free Sign Language Translation and Retrieval

Authors: Zhigang Chen, Benjia Zhou, Yiqing Huang, Jun Wan, Yibo Hu, Hailin Shi, Yanyan Liang, Zhen Lei, Du Zhang

Abstract: Sign Language Representation Learning (SLRL) is crucial for a range of sign language-related downstream tasks such as Sign Language Translation (SLT) and Sign Language Retrieval (SLRet). Recently, many gloss-based and gloss-free SLRL methods have been proposed, showing promising performance. Among them, the gloss-free approach shows promise for strong scalability without relying on gloss annotations. However, it currently faces suboptimal solutions due to challenges in encoding the intricate, context-sensitive characteristics of sign language videos, mainly struggling to discern essential sign features using a non-monotonic video-text alignment strategy. Therefore, we introduce an innovative pretraining paradigm for gloss-free SLRL, called C${^2}$RL, in this paper. Specifically, rather than merely incorporating a non-monotonic semantic alignment of video and text to learn language-oriented sign features, we emphasize two pivotal aspects of SLRL: Implicit Content Learning (ICL) and Explicit Context Learning (ECL). ICL delves into the content of communication, capturing the nuances, emphasis, timing, and rhythm of the signs. In contrast, ECL focuses on understanding the contextual meaning of signs and converting them into equivalent sentences. Despite its simplicity, extensive experiments confirm that the joint optimization of ICL and ECL results in robust sign language representation and significant performance gains in gloss-free SLT and SLRet tasks. Notably, C${^2}$RL improves the BLEU-4 score by +5.3 on P14T, +10.6 on CSL-daily, +6.2 on OpenASL, and +1.3 on How2Sign. It also boosts the R@1 score by +8.3 on P14T, +14.4 on CSL-daily, and +5.9 on How2Sign. Additionally, we set a new baseline for the OpenASL dataset in the SLRet task.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.09768 [pdf, other]

MalLight: Influence-Aware Coordinated Traffic Signal Control for Traffic Signal Malfunctions

Authors: Qinchen Yang, Zejun Xie, Hua Wei, Desheng Zhang, Yu Yang

Abstract: Urban traffic is subject to disruptions that cause extended waiting time and safety issues at signalized intersections. While numerous studies have addressed the issue of intelligent traffic systems in the context of various disturbances, traffic signal malfunction, a common real-world occurrence with significant repercussions, has received comparatively limited attention. The primary objective of this research is to mitigate the adverse effects of traffic signal malfunction, such as traffic congestion and collision, by optimizing the control of neighboring functioning signals. To achieve this goal, this paper presents a novel traffic signal control framework (MalLight), which leverages an Influence-aware State Aggregation Module (ISAM) and an Influence-aware Reward Aggregation Module (IRAM) to achieve coordinated control of surrounding traffic signals. To the best of our knowledge, this study pioneers the application of a Reinforcement Learning (RL)-based approach to address the challenges posed by traffic signal malfunction. Empirical investigations conducted on real-world datasets substantiate the superior performance of our proposed methodology over conventional and deep learning-based alternatives in the presence of signal malfunction, with reduction of throughput alleviated by as much as 48.6$\%$.

Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

Comments: Paper accepted to CIKM24 Full Research track

arXiv:2408.09237 [pdf, other]

QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning

Authors: Alex Sanchez-Stern, Abhishek Varghese, Zhanna Kaufman, Dylan Zhang, Talia Ringer, Yuriy Brun

Abstract: Formal verification is a promising method for producing reliable software, but the difficulty of manually writing verification proofs severely limits its utility in practice. Recent methods have automated some proof synthesis by guiding a search through the proof space using a theorem prover. Unfortunately, the theorem prover provides only the crudest estimate of progress, resulting in effectively undirected search. To address this problem, we create QEDCartographer, an automated proof-synthesis tool that combines supervised and reinforcement learning to more effectively explore the proof space. QEDCartographer incorporates the proofs' branching structure, enabling reward-free search and overcoming the sparse reward problem inherent to formal verification. We evaluate QEDCartographer using the CoqGym benchmark of 68.5K theorems from 124 open-source Coq projects. QEDCartographer fully automatically proves 21.4% of the test-set theorems. Previous search-based proof-synthesis tools Tok, Tac, ASTactic, Passport, and Proverbot9001, which rely only on supervised learning, prove 9.6%, 9.8%, 10.9%, 12.5%, and 19.8%, respectively. Diva, which combines 62 tools, proves 19.2%. Comparing to the most effective prior tool, Proverbot9001, QEDCartographer produces 26% shorter proofs 27% faster, on average over the theorems both tools prove. Together, QEDCartographer and non-learning-based CoqHammer prove 31.8% of the theorems, while CoqHammer alone proves 26.6%. Our work demonstrates that reinforcement learning is a fruitful research direction for improving proof-synthesis tools' search mechanisms.

Submitted 28 August, 2024; v1 submitted 17 August, 2024; originally announced August 2024.

Comments: Published in the International Conference on Software Engineering (ICSE) 2025: Alex Sanchez-Stern, Abhishek Varghese, Zhanna Kaufman, Dylan Zhang, Talia Ringer, and Yuriy Brun, QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning, in Proceedings of the 47th International Conference on Software Engineering (ICSE), 2025

arXiv:2408.08826 [pdf, other]

Search for the rare decay $J/ψ\to γD^0+c.c.$ at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (642 additional authors not shown)

Abstract: Using $(10087\pm44)\times10^6J/ψ$ events collected with the BESIII detector, we search for the rare decay $J/ψ\to γD^0+c.c.$ for the first time. No obvious signal is observed and the upper limit on the branching fraction is determined to be ${\cal B}(J/ψ\to γD^{0}+c.c.)< 9.1 \times 10^{-8}$ at 90\% confidence level.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08536 [pdf, other]

Blockchain-Enabled Accountability in Data Supply Chain: A Data Bill of Materials Approach

Authors: Yue Liu, Dawen Zhang, Boming Xia, Julia Anticev, Tunde Adebayo, Zhenchang Xing, Moses Machao

Abstract: In the era of advanced artificial intelligence, highlighted by large-scale generative models like GPT-4, ensuring the traceability, verifiability, and reproducibility of datasets throughout their lifecycle is paramount for research institutions and technology companies. These organisations increasingly rely on vast corpora to train and fine-tune advanced AI models, resulting in intricate data supply chains that demand effective data governance mechanisms. In addition, the challenge intensifies as diverse stakeholders may use assorted tools, often without adequate measures to ensure the accountability of data and the reliability of outcomes. In this study, we adapt the concept of "Software Bill of Materials" into the field of data governance and management to address the above challenges, and introduce "Data Bill of Materials" (DataBOM) to capture the dependency relationship between different datasets and stakeholders by storing specific metadata. We demonstrate a platform architecture for providing blockchain-based DataBOM services, present the interaction protocol for stakeholders, and discuss the minimal requirements for DataBOM metadata. The proposed solution is evaluated in terms of feasibility and performance via case study and quantitative analysis respectively.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08063 [pdf, other]

Constraining Ultralight ALP Dark Matter in Light of Cosmic Birefringence

Authors: Dongdong Zhang, Elisa G. M. Ferreira, Ippei Obata, Toshiya Namikawa

Abstract: Cosmic birefringence, the observed rotation of the polarization plane of the cosmic microwave background (CMB), serves as a compelling probe for parity-violating physics beyond the Standard Model. This study explores the potential of ultralight axion-like particle (ALP) dark matter to explain the observed cosmic birefringence in the CMB. We focus on the previously understudied mass range of $10^{-25}$ eV to $10^{-23}$ eV, where ALPs start to undergo nonlinear clustering in the late universe. Our analysis incorporates recent cosmological constraints and considers the washout effect on CMB polarization. We find that for models with ALP masses $10^{-25}$ eV $\lesssim m_φ\lesssim 10^{-23}$ eV and birefringence arising from late ALP clustering, the upper limit on the ALP-photon coupling constant, imposed by the washout effect, is stringently lower than the coupling required to account for the observed static cosmic birefringence signal. This discrepancy persists regardless of the ALP fraction in dark matter. Furthermore, considering ALPs with masses $m_φ\gtrsim$ $10^{-23}$ eV cannot explain static birefringence due to their rapid field oscillations, our results indicate that, all ALP dark matter candidates capable of nonlinear clustering in the late universe and thus contributing mainly to the rotation angle of polarized photons, are incompatible with explaining the static cosmic birefringence signal observed in Planck and WMAP data.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.08013 [pdf, other]

Adaptive Learning of Consistency and Inconsistency Information for Fake News Detection

Authors: Aohan Li, Jiaxin Chen, Xin Liao, Dengyong Zhang

Abstract: The rapid advancement of social media platforms has significantly reduced the cost of information dissemination, yet it has also led to a proliferation of fake news, posing a threat to societal trust and credibility. Most fake news detection research has focused on integrating text and image information to represent the consistency of multiple modes in news content, while paying less attention to inconsistent information. Besides, existing methods that leveraged inconsistent information often caused one mode to overshadow another, leading to ineffective use of inconsistent clues. To address these issues, we propose an adaptive multi-modal feature fusion network (MFF-Net). Inspired by human judgment processes for determining truth and falsity in news, MFF-Net focuses on inconsistent parts when news content is generally consistent and consistent parts when it is generally inconsistent. Specifically, MFF-Net extracts semantic and global features from images and texts respectively, and learns consistency information between modes through a multiple feature fusion module. To deal with the problem of modal information being easily masked, we design a single modal feature filtering strategy to capture inconsistent information from corresponding modes separately. Finally, similarity scores are calculated based on global features with adaptive adjustments made to achieve weighted fusion of consistent and inconsistent features. Extensive experimental results demonstrate that MFF-Net outperforms state-of-the-art methods across three public news datasets derived from real social media.

Submitted 16 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07891 [pdf, other]

Quantum-inspired Interpretable Deep Learning Architecture for Text Sentiment Analysis

Authors: Bingyu Li, Da Zhang, Zhiyuan Zhao, Junyu Gao, Yuan Yuan

Abstract: Text has become the predominant form of communication on social media, embedding a wealth of emotional nuances. Consequently, the extraction of emotional information from text is of paramount importance. Despite previous research making some progress, existing text sentiment analysis models still face challenges in integrating diverse semantic information and lack interpretability. To address these issues, we propose a quantum-inspired deep learning architecture that combines fundamental principles of quantum mechanics (QM principles) with deep learning models for text sentiment analysis. Specifically, we analyze the commonalities between text representation and QM principles to design a quantum-inspired text representation method and further develop a quantum-inspired text embedding layer. Additionally, we design a feature extraction layer based on long short-term memory (LSTM) networks and self-attention mechanisms (SAMs). Finally, we calculate the text density matrix using the quantum complex numbers principle and apply 2D-convolution neural networks (CNNs) for feature condensation and dimensionality reduction. Through a series of visualization, comparative, and ablation experiments, we demonstrate that our model not only shows significant advantages in accuracy and efficiency compared to previous related models but also achieves a certain level of interpretability by integrating QM principles. Our code is available at QISA.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.07592 [pdf, other]

Multi-periodicity dependency Transformer based on spectrum offset for radio frequency fingerprint identification

Authors: Jing Xiao, Wenrui Ding, Zeqi Shao, Duona Zhang, Yanan Ma, Yufeng Wang, Jian Wang

Abstract: Radio Frequency Fingerprint Identification (RFFI) has emerged as a pivotal task for reliable device authentication. Despite advancements in RFFI methods, background noise and intentional modulation features result in weak energy and subtle differences in the RFF features. These challenges diminish the capability of RFFI methods in feature representation, complicating the effective identification of device identities. This paper proposes a novel Multi-Periodicity Dependency Transformer (MPDFormer) to address these challenges. The MPDFormer employs a spectrum offset-based periodic embedding representation to augment the discrepancy of intrinsic features. We delve into the intricacies of the periodicity-dependency attention mechanism, integrating both inter-period and intra-period attention mechanisms. This mechanism facilitates the extraction of both long and short-range periodicity-dependency features, accentuating the feature distinction whilst concurrently attenuating the perturbations caused by background noise and weak-periodicity features. Empirical results demonstrate MPDFormer's superiority over established baseline methods, achieving a 0.07s inference time on NVIDIA Jetson Orin NX.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.07263 [pdf, other]

Eavesdropping Mobile Apps and Actions through Wireless Traffic in the Open World

Authors: Xiaoguang Yang, Yong Huang, Junli Guo, Dalong Zhang, Qingxian Wang

Abstract: While smartphones and WiFi networks are bringing many positive changes to people's lives, they are susceptible to traffic analysis attacks, which infer users' private information from encrypted traffic. Existing traffic analysis attacks mainly target TCP/IP layers or are limited to the closed-world assumption, where all possible apps and actions have been involved in the model training. To overcome these limitations, we propose MACPrint, a novel system that infers mobile apps and in-app actions based on WiFi MAC layer traffic in the open-world setting. MACPrint first extracts rich statistical and contextual features of encrypted wireless traffic. Then, we develop Label Recorder, an automatic traffic labeling app, to improve labeling accuracy in the training phase. Finally, TCN models with OpenMax functions are used to recognize mobile apps and actions in the open world accurately. To evaluate our system, we collect MAC layer traffic data over 125 hours from more than 40 apps. The experimental results show that MACPrint can achieve an accuracy of over 96% for recognizing apps and actions in the closed-world setting, and obtains an accuracy of over 86% in the open-world setting.

Submitted 13 August, 2024; originally announced August 2024.

Comments: Accepted by International Conference on Intelligent Computing 2024

arXiv:2408.07246 [pdf, other]

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Authors: Junxian Li, Di Zhang, Xunzhi Wang, Zeying Hao, Jingdi Lei, Qian Tan, Cai Zhou, Wei Liu, Yaotian Yang, Xinrui Xiong, Weiyun Wang, Zhe Chen, Wenhai Wang, Wei Li, Shufei Zhang, Mao Su, Wanli Ouyang, Yuqiang Li, Dongzhan Zhou

Abstract: Large Language Models (LLMs) have achieved remarkable success and have been applied across various scientific fields, including chemistry. However, many chemical tasks require the processing of visual information, which cannot be successfully handled by existing chemical LLMs. This brings a growing need for models capable of integrating multimodal information in the chemical domain. In this paper, we introduce ChemVLM, an open-source chemical multimodal large language model specifically designed for chemical applications. ChemVLM is trained on a carefully curated bilingual multimodal dataset that enhances its ability to understand both textual and visual chemical information, including molecular structures, reactions, and chemistry examination questions. We develop three datasets for comprehensive evaluation, tailored to Chemical Optical Character Recognition (OCR), Multimodal Chemical Reasoning (MMCR), and Multimodal Molecule Understanding tasks. We benchmark ChemVLM against a range of open-source and proprietary multimodal large language models on various tasks. Experimental results demonstrate that ChemVLM achieves competitive performance across all evaluated tasks. Our model can be found at https://huggingface.co/AI4Chem/ChemVLM-26B.

Submitted 16 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

Comments: 11 pages, updated version

arXiv:2408.06912 [pdf, ps, other]

New refinements of Narayana polynomials and Motzkin polynomials

Authors: Janet J. W. Dong, Lora R. Du, Kathy Q. Ji, Dax T. X. Zhang

Abstract: Chen, Deutsch and Elizalde introduced a refinement of the Narayana polynomials by distinguishing between old (leftmost child) and young leaves of plane trees. They also provided a refinement of Coker's formula by constructing a bijection. In fact, Coker's formula establishes a connection between the Narayana polynomials and the Motzkin polynomials, which implies the $γ$-positivity of the Narayana polynomials. In this paper, we introduce the polynomial $G_{n}(x_{11},x_{12},x_2;y_{11},y_{12},y_2)$, which further refines the Narayana polynomials by considering leaves of plane trees that have no siblings. We obtain the generating function for $G_n(x_{11},x_{12},x_2;y_{11},y_{12},y_2)$. To achieve further refinement of Coker's formula based on the polynomial $G_n(x_{11},x_{12},x_2;y_{11},y_{12},y_2)$, we consider a refinement $M_n(u_1,u_2,u_3;v_1,v_2)$ of the Motzkin polynomials by classifying the old leaves of a tip-augmented plane tree into three categories and the young leaves into two categories. The generating function for $M_n(u_1,u_2,u_3;v_1,v_2)$ is also established, and the refinement of Coker's formula is immediately derived by combining the generating function for $G_n(x_{11},x_{12},x_2;y_{11},y_{12},y_2)$ and the generating function for $M_n(u_1,u_2,u_3;v_1,v_2)$. We derive several interesting consequences from this refinement of Coker's formula. The method used in this paper is the grammatical approach introduced by Chen. We develop a unified grammatical approach to exploring polynomials associated with the statistics defined on plane trees. As you will see, the derivations of the generating functions for $G_n(x_{11},x_{12},x_2;y_{11},y_{12},y_2)$ and $M_n(u_1,u_2,u_3;v_1,v_2)$ become quite simple once their grammars are established.

Submitted 18 August, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

Comments: 40 pages

arXiv:2408.06848 [pdf, other]

Improving WiFi CSI Fingerprinting with IQ Samples

Authors: Junjie Wang, Yong Huang, Feiyang Zhao, Wenjing Wang, Dalong Zhang, Wei Wang

Abstract: Identity authentication is crucial for ensuring the information security of wireless communication. Radio frequency (RF) fingerprinting techniques provide a promising supplement to cryptography-based authentication approaches but rely on dedicated equipment to capture in-phase and quadrature (IQ) samples, hindering their wide adoption. Recent advances advocate easily obtainable channel state information (CSI) by commercial WiFi devices for lightweight RF fingerprinting, but they mainly focus on eliminating channel interference and cannot address the challenges of coarse granularity and information loss of CSI measurements. To overcome these challenges, we propose CSI2Q, a novel CSI fingerprinting system that achieves comparable performance to IQ-based approaches. Instead of extracting fingerprints directly from raw CSI measurements, CSI2Q first transforms them into time-domain signals that share the same feature space with IQ samples. Then, the distinct advantages of an IQ fingerprinting model in feature extraction are transferred to its CSI counterpart via an auxiliary training strategy. Finally, the trained CSI fingerprinting model is used to decide which device the sample under test comes from. We evaluate CSI2Q on both synthetic and real CSI datasets. On the synthetic dataset, our system can improve the recognition accuracy from 76% to 91%. On the real dataset, CSI2Q boosts the accuracy from 67% to 82%.

Submitted 13 August, 2024; originally announced August 2024.

Comments: Accepted by International Conference on Intelligent Computing 2024

arXiv:2408.06751 [pdf]

Polarization entanglement enabled by orthogonally stacked van der Waals NbOCl2 crystals

Authors: Qiangbing Guo, Yun-Kun Wu, Di Zhang, Qiuhong Zhang, Guang-Can Guo, Andrea Alù, Xi-Feng Ren, Cheng-Wei Qiu

Abstract: Polarization entanglement holds significant importance for photonic quantum technologies. Recently emerging subwavelength nonlinear quantum light sources, e.g., GaP and LiNbO3 thin films, benefiting from the relaxed phase-matching constraints and volume confinement, have shown intriguing properties, such as high-dimensional hyperentanglement and robust entanglement anti-degradation. Van der Waals (vdW) NbOCl2 crystal, renowned for its superior optical nonlinearities, has emerged as one of the ideal candidates for ultrathin quantum light sources [Nature 613, 53 (2023)]. However, polarization entanglement is inaccessible in NbOCl2 crystal due to its unfavorable nonlinear susceptibility tensor. Here, by leveraging the twist-stacking degree of freedom inherent in vdW systems, we showcase the preparation of tunable polarization entanglement and quantum Bell states. Our work not only provides a new and tunable polarization-entangled vdW photon-pair source, but also introduces a new knob in engineering the entanglement state of quantum light at the nanoscale.

Submitted 13 August, 2024; originally announced August 2024.

Comments: 16 pages, 4 figures

arXiv:2408.06677 [pdf, other]

Search for $η_c(2S)\toωω$ and $ωφ$ decays and measurements of $χ_{cJ}\toωω$ and $ωφ$ in $ψ(2S)$ radiative processes

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

Abstract: Using $(2712\pm 14)$ $\times$ 10$^{6}$ $ψ(2S)$ events collected with the BESIII detector at the BEPCII collider, we search for the decays $η_{c}(2S)\toωω$ and $η_{c}(2S)\toωφ$ via the process $ψ(2S)\toγη_{c}(2S)$. Evidence of $η_{c}(2S)\toωω$ is found with a statistical significance of $3.2σ$. The branching fraction is measured to be $\mathcal{B}(η_{c}(2S)\toωω)=(5.65\pm3.77(\rm stat.)\pm5.32(\rm syst.))\times10^{-4}$. No statistically significant signal is observed for the decay $η_{c}(2S)\toωφ$. The upper limit of the branching fraction at the 90\% confidence level is determined to be $\mathcal{B}(ψ(2S)\toγη_{c}(2S),η_{c}(2S)\toωφ)<2.24\times 10^{-7}$. We also update the branching fractions of $χ_{cJ}\to ωω$ and $χ_{cJ}\toωφ$ decays via the $ψ(2S)\toγχ_{cJ}$ transition. The branching fractions are determined to be $\mathcal{B}(χ_{c0}\toωω)=(10.63\pm0.11\pm0.46)\times 10^{-4}$, $\mathcal{B}(χ_{c1}\toωω)=(6.39\pm0.07\pm0.29)\times 10^{-4}$, $\mathcal{B}(χ_{c2}\toωω)=(8.50\pm0.08\pm0.38)\times 10^{-4}$, $\mathcal{B}(χ_{c0}\toωφ)=(1.18\pm0.03\pm0.05)\times 10^{-4}$, $\mathcal{B}(χ_{c1}\toωφ)=(2.03\pm0.15\pm0.12)\times 10^{-5}$, and $\mathcal{B}(χ_{c2}\toωφ)=(9.37\pm1.07\pm0.59)\times 10^{-6}$, where the first uncertainties are statistical and the second are systematic.

Submitted 13 August, 2024; originally announced August 2024.

arXiv:2408.06614 [pdf, other]

ViMo: Generating Motions from Casual Videos

Authors: Liangdong Qiu, Chengxing Yu, Yanran Li, Zhao Wang, Haibin Huang, Chongyang Ma, Di Zhang, Pengfei Wan, Xiaoguang Han

Abstract: Although humans have the innate ability to imagine multiple possible actions from videos, it remains an extraordinary challenge for computers due to the intricate camera movements and montages. Most existing motion generation methods predominantly rely on manually collected motion datasets, usually tediously sourced from motion capture (Mocap) systems or Multi-View cameras, unavoidably resulting in a limited size that severely undermines their generalizability. Inspired by recent advances in diffusion models, we probe a simple and effective way to capture motions from videos and propose a novel Video-to-Motion-Generation framework (ViMo) which could leverage the immense trove of untapped video content to produce abundant and diverse 3D human motions. Distinct from prior work, our videos could be more casual, including complicated camera movements and occlusions. Striking experimental results demonstrate the proposed model could generate natural motions even for videos where rapid movements, varying perspectives, or frequent occlusions might exist. We also show this work could enable three important downstream applications, such as generating dancing motions according to arbitrary music and source video style. Extensive experimental results prove that our model offers an effective and scalable way to generate diverse and realistic motions. Code and demos will be public soon.

Submitted 12 August, 2024; originally announced August 2024.

MSC Class: 68Txx

arXiv:2408.05621 [pdf]

doi 10.3390/ma16237468

CMOS-Compatible Ultrathin Superconducting NbN Thin Films Deposited by Reactive Ion Sputtering on 300 mm Si Wafer

Authors: Zihao Yang, Xiucheng Wei, Pinku Roy, Di Zhang, Ping Lu, Samyak Dhole, Haiyan Wang, Nicholas Cucciniello, Nag Patibandla, Zhebo Chen, Hao Zeng, Quanxi Jia, Mingwei Zhu

Abstract: We report a milestone in achieving large-scale, ultrathin (~5 nm) superconducting NbN thin films on 300 mm Si wafers using a high-volume manufacturing (HVM) industrial physical vapor deposition (PVD) system. The NbN thin films possess remarkable structural uniformity and consistently high superconducting quality across the entire 300 mm Si wafer, by incorporating an AlN buffer layer. High-resolution X-ray diffraction and transmission electron microscopy analyses unveiled enhanced crystallinity of (111)-oriented δ-phase NbN with the AlN buffer layer. Notably, NbN films deposited on AlN-buffered Si substrates exhibited a significantly elevated superconducting critical temperature (~2 K higher for the 10 nm NbN) and a higher upper critical magnetic field or Hc2 (34.06 T boost in Hc2 for the 50 nm NbN) in comparison with those without AlN. These findings present a promising pathway for the integration of quantum-grade superconducting NbN films with the existing 300 mm CMOS Si platform for quantum information applications.

Submitted 10 August, 2024; originally announced August 2024.

Journal ref: Materials 2023, 16, 7468

arXiv:2408.05543 [pdf, other]

PixelFade: Privacy-preserving Person Re-identification with Noise-guided Progressive Replacement

Authors: Delong Zhang, Yi-Xing Peng, Xiao-Ming Wu, Ancong Wu, Wei-Shi Zheng

Abstract: Online person re-identification services face privacy breaches from potential data leakage and recovery attacks, exposing cloud-stored images to malicious attackers and triggering public concern. The privacy protection of pedestrian images is crucial. Previous privacy-preserving person re-identification methods are unable to resist recovery attacks and compromise accuracy. In this paper, we propose an iterative method (PixelFade) to optimize pedestrian images into noise-like images to resist recovery attacks. We first give an in-depth study of protected images from previous privacy methods, which reveals that the chaos of protected images can disrupt the learning of recovery models. Accordingly, we propose a Noise-guided Objective Function with the feature constraints of a specific authorization model, optimizing pedestrian images to normally distributed noise images while preserving their original identity information as per the authorization model. To solve the above non-convex optimization problem, we propose a heuristic optimization algorithm that alternately performs the Constraint Operation and the Partial Replacement Operation. This strategy not only ensures that original pixels are replaced with noise to protect privacy, but also guides the images towards an improved optimization direction to effectively preserve discriminative features. Extensive experiments demonstrate that our PixelFade outperforms previous methods in resisting recovery attacks and Re-ID performance. The code is available at https://github.com/iSEE-Laboratory/PixelFade.

Submitted 10 August, 2024; originally announced August 2024.

Comments: accepted by ACMMM24

arXiv:2408.05517 [pdf, other]

SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning

Authors: Yuze Zhao, Jintao Huang, Jinghan Hu, Xingjun Wang, Yunlin Mao, Daoze Zhang, Zeyinzi Jiang, Zhikai Wu, Baole Ai, Ang Wang, Wenmeng Zhou, Yingda Chen

Abstract: Recent developments in Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) have leveraged Attention-based Transformer architectures and achieved superior performance and generalization capabilities. They have since covered extensive areas of traditional learning tasks. For instance, text-based tasks such as text-classification and sequence-labeling, as well as multi-modal tasks like Visual Question Answering (VQA) and Optical Character Recognition (OCR), which were previously addressed using different models, can now be tackled based on one foundation model. Consequently, the training and lightweight fine-tuning of LLMs and MLLMs, especially those based on Transformer architecture, has become particularly important. In recognition of these overwhelming needs, we develop SWIFT, a customizable one-stop infrastructure for large models. With support for $300+$ LLMs and $50+$ MLLMs, SWIFT stands as the open-source framework that provides the most comprehensive support for fine-tuning large models. In particular, it is the first training framework that provides systematic support for MLLMs. In addition to the core functionalities of fine-tuning, SWIFT also integrates post-training processes such as inference, evaluation, and model quantization, to facilitate fast adoption of large models in various application scenarios. With a systematic integration of various training techniques, SWIFT offers helpful utilities such as benchmark comparisons among different training techniques for large models. For fine-tuning models specialized in agent framework, we show that notable improvements on the ToolBench leaderboard can be achieved by training with a customized dataset on SWIFT, with an increase of 5.2%-21.8% in the Act.EM metric over various baseline models, a reduction in hallucination by 1.6%-14.1%, and an average performance improvement of 8%-17%.

Submitted 18 August, 2024; v1 submitted 10 August, 2024; originally announced August 2024.

arXiv:2408.05450 [pdf, ps, other]

Existence and non-uniqueness of probabilistically strong solutions to 3D stochastic magnetohydrodynamic equations

Authors: Wenping Cao, Yachun Li, Deng Zhang

Abstract: We are concerned with the 3D stochastic magnetohydrodynamic (MHD) equations driven by additive noise on the torus. For arbitrarily prescribed divergence-free initial data in $L^{2}_x$, we construct infinitely many probabilistically strong and analytically weak solutions in the class $L^{r}_ΩL_{t}^γW_{x}^{s,p}$, where $r>1$ and $(s, γ, p)$ lie in a supercritical regime with respect to the Ladyžhenskaya-Prodi-Serrin (LPS) criteria. In particular, we get the non-uniqueness of probabilistically strong solutions, which is sharp at one LPS endpoint space. Our proof utilizes intermittent flows which are different from those of Navier-Stokes equations and derives the non-uniqueness even in the high viscous and resistive regime beyond the Lions exponent 5/4. Furthermore, we prove that as the noise intensity tends to zero, the accumulation points of stochastic MHD solutions contain all deterministic solutions to the MHD equations, which include the recently constructed solutions in [28, 29] to deterministic MHD systems.

Submitted 10 August, 2024; originally announced August 2024.

arXiv:2408.05194 [pdf]

The Economic Analysis of the Common Pool Method through the HARA Utility Functions

Authors: Mu Lin, Di Zhang, Ben Chen, Hang Zheng

Abstract: The water market is a contemporary marketplace for water trading and is deemed to be one of the most efficient instruments to improve social welfare. In modern water markets, the two widely used trading systems are an improved pair-wise trading, and a 'smart market' or common pool method. In comparison with the economic model, this paper constructs a conceptual mathematical model through the HARA utility functions. Mirroring concepts such as Nash equilibrium, Pareto optimality and stable matching in economics, three significant propositions are obtained which illustrate the advantages of the common pool method compared with the improved pair-wise trading.

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2408.05134 [pdf, other]

Observation of muonic Dalitz decays of $χ_{b}$ mesons and precise spectroscopy of hidden-beauty states

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1114 additional authors not shown)

Abstract: The decays of the $χ_{b1}(1P)$, $χ_{b2}(1P)$, $χ_{b1}(2P)$ and $χ_{b2}(2P)$ mesons into the $Υ(1S)μ^+μ^-$ final state are observed with a high significance using proton-proton collision data collected with the LHCb detector and corresponding to an integrated luminosity of 9 fb$^{-1}$. The newly observed decays together with the $Υ(2S)\rightarrow Υ(1S)π^+π^-$ and $Υ(3S)\rightarrow Υ(2S)π^+π^-$ decay modes are used for precision measurements of the mass and mass splittings for the hidden-beauty states.

Submitted 9 August, 2024; originally announced August 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-025.html

Report number: LHCb-PAPER-2024-025,CERN-EP-2024-207

arXiv:2408.04767 [pdf]

Data-Driven Pixel Control: Challenges and Prospects

Authors: Saurabh Farkya, Zachary Alan Daniels, Aswin Raghavan, Gooitzen van der Wal, Michael Isnardi, Michael Piacentino, David Zhang

Abstract: Recent advancements in sensors have led to high resolution and high data throughput at the pixel level. Simultaneously, the adoption of increasingly large (deep) neural networks (NNs) has led to significant progress in computer vision. Currently, visual intelligence comes at increasingly high computational complexity, energy, and latency. We study a data-driven system that combines dynamic sensing at the pixel level with computer vision analytics at the video level and propose a feedback control loop to minimize data movement between the sensor front-end and computational back-end without compromising detection and tracking precision. Our contributions are threefold: (1) We introduce anticipatory attention and show that it leads to high precision prediction with sparse activation of pixels; (2) Leveraging the feedback control, we show that the dimensionality of learned feature vectors can be significantly reduced with increased sparsity; and (3) We emulate analog design choices (such as varying RGB or Bayer pixel format and analog noise) and study their impact on the key metrics of the data-driven system. Comparative analysis with traditional pixel and deep learning models shows significant performance enhancements. Our system achieves a 10X reduction in bandwidth and a 15-30X improvement in Energy-Delay Product (EDP) when activating only 30% of pixels, with a minor reduction in object detection and tracking precision. Based on analog emulation, our system can achieve a throughput of 205 megapixels/sec (MP/s) with a power consumption of only 110 mW per MP, i.e., a theoretical improvement of ~30X in EDP.

Submitted 8 August, 2024; originally announced August 2024.

Comments: Accepted to the Conference on Dynamic Data-Driven Applications Systems (DDDAS2024)

arXiv:2408.04686 [pdf, other]

Multi-Turn Context Jailbreak Attack on Large Language Models From First Principles

Authors: Xiongtao Sun, Deyue Zhang, Dongdong Yang, Quanchen Zou, Hui Li

Abstract: Large language models (LLMs) have significantly enhanced the performance of numerous applications, from intelligent conversations to text generation. However, their inherent security vulnerabilities have become an increasingly significant challenge, especially with respect to jailbreak attacks. Attackers can circumvent the security mechanisms of these LLMs, breaching security constraints and causing harmful outputs. Focusing on multi-turn semantic jailbreak attacks, we observe that existing methods lack specific considerations for the role of multi-turn dialogues in attack strategies, leading to semantic deviations during continuous interactions. Therefore, in this paper, we establish a theoretical foundation for multi-turn attacks by considering their support in jailbreak attacks, and based on this, propose a context-based contextual fusion black-box jailbreak attack method, named Context Fusion Attack (CFA). This approach involves filtering and extracting key terms from the target, constructing contextual scenarios around these terms, dynamically integrating the target into the scenarios, replacing malicious key terms within the target, and thereby concealing the direct malicious intent. Through comparisons on various mainstream LLMs and red team datasets, we have demonstrated CFA's superior success rate, divergence, and harmfulness compared to other multi-turn attack strategies, particularly showcasing significant advantages on Llama3 and GPT-4.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.04422 [pdf, other]

Analysis of the dynamics of the decay $D^{+}\to K_{S}^{0} π^{0} e^{+}ν_{e}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

Abstract: The branching fraction of $D^+\to K_{S}^{0} π^{0}e^+ν_e$ is measured for the first time using $7.93~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$~GeV with the BESIII detector operating at the BEPCII collider, and is determined to be ${\mathcal B}$($D^+\to K_S^0π^0e^+ν_e$) = $(0.881~\pm~0.017_{\rm stat.}~\pm~0.016_{\rm syst.})$\%. Based on an analysis of the $D^+\to K_S^0π^0e^+ν_e$ decay dynamics, we observe the $S\text{-}{\rm wave}$ and $P$-wave components with fractions of $f_{S\text{-}{\rm wave}}$ = $(6.13~\pm~0.27_{\rm stat.}~\pm ~0.30_{\rm syst.})\%$ and $f_{\bar K^{*}(892)^0}$ = $(93.88~\pm~0.27_{\rm stat.}~\pm~0.29_{\rm syst.})$\%, respectively. From these results, we obtain the branching fractions ${\mathcal B}$($D^+\to (K_S^0π^0)_{S\text{-}{\rm wave}}~e^+ν_e$) = $(5.41~\pm~0.35_{\rm stat.}~\pm~0.37_{\rm syst.})\times10^{-4}$ and ${\mathcal B}$($D^+\to \bar K^{*}(892)^0e^+ν_e$) = $(4.97~\pm~0.11_{\rm stat.}~\pm~0.12_{\rm syst.})$\%. In addition, the hadronic form-factor ratios of $D^{+} \to \bar {K}^{*}(892)^0e^+ν_e$ at $q^2=0$, assuming a single-pole dominance parameterization, are determined to be $r_V=\frac{V(0)}{A_1(0)}= 1.43~\pm~0.07_{\rm stat.}~\pm~0.03_{\rm syst.}$ and $r_2=\frac{A_2(0)}{A_1(0)}=0.72~\pm~0.06_{\rm stat.}~\pm~0.02_{\rm syst.}$.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.04259 [pdf, other]

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering

Authors: Ziyuan Zhuang, Zhiyang Zhang, Sitao Cheng, Fangkai Yang, Jia Liu, Shujian Huang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

Abstract: Retrieval-augmented generation (RAG) methods encounter difficulties when addressing complex questions like multi-hop queries. While iterative retrieval methods improve performance by gathering additional information, current approaches often rely on multiple calls of large language models (LLMs). In this paper, we introduce EfficientRAG, an efficient retriever for multi-hop question answering. EfficientRAG iteratively generates new queries without the need for LLM calls at each iteration and filters out irrelevant information. Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.

Submitted 8 August, 2024; originally announced August 2024.

Comments: 20 pages, 4 figures

Showing 1–50 of 3,808 results for author: Zhang, D