Search | arXiv e-print repository

Search for $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0h_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (653 additional authors not shown)

Abstract: Using $(2712.4 \pm 14.3) \times 10^6~ψ$(3686) events collected with the BESIII detector operating at the BEPCII collider, we search for the hadronic transition $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0 h_c$. No significant signal is observed. We set the most stringent upper limits to date on the branching fractions $\mathcal{B}(ψ(3686)\to π^0 h_c)\times\mathcal{B}(h_c\toπ^+π^-J/ψ)$ and $\mathcal{B}(h_c \to π^+π^-J/ψ)$ at the 90$\%$ confidence level, which are determined to be $6.7\times 10^{-7}$ and $9.4 \times10^{-4}$, respectively.

Submitted 30 August, 2024; originally announced August 2024.

arXiv:2408.13005 [pdf, other]

EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation

Authors: Cong Wang, Jiaxi Gu, Panwen Hu, Haoyu Zhao, Yuanfan Guo, Jianhua Han, Hang Xu, Xiaodan Liang

Abstract: Following the advancements in text-guided image generation technology exemplified by Stable Diffusion, video generation is gaining increased attention in the academic community. However, relying solely on text guidance for video generation has serious limitations, as videos contain much richer content than images, especially in terms of motion. This information can hardly be adequately described with plain text. Fortunately, in computer vision, various visual representations can serve as additional control signals to guide generation. With the help of these signals, video generation can be controlled in finer detail, allowing for greater flexibility for different applications. Integrating various controls, however, is nontrivial. In this paper, we propose a universal framework called EasyControl. By propagating and injecting condition features through condition adapters, our method enables users to control video generation with a single condition map. With our framework, various conditions including raw pixels, depth, HED, etc., can be integrated into different Unet-based pre-trained video diffusion models at a low practical cost. We conduct comprehensive experiments on public datasets, and both quantitative and qualitative results indicate that our method outperforms state-of-the-art methods. EasyControl significantly improves various evaluation metrics across multiple validation datasets compared to previous works. Specifically, for the sketch-to-video generation task, EasyControl achieves an improvement of 152.0 on FVD and 19.9 on IS, respectively, in UCF101 compared with VideoComposer. For fidelity, our model demonstrates powerful image retention ability, resulting in high FVD and IS in UCF101 and MSR-VTT compared to other image-to-video models.

Submitted 23 August, 2024; originally announced August 2024.

arXiv:2408.12171 [pdf, other]

Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey

Authors: Haixin Wang, Yadi Cao, Zijie Huang, Yuxuan Liu, Peiyan Hu, Xiao Luo, Zezheng Song, Wanjia Zhao, Jilin Liu, Jinan Sun, Shikun Zhang, Long Wei, Yue Wang, Tailin Wu, Zhi-Ming Ma, Yizhou Sun

Abstract: This paper explores the recent advancements in enhancing Computational Fluid Dynamics (CFD) tasks through Machine Learning (ML) techniques. We begin by introducing fundamental concepts, traditional methods, and benchmark datasets, then examine the various roles ML plays in improving CFD. The literature systematically reviews papers in recent five years and introduces a novel classification for forward modeling: Data-driven Surrogates, Physics-Informed Surrogates, and ML-assisted Numerical Solutions. Furthermore, we also review the latest ML methods in inverse design and control, offering a novel classification and providing an in-depth discussion. Then we highlight real-world applications of ML for CFD in critical scientific and engineering disciplines, including aerodynamics, combustion, atmosphere & ocean science, biology fluid, plasma, symbolic regression, and reduced order modeling. Besides, we identify key challenges and advocate for future research directions to address these challenges, such as multi-scale representation, physical knowledge encoding, scientific foundation model and automatic scientific discovery. This review serves as a guide for the rapidly expanding ML for CFD community, aiming to inspire insights for future advancements. We draw the conclusion that ML is poised to significantly transform CFD research by enhancing simulation accuracy, reducing computational time, and enabling more complex analyses of fluid dynamics. The paper resources can be viewed at https://github.com/WillDreamer/Awesome-AI4CFD.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 22 pages, 6 figures

arXiv:2408.11746 [pdf, other]

Mixed Sparsity Training: Achieving 4$\times$ FLOP Reduction for Transformer Pretraining

Authors: Pihe Hu, Shaolong Li, Longbo Huang

Abstract: Large language models (LLMs) have made significant strides in complex tasks, yet their widespread adoption is impeded by substantial computational demands. With hundreds of billions of parameters, transformer-based LLMs necessitate months of pretraining across a high-end GPU cluster. However, this paper reveals a compelling finding: transformers exhibit considerable redundancy in pretraining computations, which motivates our proposed solution, Mixed Sparsity Training (MST), an efficient pretraining method that can reduce about $75\%$ of Floating Point Operations (FLOPs) while maintaining performance. MST integrates dynamic sparse training (DST) with Sparsity Variation (SV) and Hybrid Sparse Attention (HSA) during pretraining, involving three distinct phases: warm-up, ultra-sparsification, and restoration. The warm-up phase transforms the dense model into a sparse one, and the restoration phase reinstates connections. Throughout these phases, the model is trained with a dynamically evolving sparse topology and an HSA mechanism to maintain performance and minimize training FLOPs concurrently. Our experiment on GPT-2 showcases a FLOP reduction of $4\times$ without compromising performance.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11444 [pdf, other]

A Practical Trigger-Free Backdoor Attack on Neural Networks

Authors: Jiahao Wang, Xianglong Zhang, Xiuzhen Cheng, Pengfei Hu, Guoming Zhang

Abstract: Backdoor attacks on deep neural networks have emerged as significant security threats, especially as DNNs are increasingly deployed in security-critical applications. However, most existing works assume that the attacker has access to the original training data. This limitation restricts the practicality of launching such attacks in real-world scenarios. Additionally, using a specified trigger to activate the injected backdoor compromises the stealthiness of the attacks. To address these concerns, we propose a trigger-free backdoor attack that does not require access to any training data. Specifically, we design a novel fine-tuning approach that incorporates the concept of malicious data into the concept of the attacker-specified class, resulting in the misclassification of trigger-free malicious data into the attacker-specified class. Furthermore, instead of relying on training data to preserve the model's knowledge, we employ knowledge distillation methods to maintain the performance of the infected model on benign samples, and introduce a parameter importance evaluation mechanism based on elastic weight constraints to facilitate the fine-tuning of the infected model. The effectiveness, practicality, and stealthiness of the proposed attack are comprehensively evaluated on three real-world datasets. Furthermore, we explore the potential for enhancing the attack through the use of auxiliary datasets and model inversion.

Submitted 21 August, 2024; originally announced August 2024.

Comments: 12 pages, 10 figures

arXiv:2408.10053 [pdf, other]

Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory

Authors: Haoran Li, Wei Fan, Yulin Chen, Jiayang Cheng, Tianshu Chu, Xuebing Zhou, Peizhao Hu, Yangqiu Song

Abstract: Privacy research has attracted wide attention as individuals worry that their private data can be easily leaked during interactions with smart devices, social platforms, and AI applications. Computer science researchers, on the other hand, commonly study privacy issues through privacy attacks and defenses on segmented fields. Privacy research is conducted on various sub-fields, including Computer Vision (CV), Natural Language Processing (NLP), and Computer Networks. Within each field, privacy has its own formulation. Though pioneering works on attacks and defenses reveal sensitive privacy issues, they are narrowly trapped and cannot fully cover people's actual privacy concerns. Consequently, general and human-centric privacy research remains rather unexplored. In this paper, we formulate the privacy issue as a reasoning problem rather than simple pattern matching. We ground on the Contextual Integrity (CI) theory which posits that people's perceptions of privacy are highly correlated with the corresponding social context. Based on such an assumption, we develop the first comprehensive checklist that covers social identities, private attributes, and existing privacy regulations. Unlike prior works on CI that either cover limited expert annotated norms or model incomplete social context, our proposed privacy checklist uses the whole Health Insurance Portability and Accountability Act of 1996 (HIPAA) as an example, to show that we can resort to large language models (LLMs) to completely cover the HIPAA's regulations. Additionally, our checklist also gathers expert annotations across multiple ontologies to determine private information including but not limited to personally identifiable information (PII). We use our preliminary results on the HIPAA to shed light on future context-centric privacy research to cover more privacy regulations, social norms and standards.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.08826 [pdf, other]

Search for the rare decay $J/ψ\to γD^0+c.c.$ at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (642 additional authors not shown)

Abstract: Using $(10087\pm44)\times10^6J/ψ$ events collected with the BESIII detector, we search for the rare decay $J/ψ\to γD^0+c.c.$ for the first time. No obvious signal is observed and the upper limit on the branching fraction is determined to be ${\cal B}(J/ψ\to γD^{0}+c.c.)< 9.1 \times 10^{-8}$ at 90\% confidence level.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.07644 [pdf, other]

doi 10.13140/RG.2.2.24505.17769

SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning

Authors: Jianye Xu, Pan Hu, Bassam Alrifaee

Abstract: This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL

Submitted 14 August, 2024; originally announced August 2024.

Comments: 8 pages, 5 figures, accepted for presentation at the IEEE International Conference on Intelligent Transportation Systems (ITSC) 2024

arXiv:2408.07374 [pdf]

Coupling Between Local and Global Oscillations in Palladium-Catalysed Methane Oxidation

Authors: Yuxiong Hu, Jianyu Hu, Mengzhao Sun, Aowen Li, Shucheng Shi, P. J. Hu, Wu Zhou, Marc-Georg Willinger, Dan Zhou, Zhi Liu, Xi Liu, Wei-Xue Li, Zhu-Jun Wang

Abstract: The interplay between order and disorder is crucial across various fields, especially in understanding oscillatory phenomena. Periodic oscillations are frequently observed in heterogeneous catalysis, yet their underlying mechanisms need deeper exploration. Here, we investigate how periodic oscillations arise during methane oxidation catalysed by palladium nanoparticles (Pd NPs), utilizing a suite of complementary operando techniques across various spatial scales. We found that reaction intensity and collective dynamic modes can be tuned by the reactant gas-flow rate. At lower gas-flow rates, we observed periodic facet reconstruction of Pd NPs correlated with repeated bubbling behaviour at the Pd/PdO interface, without evident global oscillatory responses. Conversely, at higher gas-flow rates, Pd NPs undergo chaotic transformations between metallic and oxidized states, resulting in overall oscillation. Integrating our observations at different gas-flow rates, we attributed the emergence of global oscillation to thermal coupling regulated by gas flow and connected local and global dynamics through a weak synchronization mechanism. This work demonstrates the correlations between open surfaces and interfaces, chaos and regularity, and dissipative processes and coupling behaviour. Our findings offer critical insights into the complexity behind catalytic oscillations and provide guidance for modulating oscillatory behaviours in catalytic processes, with significant implications for both science and industry.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.05592 [pdf, other]

doi 10.1109/SANER60148.2024.00048

SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations

Authors: Andrea Tonon, Bora Caglayan, MingXue Wang, Peng Hu, Fei Shen, Puchao Zhang

Abstract: In IT system operations, shell commands are common command line tools used by site reliability engineers (SREs) for daily tasks, such as system configuration, package deployment, and performance optimization. The efficiency in their execution has a crucial business impact since shell commands very often aim to execute critical operations, such as the resolution of system faults. However, many shell commands involve long parameters that make them hard to remember and type. Additionally, the experience and knowledge of SREs using these commands are almost always not preserved. In this work, we propose SHREC, a SRE behaviour knowledge graph model for shell command recommendations. We model the SRE shell behaviour knowledge as a knowledge graph and propose a strategy to directly extract such knowledge from SRE historical shell operations. The knowledge graph is then used to provide shell command recommendations in real-time to improve the SRE operation efficiency. Our empirical study based on real shell commands executed in our company demonstrates that SHREC can improve the SRE operation efficiency, allowing SRE knowledge to be shared and re-utilized.

Submitted 10 August, 2024; originally announced August 2024.

Comments: Accepted at IEEE SANER 2024

Journal ref: Proceedings of the 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 2024. p. 406-416

arXiv:2408.04859 [pdf]

doi 10.1103/PhysRevPhysEducRes.20.020108

Investigating and improving student understanding of the basics of quantum computing

Authors: Peter Hu, Yangqiuting Li, Chandralekha Singh

Abstract: Quantum information science and engineering (QISE) is a rapidly developing field that leverages the skills of experts from many disciplines to utilize the potential of quantum systems in a variety of applications. It requires talent from a wide variety of traditional fields, including physics, engineering, chemistry, and computer science, to name a few. To prepare students for such opportunities, it is important to give them a strong foundation in the basics of QISE, in which quantum computing plays a central role. In this study, we discuss the development, validation, and evaluation of a QuILT, or Quantum Interactive Learning Tutorial, on the basics and applications of quantum computing. These include an overview of key quantum mechanical concepts relevant for quantum computation (including ways a quantum computer is different from a classical computer), properties of single- and multi-qubit systems, and the basics of single-qubit quantum gates. The tutorial uses guided inquiry-based teaching-learning sequences. Its development and validation involved conducting cognitive task analysis from both expert and student perspectives and using common student difficulties as a guide. The inquiry-based learning sequences in the tutorial provide scaffolding support to help students develop a functional understanding. The final version of the validated tutorial was implemented in two distinct courses offered by the physics department with slightly different student populations and broader course goals. Students' understanding was evaluated after traditional lecture-based instruction on the requisite concepts, and again after engaging with the tutorial. We analyze and discuss their improvement in performance on concepts covered in the tutorial.

Submitted 9 August, 2024; originally announced August 2024.

Comments: 39 pages

Journal ref: Physical Review Physics Education Research 20, 020108 (2024)

arXiv:2408.04791 [pdf, other]

A holographic model of magnetohydrodynamics with fortuitous SO(3) symmetry

Authors: Yanqi Wang, Peng-Ju Hu, Yi Pang

Abstract: We study magnetohydrodynamics using holography. The gravity model is closely related to the STU supergravity in five dimensions and admits an analytical black brane solution carrying the conserved charge dual to the magnetic 1-form symmetry of the magnetohydrodynamic system. The black brane solution features a fortuitous SO(3) symmetry, providing a new symmetry principle for describing the magnetohydrodynamics. Since the bulk theory contains multiple 2-form gauge fields, the resistivity becomes matrix-valued. We find that the antisymmetric part of the resistivity matrix exhibits novel features depending on the UV cut-off of the theory. We also compute the shear and bulk viscosities and find that the bulk viscosity is proportional to the shear viscosity. Remarkably, the proportionality constant is exactly what is required for conformality, despite the zeroth-order energy-momentum tensor not being trace-free.

Submitted 17 August, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

Comments: 24 pages, 5 figures

arXiv:2408.04737 [pdf, other]

Quantifying the Corpus Bias Problem in Automatic Music Transcription Systems

Authors: Lukáš Samuel Marták, Patricia Hu, Gerhard Widmer

Abstract: Automatic Music Transcription (AMT) is the task of recognizing notes in audio recordings of music. The State-of-the-Art (SotA) benchmarks have been dominated by deep learning systems. Due to the scarcity of high quality data, they are usually trained and evaluated exclusively or predominantly on classical piano music. Unfortunately, that hinders our ability to understand how they generalize to other music. Previous works have revealed several aspects of memorization and overfitting in these systems. We identify two primary sources of distribution shift: the music, and the sound. Complementing recent results on the sound axis (i.e. acoustics, timbre), we investigate the musical one (i.e. note combinations, dynamics, genre). We evaluate the performance of several SotA AMT systems on two new experimental test sets which we carefully construct to emulate different levels of musical distribution shift. Our results reveal a stark performance gap, shedding further light on the Corpus Bias problem, and the extent to which it continues to trouble these systems.

Submitted 8 August, 2024; originally announced August 2024.

Comments: 2 pages, 1 figure, presented in the 1st International Workshop on Sound Signal Processing Applications (IWSSPA) 2024

arXiv:2408.03124 [pdf, other]

Closed-loop Diffusion Control of Complex Physical Systems

Authors: Long Wei, Haodong Feng, Peiyan Hu, Tao Zhang, Yuchen Yang, Xiang Zheng, Ruiqi Feng, Dixia Fan, Tailin Wu

Abstract: The control problems of complex physical systems have wide applications in science and engineering. Several previous works have demonstrated that generative control methods based on diffusion models have significant advantages for solving these problems. However, existing generative control methods face challenges in handling closed-loop control, which is an inherent constraint for effective control of complex physical systems. In this paper, we propose a Closed-Loop Diffusion method for Physical systems Control (CL-DiffPhyCon). By adopting an asynchronous denoising schedule for different time steps, CL-DiffPhyCon generates control signals conditioned on real-time feedback from the environment. Thus, CL-DiffPhyCon is able to speed up diffusion control methods in a closed-loop framework. We evaluate CL-DiffPhyCon on the 1D Burgers' equation control and 2D incompressible fluid control tasks. The results demonstrate that CL-DiffPhyCon achieves notable control performance with significant sampling acceleration.

Submitted 31 July, 2024; originally announced August 2024.

arXiv:2407.18406

A form of refined Roth's theorem and its application to the $abc$-conjecture

Authors: Pei-Chu Hu, Bao Qin Li

Abstract: In this paper, we give a form of refined Roth's theorem. As an application, we prove a special case of the $abc$-conjecture.

Submitted 1 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

Comments: There is an error in Theorem 1.6

MSC Class: 11A41; 11E95

arXiv:2407.16599 [pdf, other]

The $\mathbb{Z}/p$-equivariant spectrum $BP\mathbb{R}$ for an odd prime $p$

Authors: Po Hu, Igor Kriz, Petr Somberg, Foling Zou

Abstract: In the present paper, we construct a $\mathbb{Z}/p$-equivariant analog of the $\mathbb{Z}/2$-equivariant spectrum $BP\mathbb{R}$ previously constructed by Hu and Kriz. We prove that this spectrum has some of the properties conjectured by Hill, Hopkins, and Ravenel. Our main construction method is a $\mathbb{Z}/p$-equivariant analog of the Brown-Peterson tower of $BP$, based on a previous description of the $\mathbb{Z}/p$-equivariant Steenrod algebra with constant coefficients by the authors. We also describe several variants of our construction and comparisons with other known equivariant spectra.

Submitted 23 July, 2024; originally announced July 2024.

MSC Class: 55P42; 55P91; 55P92

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalogs (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum--likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW--BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 50 pages, 10 figures, 4 tables

arXiv:2407.11620 [pdf]

A Deep Learning-Based Target Radial Length Estimation Method through HRRP Sequence

Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Xiao Sun, Zehao Wang

Abstract: This paper introduces an innovative deep learning-based method for end-to-end target radial length estimation from HRRP (High Resolution Range Profile) sequences. Firstly, the HRRP sequences are normalized and transformed into GAF (Gram Angular Field) images to effectively capture and utilize the temporal information. Subsequently, these GAF images serve as the input for a pretrained ResNet-101 model, which is then fine-tuned for target radial length estimation. The simulation results show that, compared to the traditional threshold method and simple networks, e.g., a one-dimensional CNN (Convolutional Neural Network), the proposed method demonstrates superior noise resistance and higher accuracy under low SNR (Signal-to-Noise Ratio) conditions.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 2 pages, 2 figures. Accepted by APCAP 2024

arXiv:2407.08236 [pdf, other]

HRRPGraphNet: A Graph Neural Network Based Approach for HRRP Radar Target Recognition

Authors: Lingfeng Chen, Panhe Hu, Zhiliang Pan, Xiao Sun, Zehao Wang

Abstract: High Resolution Range Profiles (HRRP) have become a key area of focus in the domain of Radar Automatic Target Recognition (RATR). Despite the success of data-driven neural network-based HRRP recognition, challenges such as insufficient training samples persist in its real-world application. This letter introduces HRRPGraphNet, a novel Graph Neural Network (GNN) model designed specifically for HRRP target recognition that leverages new insights to address these challenges. A pivotal innovation is the transformation of HRRP data into a graph structure, utilizing a range cell amplitude-based node vector and a range-relative adjacency matrix. This graph-based approach facilitates both local feature extraction via one-dimensional convolution layers and global feature extraction through a graph convolution layer, capitalizing on the intrinsic relationships between range cells, which is a distinct advantage over existing sequence-based methods. Experiments on the aircraft electromagnetic simulation dataset and the measured dataset have confirmed HRRPGraphNet's superior accuracy and robustness, particularly when few training samples are available, underscoring the potential of graph-driven innovations in HRRP-based RATR.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 5 pages, 4 figures

arXiv:2407.06494 [pdf, other]

A Generative Approach to Control Complex Physical Systems

Authors: Long Wei, Peiyan Hu, Ruiqi Feng, Haodong Feng, Yixuan Du, Tao Zhang, Rui Wang, Yue Wang, Zhi-Ming Ma, Tailin Wu

Abstract: Controlling the evolution of complex physical systems is a fundamental task across science and engineering. Classical techniques suffer from limited applicability or huge computational costs. On the other hand, recent deep learning and reinforcement learning-based approaches often struggle to optimize long-term control sequences under the constraints of system dynamics. In this work, we introduce Diffusion Physical systems Control (DiffPhyCon), a new class of methods to address the physical systems control problem. DiffPhyCon excels by simultaneously minimizing both the learned generative energy function and the predefined control objectives across the entire trajectory and control sequence. Thus, it can explore globally and identify near-optimal control sequences. Moreover, we enhance DiffPhyCon with prior reweighting, enabling the discovery of control sequences that significantly deviate from the training distribution. We test our method on the 1D Burgers' equation and on 2D jellyfish movement control in a fluid environment. Our method outperforms widely applied classical approaches and state-of-the-art deep learning and reinforcement learning methods. Notably, DiffPhyCon unveils an intriguing fast-close-slow-open pattern observed in the jellyfish, aligning with established findings in the field of fluid dynamics.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.03621 [pdf, other]

The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model

Authors: Brenden Smith, Dallin Baker, Clayton Chase, Myles Barney, Kaden Parker, Makenna Allred, Peter Hu, Alex Evans, Nancy Fulda

Abstract: Large Language Models (LLMs) have an unrivaled and invaluable ability to "align" their output to a diverse range of human preferences, by mirroring them in the text they generate. The internal characteristics of such models, however, remain largely opaque. This work presents the Injectable Realignment Model (IRM) as a novel approach to language model interpretability and explainability. Inspired by earlier work on Neural Programming Interfaces, we construct and train a small network -- the IRM -- to induce emotion-based alignments within a 7B parameter LLM architecture. The IRM outputs are injected via layerwise addition at various points during the LLM's forward pass, thus modulating its behavior without changing the weights of the original model. This isolates the alignment behavior from the complex mechanisms of the transformer model. Analysis of the trained IRM's outputs reveals a curious pattern. Across more than 24 training runs and multiple alignment datasets, patterns of IRM activations align themselves in striations associated with a neuron's index within each transformer layer, rather than being associated with the layers themselves. Further, a single neuron index (1512) is strongly correlated with all tested alignments. This result, although initially counterintuitive, is directly attributable to design choices present within almost all commercially available transformer architectures, and highlights a potential weak point in Meta's pretrained Llama 2 models.
It also demonstrates the value of the IRM architecture for language model analysis and interpretability. Our code and datasets are available at https://github.com/DRAGNLabs/injectable-alignment-model

Submitted 4 July, 2024; originally announced July 2024.

Comments: 21 pages, 17 figures

arXiv:2406.16655 [pdf, other]

Large Language Models Are Cross-Lingual Knowledge-Free Reasoners

Authors: Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang

Abstract: Large Language Models have demonstrated impressive reasoning capabilities across multiple languages. However, the relationship between capabilities in different languages is less explored. In this work, we decompose the process of reasoning tasks into two separate parts, knowledge retrieval and knowledge-free reasoning, and analyze their cross-lingual transferability. With adapted and constructed knowledge-free reasoning datasets, we show that the knowledge-free reasoning capability can be nearly perfectly transferred across various source-target language directions despite the secondary impact of resource availability in some specific target languages, while cross-lingual knowledge retrieval significantly hinders the transfer. Moreover, by analyzing the hidden states and feed-forward network neuron activation during the reasoning tasks, we show that higher similarity of hidden representations and larger overlap of activated neurons could explain the better cross-lingual transferability of knowledge-free reasoning than knowledge retrieval. Thus, we hypothesize that knowledge-free reasoning is embedded in some language-shared mechanism, while knowledge is stored separately in different languages.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.15160 [pdf, other]

Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios

Authors: Ya Jiang, Qing Wang, Jun Du, Maocheng Hu, Pengfei Hu, Zeyan Liu, Shi Cheng, Zhaoxu Nian, Yuxuan Dong, Mingqi Cai, Xin Fang, Chin-Hui Lee

Abstract: This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim to utilize audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich collection of audio data with multiple data augmentation techniques, to an audio-visual student model trained with only a limited set of multi-modal data. Next, we propose a two-stage audio-visual fusion strategy, consisting of an early feature fusion and a late video-guided decision fusion to exploit synergies between audio and video modalities. Finally, we introduce an innovative video pixel swapping (VPS) technique to extend an audio channel swapping (ACS) method to an audio-visual joint augmentation. Evaluation results on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge data set demonstrate significant improvements in SELD performance. Furthermore, our submission to the SELD task of the DCASE 2023 Challenge ranks first by effectively integrating the proposed techniques into a model ensemble.

Submitted 21 June, 2024; originally announced June 2024.

Comments: Accepted by ICME 2024

arXiv:2406.10928 [pdf, other]

doi 10.1145/3637528.3671708

Make Your Home Safe: Time-aware Unsupervised User Behavior Anomaly Detection in Smart Homes via Loss-guided Mask

Authors: Jingyu Xiao, Zhiyao Xu, Qingsong Zou, Qing Li, Dan Zhao, Dong Fang, Ruoyu Li, Wenxin Tang, Kang Li, Xudong Zuo, Penghui Hu, Yong Jiang, Zixuan Weng, Michael R. Lyv

Abstract: Smart homes, powered by the Internet of Things, offer great convenience but also pose security concerns due to abnormal behaviors, such as improper operations of users and potential attacks from malicious attackers. Several behavior modeling methods have been proposed to identify abnormal behaviors and mitigate potential risks. However, their performance often falls short because they do not effectively learn less frequent behaviors, consider temporal context, or account for the impact of noise in human behaviors. In this paper, we propose SmartGuard, an autoencoder-based unsupervised user behavior anomaly detection framework. First, we design a Loss-guided Dynamic Mask Strategy (LDMS) to encourage the model to learn less frequent behaviors, which are often overlooked during learning. Second, we propose a Three-level Time-aware Position Embedding (TTPE) to incorporate temporal information into positional embedding to detect temporal context anomalies. Third, we propose a Noise-aware Weighted Reconstruction Loss (NWRL) that assigns different weights to routine behaviors and noise behaviors to mitigate the interference of noise behaviors during inference. Comprehensive experiments on three datasets with ten types of anomalous behaviors demonstrate that SmartGuard consistently outperforms state-of-the-art baselines and also offers highly interpretable results.

Submitted 18 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

Comments: KDD 2024

arXiv:2406.08757 [pdf, other]

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

Authors: Jiefeng Ma, Yan Wang, Chenyu Liu, Jun Du, Yu Hu, Zhenrong Zhang, Pengfei Hu, Qing Wang, Jianshu Zhang

Abstract: Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents, constraining comprehensive understanding of complex forms. To address this issue, we present SRFUND, a hierarchically structured multi-task form understanding benchmark. SRFUND provides refined annotations on top of the original FUNSD and XFUND datasets, encompassing five tasks: (1) word to text-line merging, (2) text-line to entity merging, (3) entity category classification, (4) item table localization, and (5) entity-based full-document hierarchical structure recovery. We meticulously supplemented the original dataset with missing annotations at various levels of granularity and added detailed annotations for multi-item table regions within the forms. Additionally, we introduce global hierarchical structure dependencies for entity relation prediction tasks, surpassing traditional local key-value associations. The SRFUND dataset covers eight languages (English, Chinese, Japanese, German, French, Spanish, Italian, and Portuguese), making it a powerful tool for cross-lingual form understanding.
Extensive experimental results demonstrate that the SRFUND dataset presents new challenges and significant opportunities in handling diverse layouts and global hierarchical structures of forms, thus providing deep insights into the field of form understanding. The original dataset and implementations of baseline methods are available at https://sprateam-ustc.github.io/SRFUND

Submitted 12 June, 2024; originally announced June 2024.

Comments: NeurIPS 2024 Track on Datasets and Benchmarks under review

arXiv:2406.08454 [pdf, other]

Towards Musically Informed Evaluation of Piano Transcription Models

Authors: Patricia Hu, Lukáš Samuel Marták, Carlos Cancino-Chacón, Gerhard Widmer

Abstract: Automatic piano transcription models are typically evaluated using simple frame- or note-wise information retrieval (IR) metrics. Such benchmark metrics do not provide insights into the transcription quality of specific musical aspects such as articulation, dynamics, or rhythmic precision of the output, which are essential in the context of expressive performance analysis. Furthermore, in recent years, MAESTRO has become the de-facto training and evaluation dataset for such models. However, inference performance has been observed to deteriorate substantially when applied on out-of-distribution data, thereby questioning the suitability and reliability of transcribed outputs from such models for specific MIR tasks. In this work, we investigate the performance of three state-of-the-art piano transcription models in two experiments. In the first one, we propose a variety of musically informed evaluation metrics which, in contrast to the IR metrics, offer more detailed insight into the musical quality of the transcriptions. In the second experiment, we compare inference performance on real-world and perturbed audio recordings, and highlight musical dimensions which our metrics can help explain. Our experimental results highlight the weaknesses of existing piano transcription metrics and contribute to a more musically sound error analysis of transcription outputs.

Submitted 29 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: Accepted at the 25th International Society for Music Information Retrieval Conference (ISMIR 2024)

arXiv:2406.07393 [pdf, other]

Limited Out-of-Context Knowledge Reasoning in Large Language Models

Authors: Peng Hu, Changjiang Gao, Ruiqi Gao, Jiajun Chen, Shujian Huang

Abstract: Large Language Models (LLMs) have demonstrated strong capabilities as knowledge bases and significant in-context reasoning capabilities. However, previous work challenges their out-of-context reasoning ability, i.e., the ability to infer information from their training data, instead of from the context or prompt. This paper focuses on a significant facet of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple pieces of knowledge to infer new knowledge. We designed a synthetic dataset with seven representative OCKR tasks to systematically assess the OCKR capabilities of LLMs. Using this dataset, we evaluated the LLaMA2-13B-chat model and discovered that its proficiency in this aspect is limited, regardless of whether the knowledge is trained in separate or adjacent training settings. Moreover, training the model to reason with complete reasoning data did not result in significant improvement. Training the model to perform explicit knowledge retrieval helps in only one of the tasks, indicating that the model's limited OCKR capabilities are due to difficulties in retrieving relevant knowledge. Furthermore, we treat cross-lingual knowledge transfer as a distinct form of OCKR, and evaluate this ability. Our results show that the evaluated model also exhibits limited ability in transferring knowledge across languages. The dataset used in this study is available at https://github.com/NJUNLP/ID-OCKR.

Submitted 24 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.02019 [pdf, other]

Uncorrelated estimations of $H_0$ redshift evolution from DESI baryon acoustic oscillation observations

Authors: X. D. Jia, J. P. Hu, F. Y. Wang

Abstract: The Dark Energy Spectroscopic Instrument (DESI) collaboration recently released the first-year data of baryon acoustic oscillations (BAOs). Based on five different tracers, the cosmological constraint shows a hint of deviation from the standard $Λ$CDM model. In this letter, we combine the DESI BAOs with other cosmic probes to constrain the evolution of the Hubble constant as a function of redshift in the flat $Λ$CDM model. A non-parametric method is used to estimate the value of the Hubble constant in different redshift bins. The correlations among different bins are removed by diagonalizing the covariance matrix. The joint data sample demonstrates a decreasing trend of the Hubble constant with a significance of $8.6 σ$, which can naturally resolve the Hubble tension. It may be due to dynamical dark energy or modified gravity.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 7 pages, 2 figures, 1 table, submitted to AAS journal

arXiv:2406.01953 [pdf, other]

On-Demand Routing in LEO Mega-Constellations with Dynamic Laser Inter-Satellite Links

Authors: Dhiraj Bhattacharjee, Pablo G. Madoery, Aizaz U. Chaudhry, Halim Yanikomeroglu, Gunes Karabulut Kurt, Peng Hu, Khaled Ahmed, Stephane Martel

Abstract: Low Earth orbit (LEO) satellite mega constellations are beginning to include laser inter-satellite links (LISLs) to extend the Internet to the most remote locations on Earth. Since the process of establishing these links incurs a setup delay on the order of seconds, a static network topology is generally established well in advance, which is then used for the routing calculations. However, this involves keeping links active even when they are not being used to forward traffic, leading to poor energy efficiency. Motivated by technological advances that are gradually decreasing the LISL setup delays, we foresee scenarios where it will be possible to compute routes and establish dynamic LISLs on demand. This will require considering setup delays as penalties that will affect the end-to-end latency. In this paper, we present a nonlinear optimization model that considers these penalties in the cost function and propose three heuristic algorithms that solve the problem in a tractable way. The algorithms establish different trade-offs in terms of performance and computational complexity. We extensively analyze metrics including average latency, route change rate, outage probability, and jitter in Starlink's Phase I version 2 constellation. The results show the benefit of adaptive routing schemes according to the link setup delay. In particular, more complex schemes can decrease the average end-to-end latency in exchange for an increase in execution time.
On the other hand, depending on the maximum tolerated latency, it is possible to use less computationally complex schemes which will be more scalable for the satellite mega constellations of the future.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.17792 [pdf, other]

JUNO Sensitivity to Invisible Decay Modes of Neutrons

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick, et al. (635 additional authors not shown)

Abstract: We explore the decay of bound neutrons into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\barν_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 28 pages, 7 figures, 4 tables

arXiv:2405.15438 [pdf, other]

Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China

Authors: Wenquan Dong, Edward T. A. Mitchard, Yuwei Chen, Man Chen, Congfeng Cao, Peilun Hu, Cong Xu, Steven Hancock

Abstract: Large-scale high spatial resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing, which is instrumental in understanding the global carbon cycle, and implementing policy to mitigate climate change. The advent of the new space-borne LiDAR sensor, NASA's GEDI instrument, provides unparalleled possibilities for the accurate and unbiased estimation of forest AGB at high resolution, particularly in dense and tall forests, where Synthetic Aperture Radar (SAR) and passive optical data exhibit saturation. However, GEDI is a sampling instrument, collecting dispersed footprints, and its data must be combined with that from other continuous cover satellites to create high-resolution maps, using local machine learning methods. In this study, we developed local models to estimate forest AGB from GEDI L2A data, as the models used to create GEDI L4 AGB data incorporated minimal field data from China. We then applied LightGBM and random forest regression to generate wall-to-wall AGB maps at 25 m resolution, using extensive GEDI footprints as well as Sentinel-1 data, ALOS-2 PALSAR-2 and Sentinel-2 optical data. Through a 5-fold cross-validation, LightGBM demonstrated a slightly better performance than Random Forest across two contrasting regions. However, in both regions, the computation speed of LightGBM is substantially faster than that of the random forest model, requiring roughly one-third of the time to compute on the same hardware.
Through the validation against field data, the 25 m resolution AGB maps generated using the local models developed in this study exhibited higher accuracy compared to the GEDI L4B AGB data. We found in both regions an increase in error as slope increased. The trained models were tested on nearby but different regions and exhibited good performance.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.11862 [pdf, other]

SEMv3: A Fast and Robust Approach to Table Separation Line Detection

Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

Abstract: Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial. However, challenges such as wireless and deformed tables make it demanding. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines. During the split stage, we introduce a Keypoint Offset Regression (KOR) module, which effectively detects table separation lines by directly regressing the offset of each line relative to its keypoint proposals. Moreover, in the merge stage, we define a series of merge actions to efficiently describe the table structure based on table grids. Extensive ablation studies demonstrate that our proposed KOR module can detect table separation lines quickly and accurately. Furthermore, on public datasets (e.g. WTW, ICDAR-2019 cTDaR Historical and iFLYTAB), SEMv3 achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/Chunchunwumu/SEMv3.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

arXiv:2404.17875 [pdf, other]

Noisy Node Classification by Bi-level Optimization based Multi-teacher Distillation

Authors: Yujing Liu, Zongqian Wu, Zhengyu Lu, Ci Nie, Guoqiu Wen, Ping Hu, Xiaofeng Zhu

Abstract: Previous graph neural networks (GNNs) usually assume that the graph data have clean labels for representation learning, but this does not hold in real applications. In this paper, we propose a new multi-teacher distillation method based on bi-level optimization (namely BO-NNC), to conduct noisy node classification on the graph data. Specifically, we first employ multiple self-supervised learning methods to train diverse teacher models, and then aggregate their predictions through a teacher weight matrix. Furthermore, we design a new bi-level optimization strategy to dynamically adjust the teacher weight matrix based on the training progress of the student model. Finally, we design a label improvement module to improve the label quality. Extensive experimental results on real datasets show that our method achieves the best results compared to state-of-the-art methods.

Submitted 8 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.17281 [pdf]

Topological polarization singularities induced by the non-Hermitian Dirac points

Authors: Jun Wang, Jie Liu, Peng Hu, Qiao Jiang, Dezhuan Han

Abstract: A Dirac point in the Hermitian photonic system will split into a pair of exceptional points (EPs) or even spawn a ring of EPs if non-Hermiticity is involved. Here, we present a new type of non-Hermitian Dirac point which is situated in the complex plane of eigenfrequency. When there is differential loss, the Dirac point exhibits a dual behavior: it not only splits into a pair of EPs with opposite chirality in the band structure but also induces a pair of circularly polarized states (C points) with opposite handedness in the far-field radiation. Furthermore, breaking the corresponding mirror symmetries enables independent control of these Dirac-point induced C points, facilitating the merging of two C points and generation of unidirectional guided resonances. Our results demonstrate an explicit relation between the band singularities and polarization singularities, and provide a new mechanism to generate unidirectional emission, which can be useful in the band engineering and polarization manipulation.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.11577 [pdf, other]

Towards Reliable Empirical Machine Unlearning Evaluation: A Game-Theoretic View

Authors: Yiwen Tu, Pingbang Hu, Jiaqi Ma

Abstract: Machine unlearning is the process of updating machine learning models to remove the information of specific training data samples, in order to comply with data protection regulations that allow individuals to request the removal of their personal data. Despite the recent development of numerous unlearning algorithms, reliable evaluation of these algorithms remains an open research question. In this work, we focus on membership inference attack (MIA) based evaluation, one of the most common approaches for evaluating unlearning algorithms, and address various pitfalls of existing evaluation metrics that lack reliability. Specifically, we propose a game-theoretic framework that formalizes the evaluation process as a game between unlearning algorithms and MIA adversaries, measuring the data removal efficacy of unlearning algorithms by the capability of the MIA adversaries. Through careful design of the game, we demonstrate that the natural evaluation metric induced from the game enjoys provable guarantees that the existing evaluation metrics fail to satisfy. Furthermore, we propose a practical and efficient algorithm to estimate the evaluation metric induced from the game, and demonstrate its effectiveness through both theoretical analysis and empirical experiments. This work presents a novel and reliable approach to empirically evaluating unlearning algorithms, paving the way for the development of more effective unlearning techniques.

Submitted 12 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.07689 [pdf, other]

Rectifying bedload flux variability from channel geometry and grain shape effects

Authors: Thomas Pähtz, Yulan Chen, Jiafeng Xie, Rémi Monthiller, Raphaël Maurin, Katharina Tholen, Yen-Cheng Lin, Hao-Che Ho, Peng Hu, Zhiguo He, Orencio Durán

Abstract: Bedload transport occurs when a bed composed of sedimentary grains becomes mobile in response to the shearing by a flow of liquid. It shapes the landscapes of Earth and other planetary bodies by promoting the formation and growth of various multiscale geological features. Estimating the rate at which such processes take place requires accurate bedload flux predictions. However, even for highly idealized conditions in the laboratory, study-to-study variability of reported bedload flux measurements borders an order of magnitude. This uncertainty stems from physically poorly supported, typically empirical methods to account for channel geometry effects in the determination of the transport-driving bed shear stress and from study-to-study grain shape variations. Here, we derive and validate a universal method of bed shear stress determination and apply it to a number of independent grain-shape-controlled data sets from experiments and CFD-DEM simulations for a very diverse range of transport conditions. An existing physical bedload flux model, here generalized to account for grain shape variability, predicts almost all these data within a factor of 1.3, whereas a recently proposed grain shape correction of the bed shear stress (Deal et al., Nature 613, 298-302, 2023) substantially increases the bedload flux scatter across weak and intense transport conditions.

Submitted 22 August, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.04659 [pdf, other]

Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly

Authors: Changjiang Gao, Hongda Hu, Peng Hu, Jiajun Chen, Jixing Li, Shujian Huang

Abstract: Despite their strong ability to retrieve knowledge in English, current large language models show imbalanced abilities in different languages. Two approaches are proposed to address this, i.e., multilingual pretraining and multilingual instruction tuning. However, whether and how such methods contribute to the cross-lingual knowledge alignment inside the models is unknown. In this paper, we propose CLiKA, a systematic framework to assess the cross-lingual knowledge alignment of LLMs in the Performance, Consistency and Conductivity levels, and explore the effect of multilingual pretraining and instruction tuning on the degree of alignment. Results show that while both multilingual pretraining and instruction tuning are beneficial for cross-lingual knowledge alignment, the training strategy needs to be carefully designed. Namely, continued pretraining improves the alignment of the target language at the cost of other languages, while mixed pretraining affects other languages less. Also, the overall cross-lingual knowledge alignment, especially in the conductivity level, is unsatisfactory for all tested LLMs, and neither multilingual pretraining nor instruction tuning can substantially improve the cross-lingual knowledge conductivity.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2404.04607 [pdf, ps, other]

Quantized perfect transmission in graphene nanoribbons with random hollow adsorbates

Authors: Jia-Le Yu, Zhe Hou, Irfan Hussain Bhat, Pei-Jia Hu, Jia-Wen Sun, Xiao-Feng Chen, Ai-Min Guo, Qing-Feng Sun

Abstract: Impurities exist inevitably in two-dimensional materials as they spontaneously adsorb onto the surface during fabrication, usually exerting detrimental effects on electronic transport. Here, we focus on a special type of impurities that preferentially adsorb onto the hollow regions of graphene nanoribbons (GNRs), and study how they affect the quantum transport in GNRs. Contrary to previous knowledge that random adatoms should localize electrons, the so-called Anderson localization, noteworthy quantized conductance peaks (QCPs) are observed at specific electron energies. These QCPs are remarkably robust against variations in system size, GNR edge, and adatom properties, and they can reappear at identical energies following an arithmetic sequence of device width. Further investigation of wavefunction reveals a unique transport mode at each QCP energy which transmits through disordered GNRs reflectionlessly, while all the others become fully Anderson localized, indicating the survival of quantum ballistic transport in the localized regime. Our findings highlight the potential utility of hollow adatoms as a powerful tool to manipulate the conductivity of GNRs, and deepen the understanding of the interplay between impurities and graphene.

Submitted 9 April, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

Comments: 8 pages, 4 figures, and 1 table; comments are welcome

arXiv:2404.04346 [pdf, other]

Koala: Key frame-conditioned long video-LLM

Authors: Reuben Tan, Ximeng Sun, Ping Hu, Jui-hsien Wang, Hanieh Deilamsalehy, Bryan A. Plummer, Bryan Russell, Kate Saenko

Abstract: Long video question answering is a challenging task that involves recognizing short-term activities and reasoning about their fine-grained relationships. State-of-the-art video Large Language Models (vLLMs) hold promise as a viable solution due to their demonstrated emergent capabilities on new tasks. However, despite being trained on millions of short seconds-long videos, vLLMs are unable to understand minutes-long videos and accurately answer questions about them. To address this limitation, we propose a lightweight and self-supervised approach, Key frame-conditioned long video-LLM (Koala), that introduces learnable spatiotemporal queries to adapt pretrained vLLMs for generalizing to longer videos. Our approach introduces two new tokenizers that condition on visual tokens computed from sparse video key frames for understanding short and long video moments. We train our proposed approach on HowTo100M and demonstrate its effectiveness on zero-shot long video understanding benchmarks, where it outperforms state-of-the-art large models by 3 - 6% in absolute accuracy across all tasks. Surprisingly, we also empirically show that our approach not only helps a pretrained vLLM to understand long videos but also improves its accuracy on short-term action recognition.

Submitted 3 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted at CVPR 2024 as a poster highlight

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah , et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)

arXiv:2404.00855 [pdf, other]

TSOM: Small Object Motion Detection Neural Network Inspired by Avian Visual Circuit

Authors: Pignge Hu, Xiaoteng Zhang, Mengmeng Li, Yingjie Zhu, Li Shi

Abstract: Detecting small moving objects in complex backgrounds from an overhead perspective is a highly challenging task for machine vision systems. As an inspiration from nature, the avian visual system is capable of processing motion information in various complex aerial scenes, and its Retina-OT-Rt visual circuit is highly sensitive to capturing the motion information of small objects from high altitudes. However, more needs to be done on small object motion detection algorithms based on the avian visual system. In this paper, we conducted mathematical modeling based on extensive studies of the biological mechanisms of the Retina-OT-Rt visual circuit. Based on this, we proposed a novel tectum small object motion detection neural network (TSOM). The neural network includes the retina, SGC dendritic, SGC Soma, and Rt layers, each layer corresponding to neurons in the visual pathway. The Retina layer is responsible for accurately projecting input content, the SGC dendritic layer perceives and encodes spatial-temporal information, the SGC Soma layer computes complex motion information and extracts small objects, and the Rt layer integrates and decodes motion information from multiple directions to determine the position of small objects. Extensive experiments on pigeon neurophysiological experiments and image sequence data showed that the TSOM is biologically interpretable and effective in extracting reliable small object motion features from complex high-altitude backgrounds.

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.19386 [pdf, other]

PointCloud-Text Matching: Benchmark Datasets and a Baseline

Authors: Yanglin Feng, Yang Qin, Dezhong Peng, Hongyuan Zhu, Xi Peng, Peng Hu

Abstract: In this paper, we present and study a new instance-level retrieval task: PointCloud-Text Matching (PTM), which aims to find the exact cross-modal instance that matches a given point-cloud query or text query. PTM could be applied to various scenarios, such as indoor/urban-canyon localization and scene retrieval. However, there exists no suitable and targeted dataset for PTM in practice. Therefore, we construct three new PTM benchmark datasets, namely 3D2T-SR, 3D2T-NR, and 3D2T-QA. We observe that the data is challenging and with noisy correspondence due to the sparsity, noise, or disorder of point clouds and the ambiguity, vagueness, or incompleteness of texts, which make existing cross-modal matching methods ineffective for PTM. To tackle these challenges, we propose a PTM baseline, named Robust PointCloud-Text Matching method (RoMa). RoMa consists of two modules: a Dual Attention Perception module (DAP) and a Robust Negative Contrastive Learning module (RNCL). Specifically, DAP leverages token-level and feature-level attention to adaptively focus on useful local and global features, and aggregate them into common representations, thereby reducing the adverse impact of noise and ambiguity. To handle noisy correspondence, RNCL divides negative pairs, which are much less error-prone than positive pairs, into clean and noisy subsets, and assigns them forward and reverse optimization directions respectively, thus enhancing robustness against noisy correspondence. We conduct extensive experiments on our benchmarks and demonstrate the superiority of our RoMa.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.16888 [pdf, other]

doi 10.1007/978-981-99-8432-9_11

Towards Balanced RGB-TSDF Fusion for Consistent Semantic Scene Completion by 3D RGB Feature Completion and a Classwise Entropy Loss Function

Authors: Laiyan Ding, Panwen Hu, Jie Li, Rui Huang

Abstract: Semantic Scene Completion (SSC) aims to jointly infer semantics and occupancies of 3D scenes. Truncated Signed Distance Function (TSDF), a 3D encoding of depth, has been a common input for SSC. Furthermore, RGB-TSDF fusion seems promising since these two modalities provide color and geometry information, respectively. Nevertheless, RGB-TSDF fusion has been considered nontrivial, and the commonly used naive addition will result in inconsistent results. We argue that the inconsistency comes from the sparsity of RGB features upon projecting into 3D space, while TSDF features are dense, leading to imbalanced feature maps when summed up. To address this RGB-TSDF distribution difference, we propose a two-stage network with a 3D RGB feature completion module that completes RGB features with meaningful values for occluded areas. Moreover, we propose an effective classwise entropy loss function to punish inconsistency. Extensive experiments on public datasets verify that our method achieves state-of-the-art performance among methods that do not adopt extra data.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.13936 [pdf, other]

Secure and Efficient Group Handover Protocol in 5G Non-Terrestrial Networks

Authors: Bohan Zhang, Peng Hu, Ahmad Akbari Azirani, Mohammad A. Salahuddin, Diogo Barradas, Noura Limam, Raouf Boutaba

Abstract: The growing low-Earth orbit (LEO) satellite constellations have become an essential part of the fifth-generation (5G) non-terrestrial network (NTN) market. These satellites can enable direct-to-cell connectivity for mobile devices and support various applications with ubiquitous coverage for 5G and beyond networks. However, satellite-based NTNs bring several challenges to the 5G handover protocol design. The high mobility of satellites can lead to signaling storms and security compromises during handovers. This paper addresses these challenges by proposing a secure and efficient group handover protocol. The protocol's effectiveness is evaluated on a custom discrete-event simulator and compared against the baseline 5G handover scheme. The simulator is made publicly available.

Submitted 20 March, 2024; originally announced March 2024.

Comments: Accepted by the 2024 IEEE International Conference on Communications (ICC), 9-13 June 2024, Denver, CO, USA

arXiv:2403.12131 [pdf, other]

doi 10.1103/PhysRevD.110.L021901

The Surprising Effectiveness of Weyl Gravity in Probing Quantum Corrections to AdS Black Holes

Authors: Liang Ma, Peng-Ju Hu, Yi Pang, Hong Lu

Abstract: Computing leading higher curvature contributions to thermodynamic quantities of AdS black hole is drastically simplified once the higher curvature terms are expressed in terms of powers of Weyl tensor by applying proper field redefinitions, avoiding the usual complications caused by higher derivative Gibbons-Hawking-York (GHY) term or surface counterterms. We establish the method by computing the Euclidean action of general rotating AdS black holes in five dimensional quadratic curvature theories with or without supersymmetry and verifying the results numerically. Our result is the state of the art for charged rotating AdS black holes in five dimensional minimal gauged supergravity including corrections from all three supersymmetric curvature squared terms. Our approach facilitates precision tests in the AdS/CFT correspondence and should be applicable in diverse dimensions.

Submitted 26 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 5 pages, 1 figure. Clarifications and references added. Version accepted in PRD Letters

Journal ref: Phys.Rev.D110:L021901,2024

arXiv:2403.11549 [pdf, other]

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

Authors: Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Ping Hu, Dong Wang, Huchuan Lu, You He

Abstract: Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset. However, mitigating the performance degradation in large-scale models is non-trivial due to (i) parameter shifts throughout lifelong learning and (ii) significant computational burdens associated with full-model tuning. In this work, we present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models. Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters in response to new tasks. To preserve the zero-shot recognition capability of vision-language models, we further introduce a Distribution Discriminative Auto-Selector (DDAS) that automatically routes in-distribution and out-of-distribution inputs to the MoE Adapter and the original CLIP, respectively. Through extensive experiments across various settings, our proposed method consistently outperforms previous state-of-the-art approaches while concurrently reducing parameter training burdens by 60%. Our code is available at https://github.com/JiazuoYu/MoE-Adapters4CL

Submitted 3 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: This work is accepted by CVPR2024. More modifications may be performed

arXiv:2403.08292 [pdf, other]

Weak Collocation Regression for Inferring Stochastic Dynamics with Lévy Noise

Authors: Liya Guo, Liwei Lu, Zhijun Zeng, Pipi Hu, Yi Zhu

Abstract: With the rapid increase of observational, experimental and simulated data for stochastic systems, tremendous efforts have been devoted to identifying governing laws underlying the evolution of these systems. Despite the broad applications of non-Gaussian fluctuations in numerous physical phenomena, the data-driven approaches to extracting stochastic dynamics with Lévy noise are relatively few. In this work, we propose a Weak Collocation Regression (WCR) to explicitly reveal unknown stochastic dynamical systems, i.e., the Stochastic Differential Equation (SDE) with both $α$-stable Lévy noise and Gaussian noise, from discrete aggregate data. This method utilizes the evolution equation of the probability distribution function, i.e., the Fokker-Planck (FP) equation. With the weak form of the FP equation, the WCR constructs a linear system of unknown parameters where all integrals are evaluated by the Monte Carlo method with the observations. Then, the unknown parameters are obtained by a sparse linear regression. For an SDE with Lévy noise, the corresponding FP equation is a partial integro-differential equation (PIDE), which contains nonlocal terms and is difficult to deal with. The weak form can avoid complicated multiple integrals. Our approach can simultaneously distinguish mixed noise types, even in multi-dimensional problems. Numerical experiments demonstrate that our method is accurate and computationally efficient.

Submitted 13 March, 2024; originally announced March 2024.

Comments: 19 pages, 5 figures, 10 tables

arXiv:2403.07153 [pdf, other]

2023 Low-Power Computer Vision Challenge (LPCVC) Summary

Authors: Leo Chen, Benjamin Boardley, Ping Hu, Yiru Wang, Yifan Pu, Xin Jin, Yongqiang Yao, Ruihao Gong, Bo Li, Gao Huang, Xianglong Liu, Zifu Wan, Xinwang Chen, Ning Liu, Ziyi Zhang, Dongping Liu, Ruijie Shan, Zhengping Che, Fachao Zhang, Xiaofeng Mou, Jian Tang, Maxim Chuprov, Ivan Malofeev, Alexander Goncharenko, Andrey Shcherbin , et al. (5 additional authors not shown)

Abstract: This article describes the 2023 IEEE Low-Power Computer Vision Challenge (LPCVC). Since 2015, LPCVC has been an international competition devoted to tackling the challenge of computer vision (CV) on edge devices. Most CV researchers focus on improving accuracy, at the expense of ever-growing sizes of machine models. LPCVC balances accuracy with resource requirements. Winners must achieve high accuracy with short execution time when their CV solutions run on an embedded device, such as Raspberry PI or Nvidia Jetson Nano. The vision problem for 2023 LPCVC is segmentation of images acquired by Unmanned Aerial Vehicles (UAVs, also called drones) after disasters. The 2023 LPCVC attracted 60 international teams that submitted 676 solutions during the submission window of one month. This article explains the setup of the competition and highlights the winners' methods that improve accuracy and shorten execution time.

Submitted 11 March, 2024; originally announced March 2024.

Comments: LPCVC 2023, website: https://lpcv.ai/

arXiv:2403.05002 [pdf, other]

LHMap-loc: Cross-Modal Monocular Localization Using LiDAR Point Cloud Heat Map

Authors: Xinrui Wu, Jianbo Xu, Puyuan Hu, Guangming Wang, Hesheng Wang

Abstract: Localization using a monocular camera in the pre-built LiDAR point cloud map has drawn increasing attention in the field of autonomous driving and mobile robotics. However, there are still many challenges (e.g. difficulties of map storage, poor localization robustness in large scenes) in accurately and efficiently implementing cross-modal localization. To solve these problems, a novel pipeline termed LHMap-loc is proposed, which achieves accurate and efficient monocular localization in LiDAR maps. Firstly, feature encoding is carried out on the original LiDAR point cloud map by generating offline heat point clouds, by which the size of the original LiDAR map is compressed. Then, an end-to-end online pose regression network is designed based on optical flow estimation and spatial attention to achieve real-time monocular visual localization in a pre-built map. In addition, a series of experiments have been conducted to prove the effectiveness of the proposed method. Our code is available at: https://github.com/IRMVLab/LHMap-loc.

Submitted 10 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

Comments: Accepted by 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

Showing 1–50 of 410 results for author: Hu, P