
MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification

Authors: Feng Gao, Xuepeng Jin, Xiaowei Zhou, Junyu Dong, Qian Du

Abstract: In the field of multi-source remote sensing image classification, remarkable progress has been made by convolutional neural networks and Transformers. However, existing methods are still limited due to the inherent local reductive bias. Recently, Mamba-based methods built upon the State Space Model (SSM) have shown great potential for long-range dependency modeling with linear complexity, but they have rarely been explored for the multi-source remote sensing image classification task. To this end, we propose the Multi-Scale Feature Fusion Mamba (MSFMamba) network for joint classification of hyperspectral image (HSI) and LiDAR/SAR data. MSFMamba mainly comprises three parts: the Multi-Scale Spatial Mamba (MSpa-Mamba) block, the Spectral Mamba (Spe-Mamba) block, and the Fusion Mamba (Fus-Mamba) block. Specifically, to solve the feature redundancy in multiple scanning routes, the MSpa-Mamba block incorporates a multi-scale strategy to minimize the computational redundancy and alleviate the feature redundancy of the SSM. In addition, Spe-Mamba is designed for spectral feature exploration, which is essential for HSI feature modeling. Moreover, to alleviate the heterogeneous gap between HSI and LiDAR/SAR data, we design the Fus-Mamba block for multi-source feature fusion. The original Mamba is extended to accommodate dual inputs, and cross-modal feature interaction is enhanced. Extensive experimental results on three multi-source remote sensing datasets demonstrate the superior performance of the proposed MSFMamba over state-of-the-art models. Source code of MSFMamba will be made publicly available at https://github.com/summitgao/MSFMamba.

Submitted 26 August, 2024; originally announced August 2024.

arXiv:2408.14158 [pdf, other]

Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning

Authors: Wei An, Xiao Bi, Guanting Chen, Shanhuang Chen, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Wenjun Gao, Kang Guan, Jianzhong Guo, Yongqiang Guo, Zhe Fu, Ying He, Panpan Huang, Jiashi Li, Wenfeng Liang, Xiaodong Liu, Xin Liu, Yiyuan Liu, Yuxuan Liu, Shanghao Lu, Xuan Lu, Xiaotao Nie, Tian Pei, et al. (27 additional authors not shown)

Abstract: The rapid progress in Deep Learning (DL) and Large Language Models (LLMs) has exponentially increased the demand for computational power and bandwidth. This, combined with the high costs of faster computing chips and interconnects, has significantly inflated High Performance Computing (HPC) construction costs. To address these challenges, we introduce the Fire-Flyer AI-HPC architecture, a synergistic hardware-software co-design framework and its best practices. For DL training, we deployed the Fire-Flyer 2 with 10,000 PCIe A100 GPUs, achieving performance approximating the DGX-A100 while reducing costs by half and energy consumption by 40%. We specifically engineered HFReduce to accelerate allreduce communication and implemented numerous measures to keep our Computation-Storage Integrated Network congestion-free. Through our software stack, including HaiScale, 3FS, and HAI-Platform, we achieved substantial scalability by overlapping computation and communication. Our system-oriented experience from DL training provides valuable insights to drive future advancements in AI-HPC.

Submitted 31 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: This is the preprint version of the paper accepted for presentation at the 2024 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'24). © 2024 IEEE. Personal use of this material is permitted. For other uses, permission from IEEE must be obtained. Please refer to IEEE Xplore for the final published version.

arXiv:2408.12232 [pdf, other]

BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking

Authors: Hanzheng Wang, Wei Li, Xiang-Gen Xia, Qian Du

Abstract: Hyperspectral object tracking (HOT) has exhibited potential in various applications, particularly in scenes where objects are camouflaged. Existing trackers can effectively retrieve objects via band regrouping because of the bias in existing HOT datasets, where most objects tend to have distinguishing visual appearances rather than spectral characteristics. This bias allows the tracker to directly use the visual features obtained from the false-color images generated by hyperspectral images without the need to extract spectral features. To tackle this bias, we find that the tracker should focus on the spectral information when object appearance is unreliable. Thus, we provide a new task called hyperspectral camouflaged object tracking (HCOT) and meticulously construct a large-scale HCOT dataset, termed BihoT, which consists of 41,912 hyperspectral images covering 49 video sequences. The dataset covers various artificial camouflage scenes where objects have similar appearances, diverse spectrums, and frequent occlusion, making it a very challenging dataset for HCOT. Besides, a simple but effective baseline model, named spectral prompt-based distractor-aware network (SPDAN), is proposed, comprising a spectral embedding network (SEN), a spectral prompt-based backbone network (SPBN), and a distractor-aware module (DAM). Specifically, the SEN extracts spectral-spatial features via 3-D and 2-D convolutions. Then, the SPBN fine-tunes powerful RGB trackers with spectral prompts and alleviates the insufficiency of training samples. Moreover, the DAM utilizes a novel statistic to capture the distractor caused by occlusion from objects and background. Extensive experiments demonstrate that our proposed SPDAN achieves state-of-the-art performance on the proposed BihoT and other HOT datasets.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12109 [pdf, other]

RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data

Authors: Chenglong Wang, Yang Gan, Yifu Huo, Yongyu Mu, Murun Yang, Qiaozhi He, Tong Xiao, Chunliang Zhang, Tongran Liu, Quan Du, Di Yang, Jingbo Zhu

Abstract: Large vision-language models (LVLMs) often fail to align with human preferences, leading to issues like generating misleading content without proper visual context (also known as hallucination). A promising solution to this problem is using human-preference alignment techniques, such as best-of-n sampling and reinforcement learning. However, these techniques face the difficulty arising from the scarcity of visual preference data, which is required to train a visual reward model (VRM). In this work, we continue the line of research. We present a Robust Visual Reward Model (RoVRM) which improves human-preference alignment for LVLMs. RoVRM leverages auxiliary textual preference data through a three-phase progressive training and optimal transport-based preference data selection to effectively mitigate the scarcity of visual preference data. We experiment with RoVRM on the commonly used vision-language tasks based on the LLaVA-1.5-7B and -13B models. Experimental results demonstrate that RoVRM consistently outperforms traditional VRMs. Furthermore, our three-phase progressive training and preference data selection approaches can yield consistent performance gains over ranking-based alignment techniques, such as direct preference optimization.
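
The best-of-n sampling mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the candidate responses have already been sampled and that some scoring function is available. The names `best_of_n` and `toy_reward` are hypothetical and do not come from the paper; in the paper's setting the scorer would be a visual reward model that also sees the image, which is not modeled here.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward_fn: Callable[[str], float]) -> str:
    """Return the candidate with the highest reward (generic best-of-n selection)."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    # max() keeps the first candidate among ties, so the result is deterministic.
    return max(candidates, key=reward_fn)


def toy_reward(response: str) -> float:
    """Hypothetical stand-in for a reward model: prefers longer responses."""
    return float(len(response))


if __name__ == "__main__":
    samples = ["a cat", "a cat on a red sofa", "a dog"]
    print(best_of_n(samples, toy_reward))
```

The quality of the selected response depends entirely on the reward function, which is why the scarcity of preference data for training that function matters.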

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.08152 [pdf, other]

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Authors: Huajian Xin, Z. Z. Ren, Junxiao Song, Zhihong Shao, Wanjia Zhao, Haocheng Wang, Bo Liu, Liyue Zhang, Xuan Lu, Qiushi Du, Wenjun Gao, Qihao Zhu, Dejian Yang, Zhibin Gou, Z. F. Wu, Fuli Luo, Chong Ruan

Abstract: We introduce DeepSeek-Prover-V1.5, an open-source language model designed for theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both training and inference processes. Pre-trained on DeepSeekMath-Base with specialization in formal mathematical languages, the model undergoes supervised fine-tuning using an enhanced formal theorem proving dataset derived from DeepSeek-Prover-V1. Further refinement is achieved through reinforcement learning from proof assistant feedback (RLPAF). Beyond the single-pass whole-proof generation approach of DeepSeek-Prover-V1, we propose RMaxTS, a variant of Monte-Carlo tree search that employs an intrinsic-reward-driven exploration strategy to generate diverse proof paths. DeepSeek-Prover-V1.5 demonstrates significant improvements over DeepSeek-Prover-V1, achieving new state-of-the-art results on the test set of the high school level miniF2F benchmark ($63.5\%$) and the undergraduate level ProofNet benchmark ($25.3\%$).

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07261 [pdf, other]

Numerical analysis of a class of penalty discontinuous Galerkin methods for nonlocal diffusion problems

Authors: Qiang Du, Lili Ju, Jianfang Lu, Xiaochuan Tian

Abstract: In this paper, we consider a class of discontinuous Galerkin (DG) methods for one-dimensional nonlocal diffusion (ND) problems. The nonlocal models, which are integral equations, are widely used in describing many physical phenomena with long-range interactions. The ND problem is the nonlocal analog of the classic diffusion problem, and as the interaction radius (horizon) vanishes, the nonlocality disappears and the ND problem converges to the classic diffusion problem. Under certain conditions, the exact solution to the ND problem may exhibit discontinuities, setting it apart from the classic diffusion problem. Since the DG method has shown great advantages in resolving problems with discontinuities in computational fluid dynamics over the past several decades, it is natural to adopt the DG method to compute the ND problems. Based on [Du-Ju-Lu-Tian-CAMC2020], we develop the DG methods with different penalty terms, ensuring that the proposed DG methods have local counterparts as the horizon vanishes. This indicates the proposed methods will converge to the existing DG schemes as the horizon vanishes, which is crucial for achieving asymptotic compatibility. Rigorous proofs are provided to demonstrate the stability, error estimates, and asymptotic compatibility of the proposed DG schemes. To observe the effect of the nonlocal diffusion, we also consider the time-dependent convection-diffusion problems with nonlocal diffusion. We conduct several numerical experiments, including accuracy tests and Burgers' equation with nonlocal diffusion, and various horizons are taken to show the good performance of the proposed algorithm and validate the theoretical findings.

Submitted 13 August, 2024; originally announced August 2024.

MSC Class: 65M60; 65R20; 45A05

arXiv:2408.04177 [pdf, ps, other]

Information Thermodynamics of Non-Hermitian Quantum Systems

Authors: Kui Cao, Qian Du, Su-Peng Kou

Abstract: In this study, we uncover the intrinsic information processes in non-Hermitian quantum systems and their thermodynamic effects. We demonstrate that these systems can exhibit negative entropy production, making them potential candidates for information engines. We also identify a key informational quantity that can characterize phase transitions beyond the reach of traditional partition functions. This work enhances our understanding of the interplay between information and thermodynamics, providing a new perspective on non-Hermitian quantum systems.

Submitted 7 August, 2024; originally announced August 2024.

arXiv:2408.00422 [pdf, ps, other]

Ginzburg–Landau Functionals in the Large-Graph Limit

Authors: Edith Zhang, James Scott, Qiang Du, Mason A. Porter

Abstract: Ginzburg–Landau (GL) functionals on graphs, which are relaxations of graph-cut functionals on graphs, have yielded a variety of insights in image segmentation and graph clustering. In this paper, we study large-graph limits of GL functionals by taking a functional-analytic view of graphs as nonlocal kernels. For a graph $W_n$ with $n$ nodes, the corresponding graph GL functional $\mathrm{GL}^{W_n}_{\varepsilon}$ is an energy for functions on $W_n$. We minimize GL functionals on sequences of growing graphs that converge to functions called graphons. For such sequences of graphs, we show that the graph GL functional $\Gamma$-converges to a continuous and nonlocal functional that we call the graphon GL functional. We also investigate the sharp-interface limits of the graph GL and graphon GL functionals, and we relate these limits to a nonlocal total variation. We express the limiting GL functional in terms of Young measures and thereby obtain a probabilistic interpretation of the variational problem in the large-graph limit. Finally, to develop intuition about the graphon GL functional, we compute the GL minimizer for several example families of graphons.

Submitted 1 August, 2024; originally announced August 2024.

Comments: 37 pages

arXiv:2407.19964 [pdf, other]

A Markov representation of Perron-Frobenius eigenvector for infinite non-negative matrix and Metzler-matrix

Authors: Qian Du, Yong-Hua Mao

Abstract: We represent the so-called Perron-Frobenius eigenvector (if it exists) of an infinite non-negative matrix $A$ and of a Metzler matrix by using the corresponding Markov chain with its probability transition function.

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.19803 [pdf, ps, other]

Quasi-stationary distributions for continuous-time $λ$-recurrent jump processes

Authors: Qian Du, Yong-Hua Mao

Abstract: For a continuous-time $λ$-recurrent jump process, $λ$-recurrence ensures the existence of a quasi-stationary distribution when the process has finitely many exit states (the states that have positive killing rates). We also give an explicit representation of this quasi-stationary distribution through the $Q$-matrix, where the components of the quasi-stationary distribution outside the set $H$ of exit states can be represented by those within $H$. A sufficient condition for a quasi-stationary distribution is also provided for the case of infinitely many exit states.

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.15156 [pdf, other]

Computational and analytical studies of a new nonlocal phase-field crystal model in two dimensions

Authors: Qiang Du, Kai Wang, Jiang Yang

Abstract: A nonlocal phase-field crystal (NPFC) model is presented as a nonlocal counterpart of the local phase-field crystal (LPFC) model and a special case of the structural PFC (XPFC) derived from classical field theory for crystal growth and phase transition. The NPFC incorporates a finite range of spatial nonlocal interactions that can account for both repulsive and attractive effects. The specific form is data-driven and determined by a fitting to the material's structure factor, which can be much more accurate than the LPFC and the previously proposed fractional variant. In particular, it is able to match the experimental data of the structure factor up to the second peak, an achievement not possible with other PFC variants studied in the literature. Both the LPFC and the fractional PFC (FPFC) are also shown to be distinct scaling limits of the NPFC, which reflects its generality. The advantage of the NPFC in retaining material properties suggests that it may be more suitable for characterizing liquid-solid transition systems. Moreover, we study numerical discretizations using Fourier spectral methods, which are shown to be convergent and asymptotically compatible, making them robust numerical discretizations across different parameter ranges. Numerical experiments are given in the two-dimensional case to demonstrate the effectiveness of the NPFC in simulating crystal structures and grain boundaries.

Submitted 21 July, 2024; originally announced July 2024.

arXiv:2406.16087 [pdf, other]

Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

Authors: Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

Abstract: Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeSy) computational framework, imperative learning (IL), for robot autonomy, leveraging the generalization abilities of symbolic reasoning. The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system. We formulate IL as a special bilevel optimization (BLO), which enables reciprocal learning over the three modules. This overcomes the label-intensive obstacles associated with data-driven approaches and takes advantage of symbolic reasoning concerning logical reasoning, physical principles, geometric analysis, etc. We discuss several optimization techniques for IL and verify their effectiveness in five distinct robot autonomy tasks including path planning, rule induction, optimal control, visual odometry, and multi-robot routing. Through various experiments, we show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.

Submitted 6 August, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.11931 [pdf, other]

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Authors: DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao, Shirong Ma, Wangding Zeng, Xiao Bi, Zihui Gu, Hanwei Xu, Damai Dai, Kai Dong, Liyue Zhang, Yishi Piao, Zhibin Gou, Zhenda Xie, Zhewen Hao, Bingxuan Wang, Junxiao Song, Deli Chen, et al. (15 additional authors not shown)

Abstract: We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10469 [pdf, other]

Object-Attribute-Relation Representation based Video Semantic Communication

Authors: Qiyuan Du, Yiping Duan, Qianqian Yang, Xiaoming Tao, Mérouane Debbah

Abstract: With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.04705 [pdf]

EAIA: An Efficient and Anonymous Identity Authentication Scheme in 5G-V2V

Authors: Qianmin Du, Jianhong Zhou, Maode Ma

Abstract: Vehicle Ad-hoc Networks (VANETs) have experienced significant development in recent years, playing a crucial role in enhancing the driving experience by enabling safer and more efficient inter-vehicle interactions through information exchange. Vehicle-to-vehicle (V2V) communication is particularly vital as it not only helps to prevent collisions and improve traffic efficiency but also provides essential situational awareness to drivers or autonomous driving systems. Communication is typically supported by Roadside Units (RSUs); however, in practical applications, vehicles may exceed the communication range of RSUs, thus exposing them to various malicious attacks. Additionally, considering the limited computational resources of onboard units (OBUs) in vehicles, there is a high demand for designing lightweight security protocols that support V2V communication. To address this issue, this paper proposes an efficient anonymous V2V identity authentication protocol tailored for scenarios that lack RSU support. The proposed protocol has been formally assessed using the Scyther tool, demonstrating its capability to withstand major typical malicious attacks. Performance evaluations indicate that the proposed protocol is efficient in terms of communication and computational overhead, making it a viable solution for V2V vehicle communication.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2406.02190 [pdf, ps, other]

Age of Trust (AoT): A Continuous Verification Framework for Wireless Networks

Authors: Yuquan Xiao, Qinghe Du, Wenchi Cheng, Panagiotis D. Diamantoulakis, George K. Karagiannidis

Abstract: Zero Trust is a new security vision for 6G networks that emphasises the philosophy of never trust and always verify. However, there is a fundamental trade-off between the wireless transmission efficiency and the trust level, which is reflected by the verification interval and its adaptation strategy. More importantly, the mathematical framework to characterise the trust level of the adaptive verification strategy is still missing. Inspired by this vision, we propose a concept called age of trust (AoT) to capture the characteristics of the trust level degrading over time, with the definition of the time elapsed since the last verification of the target user's trust plus the initial age, which depends on the trust level evaluated at that verification. The higher the trust level, the lower the initial age. To evaluate the trust level in the long term, the average AoT is used. We then investigate how to find a compromise between average AoT and wireless transmission efficiency with limited resources. In particular, we address the bi-objective optimization (BOO) problem between average AoT and throughput over a single link with arbitrary service process, where the identity of the receiver is constantly verified, and we devise a periodic verification scheme and a Q-learning-based scheme for constant process and random process, respectively. We also tackle the BOO problem in a multiple random access scenario, where a trust-enhanced frame-slotted ALOHA is designed. Finally, the numerical results show that our proposals can achieve a fair compromise between trust level and wireless transmission efficiency, and thus have a wide application prospect in various zero-trust architectures.
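
The AoT definition in this abstract (time since the last verification plus an initial age that falls as the trust level rises) can be sketched in Python as follows. The abstract does not give the mapping from trust level to initial age, so the linear mapping `initial_age` and its `max_initial_age` parameter are purely illustrative assumptions, as are all function names. The periodic average further assumes that every verification returns the same trust level.

```python
def initial_age(trust_level: float, max_initial_age: float = 5.0) -> float:
    """Assumed linear mapping: trust in [0, 1], higher trust gives lower initial age."""
    if not 0.0 <= trust_level <= 1.0:
        raise ValueError("trust_level must lie in [0, 1]")
    return max_initial_age * (1.0 - trust_level)


def age_of_trust(t: float, last_verification_time: float, trust_level: float) -> float:
    """AoT at time t: elapsed time since the last verification plus the initial age."""
    if t < last_verification_time:
        raise ValueError("t must not precede the last verification")
    return (t - last_verification_time) + initial_age(trust_level)


def average_aot_periodic(period: float, trust_level: float) -> float:
    """Time-average AoT when verification recurs every `period` time units.

    Within one period the AoT ramps linearly from a0 to a0 + period,
    so its time average is a0 + period / 2.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return initial_age(trust_level) + period / 2.0


if __name__ == "__main__":
    print(age_of_trust(t=7.0, last_verification_time=4.0, trust_level=0.8))
    print(average_aot_periodic(period=4.0, trust_level=0.8))
```

Under these assumptions, shortening the verification period lowers the average AoT but spends more resources on verification, which is the trade-off against throughput that the abstract describes.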

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02139 [pdf, other]

Statistical Age of Information: A Risk-Aware Metric and Its Applications in Status Updates

Authors: Yuquan Xiao, Qinghe Du, George K. Karagiannidis

Abstract: Age of information (AoI) is an effective measure to quantify the information freshness in wireless status update systems. It has been further validated that the peak AoI has the potential to capture the core characteristics of the aging process, and thus the average peak AoI is widely used to evaluate the long-term performance of information freshness. However, the average peak AoI is a risk-insensitive metric and therefore may not be well suited for evaluating critical status update services. Motivated by this concern, and following the spirit of entropic value-at-risk (EVaR) in the field of risk analysis, in this paper we present a concept, termed Statistical AoI, for providing a unified framework to guarantee various requirements of risk-sensitive status-update services with the demand on the violation probability of the peak age. In particular, as the constraint on the violation probability of the peak age varies from loose to strict, the statistical AoI evolves from the average peak AoI to the maximum peak AoI. We then investigate the statistical AoI minimization problem for status updates over wireless fading channels. It is interesting to note that the corresponding optimal sampling scheme varies from step to constant functions of the channel power gain with the peak age violation probability from one to zero. We also address the maximum statistical AoI minimization problem for multi-status updates with time division multiple access (TDMA), where longer transmission time can improve reliability but may also cause a larger age. By solving this problem, we derive the optimal transmission time allocation scheme. Numerical results show that our proposals can better satisfy the diverse requirements of various risk-sensitive status update services, and demonstrate the great potential of improving information freshness compared to baseline approaches.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.13860 [pdf, other]

MAGIC: Map-Guided Few-Shot Audio-Visual Acoustics Modeling

Authors: Diwei Huang, Kunyang Lin, Peihao Chen, Qing Du, Mingkui Tan

Abstract: Few-shot audio-visual acoustics modeling seeks to synthesize the room impulse response (RIR) in arbitrary locations with few-shot observations. To sufficiently exploit the provided few-shot data for accurate acoustic modeling, we present a *map-guided* framework by constructing acoustic-related visual semantic feature maps of the scenes. Visual features preserve semantic details related to sound and maps provide explicit structural regularities of sound propagation, which are valuable for modeling environment acoustics. We thus extract pixel-wise semantic features derived from observations and project them into a top-down map, namely the **observation semantic map**. This map contains the relative positional information among points and the semantic feature information associated with each point. Yet, limited information extracted by few-shot observations on the map is not sufficient for understanding and modeling the whole scene. We address the challenge by generating a **scene semantic map** via diffusing features and anticipating the observation semantic map. The scene semantic map then interacts with echo encoding by a transformer-based encoder-decoder to predict RIR for arbitrary speaker-listener query pairs. Extensive experiments on the Matterport3D and Replica datasets verify the efficacy of our framework.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 17 pages, 12 pages for main paper, 5 pages for supplementary

arXiv:2405.04434 [pdf, other]

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.

Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

arXiv:2404.13378 [pdf, other]

doi 10.1109/TIV.2024.3352180

Social Force Embedded Mixed Graph Convolutional Network for Multi-class Trajectory Prediction

Authors: Quancheng Du, Xiao Wang, Shouguo Yin, Lingxi Li, Huansheng Ning

Abstract: Accurate prediction of agent motion trajectories is crucial for autonomous driving, contributing to the reduction of collision risks in human-vehicle interactions and ensuring ample response time for other traffic participants. Current research predominantly focuses on traditional deep learning methods, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs). These methods leverage relative distances to forecast the motion trajectories of a single class of agents. However, in complex traffic scenarios, the motion patterns of various types of traffic participants exhibit inherent randomness and uncertainty. Relying solely on relative distances may not adequately capture the nuanced interaction patterns between different classes of road users. In this paper, we propose a novel multi-class trajectory prediction method named the social force embedded mixed graph convolutional network (SFEM-GCN). SFEM-GCN comprises three graph topologies: the semantic graph (SG), position graph (PG), and velocity graph (VG). These graphs encode various social force relationships among different classes of agents in complex scenes. Specifically, SG utilizes one-hot encoding of agent-class information to guide the construction of graph adjacency matrices based on semantic information. PG and VG create adjacency matrices to capture motion interaction relationships between different classes of agents. These graph structures are then integrated into a mixed graph, where learning is conducted using a spatiotemporal graph convolutional neural network (ST-GCNN). To further enhance prediction performance, we adopt temporal convolutional networks (TCNs) to generate the predicted trajectory with fewer parameters. Experimental results on publicly available datasets demonstrate that SFEM-GCN surpasses state-of-the-art methods in terms of accuracy and robustness.

Submitted 20 April, 2024; originally announced April 2024.

Comments: 11 pages, 3 figures, published in IEEE Transactions on Intelligent Vehicles

arXiv:2404.11946 [pdf, other]

doi 10.1109/TIV.2023.3338483

S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles

Authors: Xiao Wang, Ke Tang, Xingyuan Dai, Jintao Xu, Quancheng Du, Rui Ai, Yuxiao Wang, Weihao Gu

Abstract: On public roads, autonomous vehicles (AVs) face the challenge of frequent interactions with human-driven vehicles (HDVs), which exhibit uncertain driving behavior due to varying social characteristics among humans. To effectively assess the risks prevailing in the vicinity of AVs in social interactive traffic scenarios and achieve safe autonomous driving, this article proposes a social-suitable and safety-sensitive trajectory planning (S4TP) framework. Specifically, S4TP integrates the Social-Aware Trajectory Prediction (SATP) and Social-Aware Driving Risk Field (SADRF) modules. SATP utilizes Transformers to effectively encode the driving scene and incorporates an AV's planned trajectory during the prediction decoding process. SADRF assesses the expected surrounding risk degrees during AV-HDV interactions, each with different social characteristics, visualized as two-dimensional heat maps centered on the AV. SADRF models the driving intentions of the surrounding HDVs and predicts trajectories based on the representation of vehicular interactions. S4TP employs an optimization-based approach for motion planning, utilizing the predicted HDVs' trajectories as input. With the integration of SADRF, S4TP executes real-time online optimization of the planned trajectory of the AV within low-risk regions, thus improving the safety and the interpretability of the planned trajectory. We have conducted comprehensive tests of the proposed method using the SMARTS simulator. Experimental results in complex social scenarios, such as unprotected left turn intersections, merging, cruising, and overtaking, validate the superiority of our proposed S4TP in terms of safety and rationality. S4TP achieves a pass rate of 100% across all scenarios, surpassing the current state-of-the-art methods Fanta (98.25%) and Predictive-Decision (94.75%).

Submitted 18 April, 2024; originally announced April 2024.

Comments: 12 pages, 4 figures, published in IEEE Transactions on Intelligent Vehicles

arXiv:2404.11326 [pdf, other]

Single-temporal Supervised Remote Change Detection for Domain Generalization

Authors: Qiangang Du, Jinlong Peng, Xu Chen, Qingdong He, Liren He, Qiang Nie, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Abstract: Change detection is widely applied in remote sensing image analysis. Existing methods require training models separately for each dataset, which leads to poor domain generalization. Moreover, these methods rely heavily on large amounts of high-quality pair-labelled data for training, which is expensive and impractical. In this paper, we propose a multimodal contrastive learning method (ChangeCLIP) based on visual-language pre-training for change detection domain generalization. Additionally, we propose a dynamic context optimization for prompt learning. Meanwhile, to address the data dependency issue of existing methods, we introduce a single-temporal and controllable AI-generated training strategy (SAIN). This allows us to train the model using a large number of single-temporal images without image pairs in the real world, achieving excellent generalization. Extensive experiments on a series of real change detection datasets validate the superiority and strong generalization of ChangeCLIP, outperforming state-of-the-art change detection methods. Code will be available.

Submitted 23 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.11318 [pdf, other]

Leveraging Fine-Grained Information and Noise Decoupling for Remote Sensing Change Detection

Authors: Qiangang Du, Jinlong Peng, Changan Wang, Xu Chen, Qingdong He, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Abstract: Change detection aims to identify remote sensing object changes by analyzing data between bitemporal image pairs. Due to the large temporal and spatial span of data collection in change detection image pairs, there is often a significant amount of task-specific and task-agnostic noise. Previous efforts have focused excessively on denoising, which causes a great loss of fine-grained information. In this paper, we revisit the importance of fine-grained features in change detection and propose a series of operations for fine-grained information compensation and noise decoupling (FINO). First, the context is utilized to compensate for the fine-grained information in the feature space. Next, a shape-aware and a brightness-aware module are designed to improve the capacity for representation learning. The shape-aware module guides the backbone network toward more precise shape estimation, helping it extract object shape features. The brightness-aware module learns an overall brightness estimation to improve the model's robustness to task-agnostic noise. Finally, a task-specific noise decoupling structure is designed as a way to improve the model's ability to separate noise interference from feature similarity. With these training schemes, our proposed method achieves new state-of-the-art (SOTA) results in multiple change detection benchmarks. The code will be made available.

Submitted 21 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.12582 [pdf, other]

AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework

Authors: Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin

Abstract: The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering. Currently, machine learning and deep learning algorithms (ML&DL) have been widely applied for stock trend predictions, leading to significant progress. However, these methods fail to provide reasons for predictions, lacking interpretability and reasoning processes. Also, they cannot integrate textual information such as financial news or reports. Meanwhile, large language models (LLMs) have remarkable textual understanding and generation ability. But due to the scarcity of financial training datasets and limited integration with real-time knowledge, LLMs still suffer from hallucinations and are unable to keep up with the latest information. To tackle these challenges, we first release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data. It has a positive impact on training LLMs for completing financial analysis. We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task, which integrates retrieval-augmented generation (RAG) techniques. Extensive experiments are conducted to demonstrate the effectiveness of our framework on financial analysis.

Submitted 19 March, 2024; originally announced March 2024.

Comments: COLING 2024. The first three authors contributed equally. Project website: https://github.com/AlphaFin-proj/AlphaFin

arXiv:2403.11561 [pdf, other]

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

Authors: Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Abstract: In the field of multi-class anomaly detection, reconstruction-based methods derived from single-class anomaly detection face the well-known challenge of "learning shortcuts", wherein the model fails to learn the patterns of normal samples as it should, opting instead for shortcuts such as identity mapping or artificial noise elimination. Consequently, the model becomes unable to reconstruct genuine anomalies as normal instances, resulting in a failure of anomaly detection. To counter this issue, we present a novel unified feature reconstruction-based anomaly detection framework termed RLR (Reconstruct features from a Learnable Reference representation). Unlike previous methods, RLR utilizes learnable reference representations to compel the model to learn normal feature patterns explicitly, thereby preventing the model from succumbing to the "learning shortcuts" issue. Additionally, RLR incorporates locality constraints into the learnable reference to facilitate more effective normal pattern capture and utilizes a masked learnable key attention mechanism to enhance robustness. Evaluation of RLR on the 15-category MVTec-AD dataset and the 12-category VisA dataset shows superior performance compared to state-of-the-art methods under the unified setting. The code of RLR will be publicly available.

Submitted 16 July, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: Accepted by ECCV 2024

arXiv:2403.10067 [pdf, other]

Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

Authors: Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

Abstract: Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data. However, simultaneously modeling global and local features is rarely explored to enhance HSI denoising. In this letter, we propose a hybrid convolution and attention network (HCANet), which leverages the strengths of both convolutional neural networks (CNNs) and Transformers. To enhance the modeling of both global and local features, we have devised a convolution and attention fusion module aimed at capturing long-range dependencies and neighborhood spectral correlations. Furthermore, to improve multi-scale information aggregation, we design a multi-scale feed-forward network to enhance denoising performance by extracting features at different scales. Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet. The proposed model is effective in removing various types of complex noise. Our codes are available at https://github.com/summitgao/HCANet.

Submitted 15 March, 2024; originally announced March 2024.

Comments: IEEE GRSL 2024

arXiv:2403.07712 [pdf, ps, other]

Nonlocal Stokes equation with relaxation on the divergence free equation

Authors: Yajie Zhang, Qiang Du, Zuoqiang Shi

Abstract: In this paper, we consider a new nonlocal approximation to the linear Stokes system with periodic boundary conditions in two- and three-dimensional spaces. A relaxation term is added to the nonlocal divergence-free equation, which is reminiscent of the relaxation of the local Stokes equation with small artificial compressibility. Our analysis shows that the well-posedness of the nonlocal system can be established under some mild assumptions on the kernel of nonlocal interactions. Furthermore, the new nonlocal system converges to the conventional, local Stokes system in second order as the horizon parameter of the nonlocal interaction goes to zero. The study provides more theoretical understanding of some numerical methods, such as smoothed particle hydrodynamics, for simulating incompressible viscous flows.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.05852 [pdf, other]

SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking

Authors: Hanzheng Wang, Wei Li, Xiang-Gen Xia, Qian Du, Jing Tian

Abstract: Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously, making it highly suitable for handling challenges such as background clutter and visual similarity in object tracking. However, existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction, resulting in limited exploration of spectral information and difficulties in achieving complementary representations of object features. In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking. Firstly, to address the issue of insufficient spectral feature extraction in existing networks, a spatial-spectral feature backbone ($S^2$FB) is designed. With the spatial and spectral extraction branches, a joint representation of texture and spectrum is obtained. Secondly, a spectral attention fusion module (SAFM) is presented to capture the intra- and inter-modality correlation to obtain the fused features from the HS and RGB modalities. It can incorporate the visual information into the HS spectral context to form a robust representation. Thirdly, to ensure a more accurate response of the tracker to the object position, a spectral angle awareness module (SAAM) investigates the region-level spectral similarity between the template and search images during the prediction stage. Furthermore, we develop a novel spectral angle awareness loss (SAAL) to offer guidance for the SAAM based on similar regions. Finally, to obtain robust tracking results, a weighted prediction method is considered to combine the HS and RGB predicted motions of objects to leverage the strengths of each modality. Extensive experiments on the HOTC dataset demonstrate the effectiveness of the proposed SSF-Net, compared with state-of-the-art trackers.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.05031 [pdf, other]

doi 10.1145/3613904.3642187

LightSword: A Customized Virtual Reality Exergame for Long-Term Cognitive Inhibition Training in Older Adults

Authors: Qiuxin Du, Zhen Song, Haiyan Jiang, Xiaoying Wei, Dongdong Weng, Mingming Fan

Abstract: The decline of cognitive inhibition significantly impacts older adults' quality of life and well-being, making it a vital public health problem in today's aging society. Previous research has demonstrated that virtual reality (VR) exergames have great potential to enhance cognitive inhibition among older adults. However, existing commercial VR exergames were unsuitable for older adults' long-term cognitive training due to the inappropriate cognitive activation paradigm, unnecessary complexity, and unbefitting difficulty levels. To bridge these gaps, we developed a customized VR cognitive training exergame (LightSword) based on Dual-task and Stroop paradigms for long-term cognitive inhibition training among healthy older adults. Subsequently, we conducted an eight-month longitudinal user study with 12 older adults aged 60 years and above to demonstrate the effectiveness of LightSword in improving cognitive inhibition. After the training, the cognitive inhibition abilities of older adults were significantly enhanced, with benefits persisting for 6 months. This result indicated that LightSword has both short-term and long-term effects in enhancing cognitive inhibition. Furthermore, qualitative feedback revealed that older adults exhibited a positive attitude toward long-term training with LightSword, which enhanced their motivation and compliance.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 23 pages

Journal ref: Proceedings of the CHI Conference on Human Factors in Computing Systems 2024 (CHI '24)

arXiv:2402.18004 [pdf, ps, other]

doi 10.1103/PhysRevD.110.034011

Collisional energy loss of a heavy quark in a semiquark-gluon plasma

Authors: Qianqian Du, Mudong Du, Yun Guo

Abstract: By utilizing a background field effective theory, we compute the collisional energy loss of a heavy quark moving through a semiquark-gluon plasma characterized by nontrivial holonomy for Polyakov loops. We consider the elastic scatterings between the incident heavy quark and the thermal partons with both hard and soft momentum transfers. As compared to the energy loss obtained from the perturbation theory, the hard processes get modified through the thermal distribution functions that depend on the background field, while the proper treatment of the soft processes strongly relies on the use of the hard-thermal-loop resummed gluon propagator derived from the background field effective theory. Our results show that the heavy quark energy loss is significantly suppressed in the semiquark-gluon plasma due to a background field that is self-consistently generated in the effective theory. On the other hand, the suppression has a strong dependence on the temperature of the plasma and becomes negligible above $2$-$3$ times the critical temperature. For a realistic coupling constant, ignoring a relatively weak dependence on the heavy quark velocity, the suppression of the collisional energy loss can be approximated by an overall factor determined solely by the background field. This simple conclusion is expected to be useful for phenomenological applications in heavy flavor physics.

Submitted 14 August, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: final version, published in PRD

arXiv:2402.07749 [pdf, ps, other]

Asymptotically compatible schemes for nonlinear variational models via Gamma-convergence and applications to nonlocal problems

Authors: Qiang Du, James M. Scott, Xiaochuan Tian

Abstract: We present a study on asymptotically compatible Galerkin discretizations for a class of parametrized nonlinear variational problems. The abstract analytical framework is based on variational convergence, or Gamma-convergence. We demonstrate the broad applicability of the theoretical framework by developing asymptotically compatible finite element discretizations of some representative nonlinear nonlocal variational problems on a bounded domain. These include nonlocal nonlinear problems with classically-defined, local boundary constraints through heterogeneous localization at the boundary, as well as nonlocal problems posed on parameter-dependent domains.

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.05123 [pdf, ps, other]

A Survey on Data Selection for LLM Instruction Tuning

Authors: Jiahao Wang, Bolin Zhang, Qianlong Du, Jiajun Zhang, Dianhui Chu

Abstract: Instruction tuning is a vital step in training large language models (LLMs), so how to enhance the effect of instruction tuning has received increased attention. Existing works indicate that the quality of the dataset is more crucial than the quantity during instruction tuning of LLMs. Therefore, many recent studies focus on exploring methods of selecting high-quality subsets from instruction datasets, aiming to reduce training costs and enhance the instruction-following capabilities of LLMs. This paper presents a comprehensive survey on data selection for LLM instruction tuning. Firstly, we introduce the widely used instruction datasets. Then, we propose a new taxonomy of the data selection methods and provide a detailed introduction of recent advances, and the evaluation strategies and results of data selection methods are also elaborated in detail. Finally, we emphasize the open challenges and present new frontiers of this task.

Submitted 4 February, 2024; originally announced February 2024.

arXiv:2401.05679 [pdf, other]

Ohta-Kawasaki energy for amphiphiles: asymptotics and phase-field simulations

Authors: Qiang Du, James M. Scott, Zirui Xu

Abstract: We study the minimizers of a degenerate case of the Ohta-Kawasaki energy, defined as the sum of the perimeter and a Coulombic nonlocal term. We start by investigating radially symmetric candidates which give us insights into the asymptotic behaviors of energy minimizers in the large mass limit. In order to numerically study the problems that are analytically challenging, we propose a phase-field reformulation which is shown to Gamma-converge to the original sharp interface model. Our phase-field simulations and asymptotic results suggest that the energy minimizers exhibit behaviors similar to the self-assembly of amphiphiles, including the formation of lipid bilayer membranes.

Submitted 23 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Comments: 51 pages; 14 figures; submitted for publication; minor typos corrected

MSC Class: 49Q20 (Primary) 35Q92; 35B36 (Secondary)

arXiv:2401.02954 [pdf, other]

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Authors: DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, Jianzhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, et al. (63 additional authors not shown)

Abstract: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

Submitted 5 January, 2024; originally announced January 2024.

arXiv:2312.08926

Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent

Authors: Haoran Liao, Qinyi Du, Shaohua Hu, Hao He, Yanyan Xu, Jidong Tian, Yaohui Jin

Abstract: Large language models (LLMs) face challenges in solving complex mathematical problems that require comprehensive capacities to parse the statements, associate domain knowledge, perform compound logical reasoning, and integrate the intermediate rationales. Tackling all these problems at once could be arduous for LLMs, thus leading to confusion in generation. In this work, we explore the potential of enhancing LLMs with agents by meticulous decomposition and modeling of the mathematical reasoning process. Specifically, we propose a formal description of mathematical solving and extend LLMs with an agent-based zero-shot framework named $\bf{P}$lanner-$\bf{R}$easoner-$\bf{E}$xecutor-$\bf{R}$eflector (PRER). We further provide and implement two MathAgents that define the logical forms and inherent relations via a pool of actions in different grains and orientations: MathAgent-M adapts its actions to LLMs, while MathAgent-H aligns with humankind. Experiments on miniF2F and MATH have demonstrated the effectiveness of PRER and the proposed MathAgents, achieving an increase of $12.3\%$ ($53.9\%\xrightarrow{}66.2\%$) on miniF2F, $9.2\%$ ($49.8\%\xrightarrow{}59.0\%$) on MATH, and $13.2\%$ ($23.2\%\xrightarrow{}35.4\%$) for level-5 problems of MATH against GPT-4. Further analytical results provide more insightful perspectives on exploiting the behaviors of LLMs as agents.

Submitted 16 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: There are unfair comparisons on miniF2F. This will be fixed in the future

arXiv:2311.15653 [pdf, other]

MoDS: Model-oriented Data Selection for Instruction Tuning

Authors: Qianlong Du, Chengqing Zong, Jiajun Zhang

Abstract: Instruction tuning has become the de facto method to equip large language models (LLMs) with the ability of following user instructions. Usually, hundreds of thousands or millions of instruction-following pairs are employed to fine-tune the foundation LLMs. Recently, some studies show that a small number of high-quality instruction data is enough. However, how to select appropriate instruction data for a given LLM is still an open problem. To address this problem, in this paper we present a model-oriented data selection (MoDS) approach, which selects instruction data based on new criteria considering three aspects: quality, coverage and necessity. First, our approach utilizes a quality evaluation model to filter out the high-quality subset from the original instruction dataset, and then designs an algorithm to further select from the high-quality subset a seed instruction dataset with good coverage. The seed dataset is applied to fine-tune the foundation LLM to obtain an initial instruction-following LLM. Finally, we develop a necessity evaluation model to find the instruction data on which the initial instruction-following LLM performs badly and consider them necessary instructions to further improve the LLMs. In this way, we can get a small high-quality, broad-coverage and high-necessity subset from the original instruction datasets. Experimental results show that the model fine-tuned with 4,000 instruction pairs selected by our approach could perform better than the model fine-tuned with the full original dataset which includes 214k instruction data.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.15173 [pdf, other]

Stretched Non-negative Matrix Factorization

Authors: Ran Gu, Yevgeny Rakita, Ling Lan, Zach Thatcher, Gabrielle E. Kamm, Daniel O'Nolan, Brennan Mcbride, Allison Wustrow, James R. Neilson, Karena W. Chapman, Qiang Du, Simon J. L. Billinge

Abstract: An algorithm is described and tested that carries out a non-negative matrix factorization (NMF) ignoring any stretching of the signal along the axis of the independent variable. This extended NMF model is called StretchedNMF. Variability in a set of signals due to this stretching is then ignored in the decomposition. This can be used, for example, to study sets of powder diffraction data collected at different temperatures where the materials are undergoing thermal expansion. It gives a more meaningful decomposition in this case where the component signals resemble signals from chemical components in the sample. The StretchedNMF model introduces a new variable, the stretching factor, to describe any expansion of the signal. To solve StretchedNMF, we discretize it and employ Block Coordinate Descent framework algorithms. The initial experimental results indicate that the StretchedNMF model outperforms the conventional NMF for sets of data with such an expansion. A further enhancement to StretchedNMF for the case of powder diffraction data from crystalline materials called Sparse-StretchedNMF, which makes use of the sparsity of the powder diffraction signals, allows correct extractions even for very small stretches where StretchedNMF struggles. As well as demonstrating the model performance on simulated PXRD patterns and atomic pair distribution functions (PDFs), it also proved successful when applied to real data taken from an in situ chemical reaction experiment.

Submitted 25 November, 2023; originally announced November 2023.

Comments: 39 pages, 16 figures

arXiv:2311.04442 [pdf, other]

SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification

Authors: Junyan Lin, Feng Gao, Xiaocheng Shi, Junyu Dong, Qian Du

Abstract: Masked image modeling (MIM) is a highly popular and effective self-supervised learning method for image understanding. Existing MIM-based methods mostly focus on spatial feature modeling, neglecting spectral feature modeling. Meanwhile, existing MIM-based methods use Transformers for feature extraction, so some local or high-frequency information may get lost. To this end, we propose a spatial-spectral masked auto-encoder (SS-MAE) for HSI and LiDAR/SAR data joint classification. Specifically, SS-MAE consists of a spatial-wise branch and a spectral-wise branch. The spatial-wise branch masks random patches and reconstructs missing pixels, while the spectral-wise branch masks random spectral channels and reconstructs missing channels. Our SS-MAE fully exploits the spatial and spectral representations of the input data. Furthermore, to complement local features in the training stage, we add two lightweight CNNs for feature extraction. Both global and local features are taken into account for feature modeling. To demonstrate the effectiveness of the proposed SS-MAE, we conduct extensive experiments on three publicly available datasets. Extensive experiments on three multi-source datasets verify the superiority of our SS-MAE compared with several state-of-the-art baselines. The source codes are available at https://github.com/summitgao/SS-MAE.

Submitted 7 November, 2023; originally announced November 2023.

Comments: IEEE TGRS 2023
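
The two masking branches described in the SS-MAE abstract above can be illustrated with a minimal sketch. It only shows how random spatial patches and random spectral channels might be chosen for masking; the function names, the patch size, and the 0.75 mask ratio are assumptions made for illustration and are not taken from the paper or its repository.

```python
import random

def mask_spatial_patches(height, width, patch, ratio, rng):
    """Pick the (row, col) patch indices to hide (hypothetical helper)."""
    patches = [(r, c) for r in range(height // patch) for c in range(width // patch)]
    return set(rng.sample(patches, int(len(patches) * ratio)))

def mask_spectral_channels(num_channels, ratio, rng):
    """Pick the spectral channel indices to hide (hypothetical helper)."""
    return set(rng.sample(range(num_channels), int(num_channels * ratio)))

# Assumed toy sizes: an 8x8 image cut into 2x2 patches, with 32 spectral channels.
rng = random.Random(0)
hidden_patches = mask_spatial_patches(height=8, width=8, patch=2, ratio=0.75, rng=rng)
hidden_channels = mask_spectral_channels(num_channels=32, ratio=0.75, rng=rng)
print(len(hidden_patches), len(hidden_channels))  # prints "12 24"
```

In a masked auto-encoder, the hidden patches or channels would be withheld from the encoder and serve as reconstruction targets for the decoder; that part is omitted here.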

arXiv:2311.01149 [pdf, other]

ChineseWebText: Large-scale High-quality Chinese Web Text Extracted with Effective Evaluation Model

Authors: Jianghao Chen, Pu Jian, Tengxiao Xi, Dongyi Yi, Qianlong Du, Chenglin Ding, Guibo Zhu, Chengqing Zong, Jinqiao Wang, Jiajun Zhang

Abstract: During the development of large language models (LLMs), the scale and quality of the pre-training data play a crucial role in shaping LLMs' capabilities. To accelerate the research of LLMs, several large-scale datasets, such as C4 [1], Pile [2], RefinedWeb [3] and WanJuan [4], have been released to the public. However, most of the released corpora focus mainly on English, and there is still a lack of a complete tool-chain for extracting clean texts from web data. Furthermore, fine-grained information of the corpus, e.g. the quality of each text, is missing. To address these challenges, we propose in this paper a new complete tool-chain EvalWeb to extract Chinese clean texts from noisy web data. First, similar to previous work, manually crafted rules are employed to discard explicit noisy texts from the raw crawled web contents. Second, a well-designed evaluation model is leveraged to assess the remaining relatively clean data, and each text is assigned a specific quality score. Finally, we can easily utilize an appropriate threshold to select the high-quality pre-training data for Chinese. Using our proposed approach, we release the largest and latest large-scale high-quality Chinese web text ChineseWebText, which consists of 1.42 TB and each text is associated with a quality score, facilitating the LLM researchers to choose the data according to the desired quality thresholds. We also release a much cleaner subset of 600 GB Chinese data with the quality exceeding 90%.

Submitted 10 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.16051 [pdf]

Sketched Nanoscale KTaO3-Based Superconducting Quantum Interference Device

Authors: Muqing Yu, Nicholas Hougland, Qianheng Du, Junyi Yang, Sayanwita Biswas, Ranjani Ramachandran, Dengyu Yang, Anand Bhattacharya, David Pekker, Patrick Irvin, Jeremy Levy

Abstract: The discovery of two-dimensional superconductivity in LaAlO3/KTaO3 (111) and (110) interfaces has raised significant interest in this system. In this manuscript we report the first successful fabrication of a superconducting quantum interference device (DC-SQUID) in the KTO system. The key device elements, superconducting weak links, are created by conductive atomic force microscope (c-AFM) lithography which can reversibly control the conductivity at the LAO/KTO(110) interface with nanoscale resolution. The periodic modulation of the SQUID critical current, Ic(B), with magnetic field corresponds well with our theoretical modeling, which reveals a large kinetic inductance of the superconducting two-dimensional electron gas in KTO. The kinetic inductance of the SQUID is tunable by electrical gating from the back, due to the large dielectric constant of KTO. The demonstration of weak links and SQUIDs in KTO broadens the scope for exploring the underlying physics of KTO superconductivity, including the role of spin-orbit-coupling, pairing symmetry, and inhomogeneity. It also promotes KTO as a versatile platform for a growing family of quantum devices, which could be applicable in the realm of quantum computing and information.

Submitted 5 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

Comments: 40 pages, 16 figures

arXiv:2310.15509 [pdf, other]

Dual frequency master oscillator generation and distribution for ALS and ALS-U

Authors: Shreeharshini Dharanesh Murthy, Angel Jurado, Michael Betz, Qiang Du, Benjamin Flugstad

Abstract: The ongoing work to upgrade ALS to ALS-U demands strict RF requirements such as low jitter and low spurs frequency reference to meet its accelerator and science goals. A low phase noise dual frequency Master Oscillator (MO), where the two frequencies are related by a fractional ratio of 608/609 and flexible divide by four frequency outputs has been consolidated into a single chassis. An optical fiber clock distribution system has been selected over the old coax system used in ALS to distribute these signals to various clients across the facility, providing high electrical isolation between outputs and therefore lower phase errors. A Xilinx FPGA ties the MO chassis together by providing an RS-485 interface to monitor and control the system. The new system aims to deliver phase-continuous frequencies with a phase noise (integrated RMS jitter) from 1 Hz to 1 MHz of less than 200 femtoseconds per output. This paper will discuss the design, implementation, performance and installation of the new MO generation and distribution system.

Submitted 24 October, 2023; originally announced October 2023.

Comments: Poster presented at LLRF Workshop 2023 (LLRF2023, arXiv: 2310.03199)

Report number: LLRF2023/15

arXiv:2310.06144 [pdf]

Ions-induced Epitaxial Growth of Perovskite Nanocomposites for Highly Efficient Light-Emitting Diodes with EQE Exceeding 30%

Authors: Zhaohui Xing, Qing Du, Peiyuan Pang, Guangrong Jin, Tanghao Liu, Yang Shen, Dengliang Zhang, Bufan Yu, Yue Liang, Jianxin Tang, Lei Wang, Guichuang Xing, Jiangshan Chen, Dongge Ma

Abstract: Metal halide perovskites, a class of cost-effective semiconductor materials, are of great interest for modern and upcoming display technologies that prioritize the light-emitting diodes (LEDs) with high efficiency and excellent color purity. The prevailing approach to achieving efficient luminescence from perovskites is enhancing exciton binding effect and confining carriers by reducing their dimensionality or grain size. However, splitting perovskite lattice into smaller ones generates abundant boundaries in solid films and results in more surface trap states, needing exact passivation to suppress trap-assisted nonradiative losses. Here, an ions-induced heteroepitaxial growth method is employed to assemble perovskite lattices with different structures into large-sized grains to produce lattice-anchored nanocomposites for efficient LEDs with high color purity. This approach enables the nanocomposite thin films, composed of three-dimensional (3D) CsPbBr3 and its variant of zero-dimensional (0D) Cs4PbBr6, to feature significantly low trap-assisted nonradiative recombination, enhanced light out-coupling with a corrugated surface, and well-balanced charge carrier transport. Based on the resultant 3D/0D perovskite nanocomposites, we demonstrate the perovskite LEDs achieving a remarkable external quantum efficiency of 31.0% at the emission peak of 521 nm with a narrow full width at half-maximum of only 18 nm. This research introduces a novel approach to the development of well-assembled nanocomposites for perovskite LEDs, demonstrating high efficiency comparable to that of state-of-the-art organic LEDs.

Submitted 2 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.12010 [pdf, other]

Convolution and Attention Mixer for Synthetic Aperture Radar Image Change Detection

Authors: Haopeng Zhang, Zijing Lin, Feng Gao, Junyu Dong, Qian Du, Heng-Chao Li

Abstract: Synthetic aperture radar (SAR) image change detection is a critical task and has received increasing attention in the remote sensing community. However, existing SAR change detection methods are mainly based on convolutional neural networks (CNNs), with limited consideration of global attention mechanism. In this letter, we explore Transformer-like architecture for SAR change detection to incorporate global attention. To this end, we propose a convolution and attention mixer (CAMixer). First, to compensate for the inductive bias of the Transformer, we combine self-attention with shift convolution in a parallel way. The parallel design effectively captures the global semantic information via the self-attention and performs local feature extraction through shift convolution simultaneously. Second, we adopt a gating mechanism in the feed-forward network to enhance the non-linear feature transformation. The gating mechanism is formulated as the element-wise multiplication of two parallel linear layers. Important features can be highlighted, leading to high-quality representations against speckle noise. Extensive experiments conducted on three SAR datasets verify the superior performance of the proposed CAMixer. The source codes will be publicly available at https://github.com/summitgao/CAMixer.

Submitted 21 September, 2023; originally announced September 2023.

Comments: Accepted by IEEE GRSL
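
The gating mechanism named in the CAMixer abstract above, an element-wise multiplication of two parallel linear layers, can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the function names, the toy dimensions, and the random weights are hypothetical, and the abstract does not say whether an activation or an output projection follows, so none is included.

```python
import random

def linear(x, weights, bias):
    """Apply y = W x + b to a single feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def gated_feed_forward(x, w_value, b_value, w_gate, b_gate):
    """Element-wise product of two parallel linear projections of x (assumed form)."""
    value = linear(x, w_value, b_value)
    gate = linear(x, w_gate, b_gate)
    return [v * g for v, g in zip(value, gate)]

def random_matrix(rows, cols, rng):
    """Toy weight initialisation, for illustration only."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
x = [0.5, -1.0, 2.0]                      # one 3-dimensional input feature
w_value = random_matrix(4, 3, rng)        # first parallel linear layer
w_gate = random_matrix(4, 3, rng)         # second parallel linear layer
b_value, b_gate = [0.0] * 4, [0.0] * 4
y = gated_feed_forward(x, w_value, b_value, w_gate, b_gate)
print(len(y))  # prints 4
```

Because the two branches are multiplied, a feature passes through strongly only when both projections respond to it, which is one way to read the abstract's claim that important features are highlighted.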

arXiv:2309.10352 [pdf, ps, other]

$Γ$-convergence of Nonlocal Dirichlet Energies With Penalty Formulations of Dirichlet Boundary Data

Authors: Weiye Gan, Qiang Du, Zuoqiang Shi

Abstract: We study nonlocal Dirichlet energies associated with a class of nonlocal diffusion models on a bounded domain subject to the conventional local Dirichlet boundary condition. The Dirichlet boundary condition is imposed through a specifically designed penalty formulation. We prove that the nonlocal Dirichlet energies with the penalty terms converge to local Dirichlet energies with Dirichlet boundary conditions in the sense of $\varGamma$-convergence.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2308.13906 [pdf, other]

A Two-Dimensional Deep Network for RF-based Drone Detection and Identification Towards Secure Coverage Extension

Authors: Zixiao Zhao, Qinghe Du, Xiang Yao, Lei Lu, Shijiao Zhang

Abstract: As drones become increasingly prevalent in human life, they also raise security concerns such as unauthorized access and control, as well as collisions and interference with manned aircraft. Therefore, ensuring the ability to accurately detect and identify between different drones holds significant implications for coverage extension. Assisted by machine learning, radio frequency (RF) detection can recognize the type and flight mode of drones based on the sampled drone signals. In this paper, we first utilize Short-Time Fourier Transform (STFT) to extract two-dimensional features from the raw signals, which contain both time-domain and frequency-domain information. Then, we employ a Convolutional Neural Network (CNN) built with ResNet structure to achieve multi-class classifications. Our experimental results show that the proposed ResNet-STFT can achieve higher accuracy and faster convergence on the extended dataset. Additionally, it exhibits balanced performance compared to other baselines on the raw dataset.

Submitted 26 August, 2023; originally announced August 2023.

arXiv:2308.05180 [pdf, other]

Nonlocal problems with local boundary conditions II: Green's identities and regularity of solutions

Authors: James M. Scott, Qiang Du

Abstract: We study nonlocal integral equations on bounded domains with finite-range nonlocal interactions that are localized at the boundary. We establish a Green's identity for the nonlocal operator that recovers the classical boundary integral, which, along with the variational analysis established previously, leads to the well-posedness of these nonlocal problems with various types of classical local boundary conditions. We continue our analysis via boundary-localized convolutions, using them to analyze the Euler-Lagrange equations, which permits us to establish global regularity properties and classical Sobolev convergence to their classical local counterparts.

Submitted 9 August, 2023; originally announced August 2023.

MSC Class: 45K05; 35J20; 46E35

arXiv:2308.04386 [pdf, other]

Learning Evaluation Models from Large Language Models for Sequence Generation

Authors: Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu

Abstract: Large language models achieve state-of-the-art performance on sequence generation evaluation, but typically have a large number of parameters. This presents a computational challenge when applying their evaluation capability at scale. To overcome the challenge, in this paper, we propose ECT, an evaluation capability transfer method, to transfer the evaluation capability from LLMs to relatively lightweight language models. Based on the proposed ECT, we learn various evaluation models from ChatGPT, and employ them as reward models to improve sequence generation models via reinforcement learning and reranking approaches. Experimental results on machine translation, text style transfer, and summarization tasks demonstrate the effectiveness of our ECT. Notably, applying the learned evaluation models to sequence generation models results in better generated sequences as evaluated by commonly used metrics and ChatGPT.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.03220 [pdf]

A15 Phase Ta3Sb Thin Films: Direct Synthesis and Giant Spin-Orbit Effects

Authors: J. S. Jiang, Qianheng Du, Ulrich Welp, Ramakanta Chapai, Hanu Arava, Yuzi Liu, Yue Li, John Pearson, Anand Bhattacharya, Hyowon Park

Abstract: We use co-sputtering to directly synthesize thin films of the A15 phase intermetallic compound Ta3Sb, which has been predicted to have a giant spin Hall conductivity. We identify a large window of Ta:Sb flux ratio that stabilizes single-phase A15 Ta3Sb. Composition analyses of these films show a Ta:Sb atomic ratio of 4:1, which is consistent with the known Ta-Sb phase diagram. The spin Hall conductivity of thin film Ta3Sb is -3400+/-400 (hbar/2e) S/cm and the spin-orbit torque efficiency is -0.6+/-0.1 at 20 K, as determined from harmonic Hall measurements of Ta3Sb/permalloy bilayer structures. These giant values make Ta3Sb a promising material for efficient charge-to-spin conversion in spintronic applications. Large field-like spin-orbit effective fields that are independent of the ferromagnetic layer thickness have also been measured in the Ta3Sb/permalloy bilayers. We attribute the field-like spin-orbit effective field to the Rashba effect at the interface.

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2308.01969 [pdf]

Complete mode conversion for elastic waves reflected by elastic metamaterial slab with double hexapole resonances

Authors: Di Liu, Wenjie Yu, Qiujiao Du, Fengming Liu, Pai Peng

Abstract: In this study, we investigate the phenomenon of mode conversion in elastic bulk waves using coupled hexapole resonances. A metamaterial slab is proposed enabling the complete conversion between longitudinal and transverse modes. Each unit of the elastic metamaterial slab comprises a pair of scatterers, and their relative direction is oriented at an oblique angle. The interaction between the coupled hexapoles and the background results in oblique displacements, which are responsible for the mode conversion. Moreover, this conversion exhibits a broader frequency range compared to the quadrupole resonance. This innovative design significantly broadens the range of possibilities for developing mode-converting metamaterials.

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2307.14340 [pdf, ps, other]

doi 10.1140/epjb/s10051-024-00714-3

Non-Hermitian tearing by dissipation

Authors: Qian Du, Xin-Ran Ma, Su-Peng Kou

Abstract: In this paper, we study non-Hermitian systems under dissipation and derive the effective 2*2 Hamiltonian in k-space by reducing the N*N Hamiltonian in real space. It is discovered that the energy band shows an imaginary line gap. To describe these phenomena, we propose the theory of "non-Hermitian tearing", in which the tearability we define reveals a continuous phase transition at the exceptional point. The non-Hermitian tearing manifests in two forms -- separation of bulk state and decoupling of boundary state. In addition, we also explore the one-dimensional Su-Schrieffer-Heeger model and the Qi-Wu-Zhang model under dissipation using the theory of non-Hermitian tearing. Our results provide a theoretical approach for exploring the controlling of non-Hermitian physics on topological quantum states.

Submitted 24 June, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

Showing 1–50 of 262 results for author: Du, Q