Search | arXiv e-print repository

Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense

Authors: Qi Zhou, Zipeng Ye, Yubo Tang, Wenjian Luo, Yuhui Shi, Yan Jia

Abstract: Deep Neural Networks (DNNs) have been widely used in many areas such as autonomous driving and face recognition. However, DNN model is fragile to backdoor attack. A backdoor in the DNN model can be activated by a poisoned input with trigger and leads to wrong prediction, which causes serious security issues in applications. It is challenging for current defenses to eliminate the backdoor effectively with limited computing resources, especially when the sizes and numbers of the triggers are variable as in the physical world. We propose an efficient backdoor defense based on evolutionary trigger detection and lightweight model repair. In the first phase of our method, CAM-focus Evolutionary Trigger Filter (CETF) is proposed for trigger detection. CETF is an effective sample-preprocessing based method with the evolutionary algorithm, and our experimental results show that CETF not only distinguishes the images with triggers accurately from the clean images, but also can be widely used in practice for its simplicity and stability in different backdoor attack situations. In the second phase of our method, we leverage several lightweight unlearning methods with the trigger detected by CETF for model repair, which also constructively demonstrate the underlying correlation of the backdoor with Batch Normalization layers. Source code will be published after accepted.

Submitted 14 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

Comments: 13 pages, 9 figures

arXiv:2407.03203 [pdf, other]

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data. This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs. To address the challenges, this paper proposes **TheoremLlama**, an end-to-end framework to train a general-purpose LLM to become a Lean4 expert. This framework encompasses NL-FL aligned dataset generation methods, training approaches for the LLM formal theorem prover, and techniques for LLM Lean4 proof writing. Using the dataset generation method, we provide *Open Bootstrapped Theorems* (OBT), an NL-FL aligned and bootstrapped dataset. A key innovation in this framework is the NL-FL bootstrapping method, where NL proofs are integrated into Lean4 code for training datasets, leveraging the NL reasoning ability of LLMs for formal reasoning. The **TheoremLlama** framework achieves cumulative accuracies of 36.48% and 33.61% on MiniF2F-Valid and Test datasets respectively, surpassing the GPT-4 baseline of 22.95% and 25.41%. We have also open-sourced our model checkpoints and generated dataset, and will soon make all the code publicly available.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02888 [pdf, ps, other]

Joint Optimization of Resource Allocation and Data Selection for Fast and Cost-Efficient Federated Edge Learning

Authors: Yunjian Jia, Zhen Huang, Jiping Yan, Yulu Zhang, Kun Luo, Wanli Wen

Abstract: Deploying federated learning at the wireless edge introduces federated edge learning (FEEL). Given FEEL's limited communication resources and potential mislabeled data on devices, improper resource allocation or data selection can hurt convergence speed and increase training costs. Thus, to realize an efficient FEEL system, this paper emphasizes jointly optimizing resource allocation and data selection. Specifically, in this work, through rigorously modeling the training process and deriving an upper bound on FEEL's one-round convergence rate, we establish a problem of joint resource allocation and data selection, which, unfortunately, cannot be solved directly. Toward this end, we equivalently transform the original problem into a solvable form via a variable substitution and then break it into two subproblems, that is, the resource allocation problem and the data selection problem. The two subproblems are mixed-integer non-convex and integer non-convex problems, respectively, and achieving their optimal solutions is a challenging task. Based on the matching theory and applying the convex-concave procedure and gradient projection methods, we devise a low-complexity suboptimal algorithm for the two subproblems, respectively. Finally, the superiority of our proposed scheme of joint resource allocation and data selection is validated by numerical results.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02554 [pdf, other]

Localization of the free energy in supergravity

Authors: Pietro Benetti Genolini, Jerome P. Gauntlett, Yusheng Jiao, Alice Lüscher, James Sparks

Abstract: We derive a general formula for the gravitational free energy of Euclidean supersymmetric solutions to $D=4$, $\mathcal{N}=2$ gauged supergravity coupled to vector multiplet matter. This allows one to compute the free energy without solving any supergravity equations, just assuming the solutions exist. As well as recovering some known results in the literature with ease, we also present new supergravity results that match with holographically dual field theory computations.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 6 pages

arXiv:2407.01523 [pdf, other]

MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

Authors: Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

Abstract: Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. Distinct from previous datasets, it is constructed upon 130 lengthy PDF-formatted documents with an average of 49.4 pages and 20,971 textual tokens. Towards comprehensive evaluation, answers to these questions rely on pieces of evidence from (1) different sources (text, image, chart, table, and layout structure) and (2) various locations (i.e. page number). Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages. 22.8% of the questions are designed to be unanswerable for detecting potential hallucinations. Experiments on 14 LVLMs demonstrate that long-context DU greatly challenges current models. Notably, the best-performing model, GPT-4o, achieves an F1 score of only 42.7%, while the second-best, GPT-4V, scores 31.4%. Furthermore, 12 LVLMs (all except GPT-4o and GPT-4V) even present worse performance than their LLM counterparts which are fed with lossy-parsed OCR documents. These results validate the necessity of future research toward more capable long-context LVLMs. Project Page: https://mayubo2333.github.io/MMLongBench-Doc

Submitted 10 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00412 [pdf, other]

C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation

Authors: Yukuan Jia, Yuxuan Sun, Ruiqing Mao, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

Abstract: Collaborative Perception (CP) has been a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoV) via vehicle-to-everything (V2X) network. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, it is challenging and costly to obtain the up-to-date perception topology, i.e., whether a combination of CoVs can jointly detect an object. In this paper, we propose a combinatorial mobility-aware sensor scheduling (C-MASS) framework for CP with minimal communication overhead. Specifically, detections are replayed with sensor data from individual CoVs and pairs of CoVs to maintain an empirical perception topology up to the second order, which approximately represents the complete perception topology. A hybrid greedy algorithm is then proposed to solve a variant of the budgeted maximum coverage problem with a worst-case performance guarantee. The C-MASS scheduling algorithm adapts the greedy algorithm by incorporating the topological uncertainty and the unexplored time of CoVs to balance exploration and exploitation, addressing the mobility challenge. Extensive numerical experiments demonstrate the near-optimality of the proposed C-MASS framework in both edge-assisted and distributed CP configurations. The weighted recall improvements over object-level CP are 5.8% and 4.2%, respectively. Compared to distance-based and area-based greedy heuristics, the gaps to the offline optimal solutions are reduced by up to 75% and 71%, respectively.

Submitted 29 June, 2024; originally announced July 2024.

Comments: 14 pages, 10 figures

arXiv:2406.19994 [pdf, ps, other]

Next-to-leading-order QCD corrections to nucleon Dirac form factors

Authors: Wen Chen, Feng Feng, Yu Jia

Abstract: The leading-order perturbative QCD (pQCD) predictions to nucleon electromagnetic form factors were first made in late 70s. Nearly half a century later, we accomplish the calculation of the next-to-leading-order (NLO) QCD corrections to proton and neutron's Dirac form factors at large momentum transfer, to the leading-twist accuracy in collinear factorization. The effect of NLO perturbative corrections turns out to be positive and substantial. Confronting our state-of-the-art pQCD predictions with the available data for nucleon Dirac form factors, in both space-like and time-like regions, imposes a stringent test on the validity of the parameterized forms of nucleon leading-twist light-cone-distribution amplitudes (LCDAs), which have been predicted by a class of QCD sum rules-based models and lattice QCD simulation.

Submitted 28 June, 2024; originally announced June 2024.

Comments: 8 pages, 1 table, 5 figures

arXiv:2406.18840 [pdf]

Shorter SPECT Scans Using Self-supervised Coordinate Learning to Synthesize Skipped Projection Views

Authors: Zongyu Li, Yixuan Jia, Xiaojian Xu, Jason Hu, Jeffrey A. Fessler, Yuni K. Dewaraja

Abstract: Purpose: This study addresses the challenge of extended SPECT imaging duration under low-count conditions, as encountered in Lu-177 SPECT imaging, by developing a self-supervised learning approach to synthesize skipped SPECT projection views, thus shortening scan times in clinical settings. Methods: We employed a self-supervised coordinate-based learning technique, adapting the neural radiance field (NeRF) concept in computer vision to synthesize under-sampled SPECT projection views. For each single scan, we used self-supervised coordinate learning to estimate skipped SPECT projection views. The method was tested with various down-sampling factors (DFs=2, 4, 8) on both Lu-177 phantom SPECT/CT measurements and clinical SPECT/CT datasets, from 11 patients undergoing Lu-177 DOTATATE and 6 patients undergoing Lu-177 PSMA-617 radiopharmaceutical therapy. Results: For SPECT reconstructions, our method outperformed the use of linearly interpolated projections and partial projection views in relative contrast-to-noise-ratios (RCNR) averaged across different downsampling factors: 1) DOTATATE: 83% vs. 65% vs. 67% for lesions and 86% vs. 70% vs. 67% for kidney, 2) PSMA: 76% vs. 69% vs. 68% for lesions and 75% vs. 55% vs. 66% for organs, including kidneys, lacrimal glands, parotid glands, and submandibular glands. Conclusion: The proposed method enables reduction in acquisition time (by factors of 2, 4, or 8) while maintaining quantitative accuracy in clinical SPECT protocols by allowing for the collection of fewer projections. Importantly, the self-supervised nature of this NeRF-based approach eliminates the need for extensive training data, instead learning from each patient's projection data alone. The reduction in acquisition time is particularly relevant for imaging under low-count conditions and for protocols that require multiple-bed positions such as whole-body imaging.

Submitted 26 June, 2024; originally announced June 2024.

Comments: 25 pages, 5568 words

arXiv:2406.17797 [pdf, other]

MoleculeCLA: Rethinking Molecular Benchmark via Computational Ligand-Target Binding Analysis

Authors: Shikun Feng, Jiaxin Zheng, Yinjun Jia, Yanwen Huang, Fengfeng Zhou, Wei-Ying Ma, Yanyan Lan

Abstract: Molecular representation learning is pivotal for various molecular property prediction tasks related to drug discovery. Robust and accurate benchmarks are essential for refining and validating current methods. Existing molecular property benchmarks derived from wet experiments, however, face limitations such as data volume constraints, unbalanced label distribution, and noisy labels. To address these issues, we construct a large-scale and precise molecular representation dataset of approximately 140,000 small molecules, meticulously designed to capture an extensive array of chemical, physical, and biological properties, derived through a robust computational ligand-target binding analysis pipeline. We conduct extensive experiments on various deep learning models, demonstrating that our dataset offers significant physicochemical interpretability to guide model development and design. Notably, the dataset's properties are linked to binding affinity metrics, providing additional insights into model performance in drug-target interaction tasks. We believe this dataset will serve as a more accurate and reliable benchmark for molecular representation learning, thereby expediting progress in the field of artificial intelligence-driven drug discovery.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.17745 [pdf, ps, other]

Light-weight End-to-End Graph Interest Network for CTR Prediction in E-commerce Search

Authors: Pipi Peng, Yunqing Jia, Ziqiang Zhou, murmurhash, Zichong Xiao

Abstract: Click-through-rate (CTR) prediction has an essential impact on improving user experience and revenue in e-commerce search. With the development of deep learning, graph-based methods are well exploited to utilize graph structure extracted from user behaviors and other information to help embedding learning. However, most of the previous graph-based methods mainly focus on recommendation scenarios, and therefore their graph structures highly depend on item's sequential information from user behaviors, ignoring query's sequential signal and query-item correlation. In this paper, we propose a new approach named Light-weight End-to-End Graph Interest Network (EGIN) to effectively mine users' search interests and tackle previous challenges. (i) EGIN utilizes query and item's correlation and sequential information from the search system to build a heterogeneous graph for better CTR prediction in e-commerce search. (ii) EGIN's graph embedding learning shares the same training input and is jointly trained with CTR prediction, making the end-to-end framework effortless to deploy in large-scale search systems. The proposed EGIN is composed of three parts: query-item heterogeneous graph, light-weight graph sampling, and multi-interest network. The query-item heterogeneous graph captures correlation and sequential information of query and item efficiently by the proposed light-weight graph sampling. The multi-interest network is well designed to utilize graph embedding to capture various similarity relationships between query and item to enhance the final CTR prediction. We conduct extensive experiments on both public and industrial datasets to demonstrate the effectiveness of the proposed EGIN. At the same time, the training cost of graph learning is relatively low compared with the main CTR prediction task, ensuring efficiency in practical applications.

Submitted 4 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: 8 pages, 4 figures

ACM Class: H.3.3

arXiv:2406.14955 [pdf, other]

ICLEval: Evaluating In-Context Learning Ability of Large Language Models

Authors: Wentong Chen, Yankai Lin, ZhenHao Zhou, HongYun Huang, Yantao Jia, Zhao Cao, Ji-Rong Wen

Abstract: In-Context Learning (ICL) is a critical capability of Large Language Models (LLMs) as it empowers them to comprehend and reason across interconnected inputs. Evaluating the ICL ability of LLMs can enhance their utilization and deepen our understanding of how this ability is acquired at the training stage. However, existing evaluation frameworks primarily focus on language abilities and knowledge, often overlooking the assessment of ICL ability. In this work, we introduce the ICLEval benchmark to evaluate the ICL abilities of LLMs, which encompasses two key sub-abilities: exact copying and rule learning. Through the ICLEval benchmark, we demonstrate that ICL ability is universally present in different LLMs, and model size is not the sole determinant of ICL efficacy. Surprisingly, we observe that ICL abilities, particularly copying, develop early in the pretraining process and stabilize afterward. Our source codes and benchmark are released at https://github.com/yiye3/ICLEval.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.12195 [pdf, other]

Quantum Compiling with Reinforcement Learning on a Superconducting Processor

Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcement learning (RL)-based quantum compiler for a superconducting processor and demonstrate its capability of discovering novel and hardware-amenable circuits with short lengths. We show that for the three-qubit quantum Fourier transformation, a compiled circuit using only seven CZ gates with unity circuit fidelity can be achieved. The compiler is also able to find optimal circuits under device topological constraints, with lengths considerably shorter than those by the conventional method. Our study exemplifies the codesign of the software with hardware for efficient quantum compilation, offering valuable insights for the advancement of RL-based compilers.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11586 [pdf, other]

Multistability of Small Zero-One Reaction Networks

Authors: Yue Jiao, Xiaoxian Tang, Xiaowei Zeng

Abstract: Zero-one reaction networks play key roles in cell signaling such as signalling pathways regulated by protein phosphorylation. Multistability of zero-one networks is a key dynamics feature enabling decision-making in cells. Since multistability (or, nondegenerate multistationarity) can be lifted from a smaller subnetwork (low-dimensional networks with less species and fewer reactions) to large networks, we aim to explore the multistability problem of small zero-one networks. We prove that any zero-one network with a one-dimensional stoichiometric subspace admits at most one (stable) positive steady state (this steady state is also called a structural attractor), and we completely classify all the one-dimensional zero-one networks according to if they indeed admits a (stable) positive steady state or not. Also, we prove that any two-dimensional zero-one network with up to three species either admits only degenerate positive steady states, or admits at most one (stable) positive steady state. In these proofs, we apply the theorem based on the Brouwer degree theory and the theory of real algebraic geometry. Moreover, using the tools of computational algebraic geometry, we provide a systematical way for detecting the smallest zero-one networks that admit nondegenerate multistationarity/multistability. We show that the smallest zero-one networks that admit nondegenerate multistationarity contain three species and five reactions, and the smallest zero-one networks that admit multistability contain three species and six reactions.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 45 pages, 6 figures

arXiv:2406.09381 [pdf, ps, other]

Novel azimuthal observables from two-photon collision at $e^+e^-$ colliders

Authors: Yu Jia, Jian Zhou, Ya-jin Zhou

Abstract: In this work we advocate a set of novel azimuthal-angle-related observables associated with exclusive hadron production from two-photon fusion at $e^+ e^-$ colliders, taking the $γγ\to ππ$ as a benchmark process. As a direct consequence of the linearly polarized quasi-real photons emitted off the electron and positron beams, the $\cos 2φ$ azimuthal asymmetry in dipion production is predicted within the transverse-momentum-dependent (TMD) factorization framework. In numerical analysis, we take the helicity amplitudes of $γγ\to ππ$ determined from the partial wave solutions in dispersion relation as input, and find that the predicted $\cos2φ$ azimuthal modulation may reach 40\% for the typical kinematical setup of {\tt Belle} experiment. Future accurate measurement of this azimuthal asymmetry may facilitate the direct extraction of the relative phase between two helicity amplitudes with photon helicity configurations $++$ and $+-$. This knowledge provides a valuable input for the dispersive determination of the hadronic light-by-light (Hlbl) contributions, which constitutes one of the largest theoretical uncertainties in predictions for the muon anomalous magnetic moment.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 8 pages, 5 figures

arXiv:2406.08961 [pdf, other]

SIU: A Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction

Authors: Yanwen Huang, Bowen Gao, Yinjun Jia, Hongbo Ma, Wei-Ying Ma, Ya-Qin Zhang, Yanyan Lan

Abstract: Small molecules play a pivotal role in modern medicine, and scrutinizing their interactions with protein targets is essential for the discovery and development of novel, life-saving therapeutics. The term "bioactivity" encompasses various biological effects resulting from these interactions, including both binding and functional responses. The magnitude of bioactivity dictates the therapeutic or toxic pharmacological outcomes of small molecules, rendering accurate bioactivity prediction crucial for the development of safe and effective drugs. However, existing structural datasets of small molecule-protein interactions are often limited in scale and lack systematically organized bioactivity labels, thereby impeding our understanding of these interactions and precise bioactivity prediction. In this study, we introduce a comprehensive dataset of small molecule-protein interactions, consisting of over a million binding structures, each annotated with real biological activity labels. This dataset is designed to facilitate unbiased bioactivity prediction. We evaluated several classical models on this dataset, and the results demonstrate that the task of unbiased bioactivity prediction is challenging yet essential.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08806 [pdf, ps, other]

Adaptive Cooperative Streaming of Holographic Video Over Wireless Networks: A Proximal Policy Optimization Solution

Authors: Wanli Wen, Jiping Yan, Yulu Zhang, Zhen Huang, Liang Liang, Yunjian Jia

Abstract: Adapting holographic video streaming to fluctuating wireless channels is essential to maintain consistent and satisfactory Quality of Experience (QoE) for users, which, however, is a challenging task due to the dynamic and uncertain characteristics of wireless networks. To address this issue, we propose a holographic video cooperative streaming framework designed for a generic wireless network in which multiple access points can cooperatively transmit video with different bitrates to multiple users. Additionally, we model a novel QoE metric tailored specifically for holographic video streaming, which can effectively encapsulate the nuances of holographic video quality, quality fluctuations, and rebuffering occurrences simultaneously. Furthermore, we formulate a formidable QoE maximization problem, which is a non-convex mixed integer nonlinear programming problem. Using proximal policy optimization (PPO), a new class of reinforcement learning algorithms, we devise a joint beamforming and bitrate control scheme, which can be wisely adapted to fluctuations in the wireless channel. The numerical results demonstrate the superiority of the proposed scheme over representative baselines.

Submitted 13 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for publication in IEEE Wireless Communications Letters

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we search for signals generated by ultra-heavy dark matter in data from the Large High Altitude Air Shower Observatory (LHAASO). We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter, as they have low fluxes of astrophysical $γ$-ray background and large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly, we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.07111 [pdf, other]

NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

Authors: Yufei Han, Heng Guo, Koki Fukai, Hiroaki Santo, Boxin Shi, Fumio Okura, Zhanyu Ma, Yunpeng Jia

Abstract: We present NeRSP, a Neural 3D reconstruction technique for Reflective surfaces with Sparse Polarized images. Reflective surface reconstruction is extremely challenging as specular reflections are view-dependent and thus violate the multiview consistency for multiview stereo. On the other hand, sparse image inputs, as a practical capture setting, commonly cause incomplete or distorted results due to the lack of correspondence matching. This paper jointly handles the challenges from sparse inputs and reflective surfaces by leveraging polarized images. We derive photometric and geometric cues from the polarimetric image formation model and multiview azimuth consistency, which jointly optimize the surface geometry modeled via implicit neural representation. Based on the experiments on our synthetic and real datasets, we achieve the state-of-the-art surface reconstruction results with only 6 views as input.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 10 pages

arXiv:2406.05746 [pdf]

doi 10.1007/s10462-024-10763-w

Methodology and Real-World Applications of Dynamic Uncertain Causality Graph for Clinical Diagnosis with Explainability and Invariance

Authors: Zhan Zhang, Qin Zhang, Yang Jiao, Lin Lu, Lin Ma, Aihua Liu, Xiao Liu, Juan Zhao, Yajun Xue, Bing Wei, Mingxia Zhang, Ru Gao, Hong Zhao, Jie Lu, Fan Li, Yang Zhang, Yiming Wang, Lei Zhang, Fengwei Tian, Jie Hu, Xin Gou

Abstract: AI-aided clinical diagnosis is desired in medical care. Existing deep learning models lack explainability and mainly focus on image analysis. The recently developed Dynamic Uncertain Causality Graph (DUCG) approach is causality-driven, explainable, and invariant across different application scenarios, without problems of data collection, labeling, fitting, privacy, bias, generalization, high cost and high energy consumption. Through close collaboration between clinical experts and DUCG technicians, 46 DUCG models covering 54 chief complaints were constructed. Over 1,000 diseases can be diagnosed without triage. Before being applied in the real world, the 46 DUCG models were retrospectively verified by third-party hospitals. The verified diagnostic precisions were no less than 95%, in which the diagnostic precision for every disease including uncommon ones was no less than 80%. After verification, the 46 DUCG models were applied in the real world in China. Over one million real diagnosis cases have been performed, with only 17 incorrect diagnoses identified. Due to DUCG's transparency, the mistakes causing the incorrect diagnoses were found and corrected. The diagnostic abilities of the clinicians who applied DUCG frequently were improved significantly. Following the introduction to the earlier presented DUCG methodology, the recommendation algorithm for potential medical checks is presented and the key idea of DUCG is extracted.

Submitted 9 June, 2024; originally announced June 2024.

Journal ref: Artificial Intelligence Review, (2024) 57:151

arXiv:2406.04124 [pdf, ps, other]

Light quark mass dependence of nucleon mass to two-loop order

Authors: Long-Bin Chen, Siwei Hu, Yu Jia, Zhewen Mo

Abstract: We investigate the nucleon self energy through the sixth chiral order in the covariant $SU(2)$ chiral perturbation theory ($χ$PT) in the single baryon sector. The validity of the extended on-mass-shell (EOMS) renormalization scheme is explicitly verified to two-loop order, manifested by the miraculous cancellation of all nonlocal divergences and power-counting-breaking (PCB) terms that are nonanalytic in pion mass. Using the $σ_{πN}$ term determined from the latest lattice simulation to constrain some unknown higher-order low energy constants (LECs), we predict the nucleon mass in the chiral limit to be $856.6\pm 1.7$ MeV. It is found that the EOMS scheme exhibits quite satisfactory convergence behavior through ${\cal O}(q^6)$ around physical point. We also predict the pion mass dependence of the nucleon mass to the accuracy of ${\cal O}(q^6)$, which is in satisfactory agreement with the recent lattice results over a wide range of pion mass.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 13 pages, 4 figures

arXiv:2406.03871 [pdf]

Development of high-level applications for High Energy Photon Source booster

Authors: Yuemei Peng, Daheng Ji, Hongfei Ji, Nan Li, Xiaohan Lu, Saike Tian, Yuanyuan Wei, Haisheng Xu, Yaliang Zhao, Yi Jiao, Jingyi Li

Abstract: The High Energy Photon Source (HEPS) is the first fourth-generation storage ring light source being built in the suburb of Beijing, China. The storage ring was designed with an emittance lower than 60 pm.rad, a circumference of 1.36 km, and a beam energy of 6 GeV. Its injector contains a 500 MeV S-band Linac and a 454 m booster, which was designed as an accumulator at the extraction energy. In the energy ramping control design of the HEPS booster, the ramping process was programmed to be able to stop and stay at any energy between the injection energy and the extraction energy. This feature enables us to conduct energy-dependent machine studies and ramping curve optimization. The beam commissioning of the HEPS Linac finished in June 2023, and the beam commissioning of the booster started at the end of July 2023. On November 17, the main target values proposed in the preliminary design report were reached. The high-level applications (HLAs) are essential tools for beam commissioning. The development of the HLAs, which are based on the framework named Python accelerator physics application set (Pyapas), started at the end of 2021. The HEPS physics team spent more than one year developing and testing the HLAs to meet the requirements of beam commissioning of the booster. Thanks to the modular design, the principle based on physical quantities, and the ability to run simulation models online from Pyapas, the development efficiency and reliability of the HLAs have been greatly improved. In particular, the principle based on physical quantities allows us to control the beam more intuitively.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03086 [pdf, other]

Task-Oriented Wireless Communications for Collaborative Perception in Intelligent Unmanned Systems

Authors: Sheng Zhou, Yukuan Jia, Ruiqing Mao, Zhaojun Nan, Yuxuan Sun, Zhisheng Niu

Abstract: Collaborative Perception (CP) has shown great potential to achieve more holistic and reliable environmental perception in intelligent unmanned systems (IUSs). However, implementing CP still faces key challenges due to the characteristics of the CP task and the dynamics of wireless channels. In this article, a task-oriented wireless communication framework is proposed to jointly optimize the communication scheme and the CP procedure. We first propose channel-adaptive compression and robust fusion approaches to extract and exploit the most valuable semantic information under wireless communication constraints. We then propose a task-oriented distributed scheduling algorithm to identify the best collaborators for CP under dynamic environments. The main idea is learning while scheduling, where the collaboration utility is effectively learned with low computation and communication overhead. Case studies are carried out in connected autonomous driving scenarios to verify the proposed framework. Finally, we identify several future research directions.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted by IEEE Network Magazine

arXiv:2406.02133 [pdf, other]

SimulTron: On-Device Simultaneous Speech to Speech Translation

Authors: Alex Agranovich, Eliya Nachmani, Oleg Rybakov, Yifan Ding, Ye Jia, Nadav Bar, Heiga Zen, Michelle Tadmor Ramanovich

Abstract: Simultaneous speech-to-speech translation (S2ST) holds the promise of breaking down communication barriers and enabling fluid conversations across languages. However, achieving accurate, real-time translation through mobile devices remains a major challenge. We introduce SimulTron, a novel S2ST architecture designed to tackle this task. SimulTron is a lightweight direct S2ST model that uses the strengths of the Translatotron framework while incorporating key modifications for streaming operation, and an adjustable fixed delay. Our experiments show that SimulTron surpasses Translatotron 2 in offline evaluations. Furthermore, real-time evaluations reveal that SimulTron improves upon the performance achieved by Translatotron 1. Additionally, SimulTron achieves superior BLEU scores and latency compared to the previous real-time S2ST method on the MuST-C dataset. Significantly, we have successfully deployed SimulTron on a Pixel 7 Pro device, showing its potential for simultaneous S2ST on-device.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02055 [pdf]

Stochastic Carbon Footprint Tracing Methods in Power Systems

Authors: Jiashuo Hu, Xiao-Ping Zhang, Youwei Jia

Abstract: As the penetration of distributed energy resources (DER) and renewable energy sources (RES) increases, carbon footprint tracking requires more granular analysis results. Existing carbon footprint tracking methods focus on deterministic steady-state analysis where the high uncertainties of RES cannot be considered. Considering the deficiency of the existing deterministic method, this paper proposes two stochastic carbon footprint tracking methods to cope with the impact of RES uncertainty on load-side carbon footprint tracing. The first method introduces probabilistic analysis in the framework of carbon emissions flow (CEF) to provide a global reference for the spatial characteristic of the power system component carbon intensity distribution. Considering that the CEF network expands with the increasing penetration of DERs, the second method can effectively improve the computational efficiency over the first method while ensuring the computational accuracy on large power systems. These proposed models are tested and compared on a synthetic 1004-bus test system in the case study to demonstrate the performance of the two proposed methods.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00745 [pdf, other]

doi 10.1364/OE.524680

Chiral photon blockade in the spinning Kerr resonator

Authors: Yunlan Zuo, Ya-Feng Jiao, Xun-Wei Xu, Adam Miranowicz, Le-Man Kuang, Hui Jing

Abstract: We propose how to achieve chiral photon blockade by spinning a nonlinear optical resonator. We show that by driving such a device at a fixed direction, completely different quantum effects can emerge for the counter-propagating optical modes, due to the spinning-induced breaking of time-reversal symmetry, which otherwise is unattainable for the same device in the static regime. Also, we find that in comparison with the static case, robust non-classical correlations against random backscattering losses can be achieved for such a quantum chiral system. Our work, extending previous works on the spontaneous breaking of optical chiral symmetry from the classical to purely quantum regimes, can stimulate more efforts towards making and utilizing various chiral quantum effects, including applications for chiral quantum networks or noise-tolerant quantum sensors.

Submitted 2 June, 2024; originally announced June 2024.

Journal ref: Opt. Express 32, 22020-22030 (2024)

arXiv:2405.20411 [pdf, other]

doi 10.1038/s41550-024-02258-z

Asteroid Kamo`oalewa's journey from the lunar Giordano Bruno crater to Earth 1:1 resonance

Authors: Yifei Jiao, Bin Cheng, Yukun Huang, Erik Asphaug, Brett Gladman, Renu Malhotra, Patrick Michel, Yang Yu, Hexi Baoyin

Abstract: Among the nearly 30,000 known near-Earth asteroids (NEAs), only tens of them possess Earth co-orbital characteristics with semi-major axes $\sim$1 au. In particular, 469219 Kamo`oalewa (2016 HO3), upcoming target of China's Tianwen-2 asteroid sampling mission, exhibits a meta-stable 1:1 mean-motion resonance with Earth. Intriguingly, recent ground-based observations show that Kamo`oalewa has spectroscopic characteristics similar to space-weathered lunar silicates, hinting at a lunar origin instead of an asteroidal one like the vast majority of NEAs. Here we use numerical simulations to demonstrate that Kamo`oalewa's physical and orbital properties are compatible with a fragment from a crater larger than 10--20 km formed on the Moon in the last few million years. The impact could have ejected sufficiently large fragments into heliocentric orbits, some of which could be transferred to Earth 1:1 resonance and persist today. This leads us to suggest the young lunar crater Giordano Bruno (22 km diameter, 1--10 Ma age) as the most likely source, linking a specific asteroid in space to its source crater on the Moon. The hypothesis will be tested by the Tianwen-2 mission when it returns a sample of Kamo`oalewa. And the upcoming NEO Surveyor mission will possibly help us to identify such a lunar-derived NEA population.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 29 pages, 4 figures. Published in Nature Astronomy, 19 April 2024

arXiv:2405.17802 [pdf, other]

Multi-level Interaction Modeling for Protein Mutational Effect Prediction

Authors: Yuanle Mo, Xin Hong, Bowen Gao, Yinjun Jia, Yanyan Lan

Abstract: Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different sidechain conformations, which lead to changes in the backbone conformation, eventually affecting the binding affinity between proteins. However, existing methods typically focus only on sidechain-level interaction modeling, resulting in suboptimal predictions. In this work, we propose a self-supervised multi-level pre-training framework, ProMIM, to fully capture all three levels of interactions with well-designed pretraining objectives. Experiments show ProMIM outperforms all the baselines on the standard benchmark, especially on mutations where significant changes in backbone conformations may occur. In addition, leading results from zero-shot evaluations for SARS-CoV-2 mutational effect prediction and antibody optimization underscore the potential of ProMIM as a powerful next-generation tool for developing novel therapeutic approaches and new drugs.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16474 [pdf, other]

Inaccurate Label Distribution Learning with Dependency Noise

Authors: Zhiqiang Kou, Jing Wang, Yuheng Jia, Xin Geng

Abstract: In this paper, we introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning, which arise from dependencies on instances and labels. We start by modeling the inaccurate label distribution matrix as a combination of the true label distribution and a noise matrix influenced by specific instances and labels. To address this, we develop a linear mapping from instances to their true label distributions, incorporating label correlations, and decompose the noise matrix using feature and label representations, applying group sparsity constraints to accurately capture the noise. Furthermore, we employ graph regularization to align the topological structures of the input and output spaces, ensuring accurate reconstruction of the true label distribution matrix. Utilizing the Alternating Direction Method of Multipliers (ADMM) for efficient optimization, we validate our method's capability to recover true labels accurately and establish a generalization error bound. Extensive experiments demonstrate that DN-ILDL effectively addresses the ILDL problem and outperforms existing LDL methods.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.13686 [pdf, other]

Embedding Generalized Semantic Knowledge into Few-Shot Remote Sensing Segmentation

Authors: Yuyu Jia, Wei Huang, Junyu Gao, Qi Wang, Qiang Li

Abstract: Few-shot segmentation (FSS) for remote sensing (RS) imagery leverages supporting information from limited annotated samples to achieve query segmentation of novel classes. Previous efforts are dedicated to mining segmentation-guiding visual cues from a constrained set of support samples. However, they still struggle to address the pronounced intra-class differences in RS images, as sparse visual cues make it challenging to establish robust class-specific representations. In this paper, we propose a holistic semantic embedding (HSE) approach that effectively harnesses general semantic knowledge, i.e., class description (CD) embeddings. Instead of the naive combination of CD embeddings and visual features for segmentation decoding, we investigate embedding the general semantic knowledge during the feature extraction stage. Specifically, in HSE, a spatial dense interaction module allows the interaction of visual support features with CD embeddings along the spatial dimension via self-attention. Furthermore, a global content modulation module efficiently augments the global information of the target category in both support and query features, thanks to the transformative fusion of visual features and CD embeddings. These two components holistically synergize general CD embeddings and visual cues, constructing a robust class-specific representation. Through extensive experiments on the standard FSS benchmark, the proposed HSE approach demonstrates superior performance compared to peer work, setting a new state-of-the-art.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.12684 [pdf, other]

Model Free Prediction with Uncertainty Assessment

Authors: Yuling Jiao, Lican Kang, Jin Liu, Heng Peng, Heng Zuo

Abstract: Deep nonparametric regression, characterized by the utilization of deep neural networks to learn target functions, has emerged as a focus of research attention in recent years. Despite considerable progress in understanding convergence rates, the absence of asymptotic properties hinders rigorous statistical inference. To address this gap, we propose a novel framework that transforms the deep estimation paradigm into a platform conducive to conditional mean estimation, leveraging the conditional diffusion model. Theoretically, we develop an end-to-end convergence rate for the conditional diffusion model and establish the asymptotic normality of the generated samples. Consequently, we are equipped to construct confidence regions, facilitating robust statistical inference. Furthermore, through numerical experiments, we empirically validate the efficacy of our proposed methodology.

Submitted 31 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12543 [pdf, other]

Like Humans to Few-Shot Learning through Knowledge Permeation of Vision and Text

Authors: Yuyu Jia, Qing Zhou, Wei Huang, Junyu Gao, Qi Wang

Abstract: Few-shot learning aims to generalize the recognizer from seen categories to an entirely novel scenario. With only a few support samples, several advanced methods initially introduce class names as prior knowledge for identifying novel classes. However, obstacles still impede achieving a comprehensive understanding of how to harness the mutual advantages of visual and textual knowledge. In this paper, we propose a coherent Bidirectional Knowledge Permeation strategy called BiKop, which is grounded in a human intuition: A class name description offers a general representation, whereas an image captures the specificity of individuals. BiKop primarily establishes a hierarchical joint general-specific representation through bidirectional knowledge permeation. On the other hand, considering the bias of joint representation towards the base set, we disentangle base-class-relevant semantics during training, thereby alleviating the suppression of potential novel-class-relevant information. Experiments on four challenging benchmarks demonstrate the remarkable superiority of BiKop. Our code will be publicly available.

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11457 [pdf, other]

Deep Dive into Model-free Reinforcement Learning for Biological and Robotic Systems: Theory and Practice

Authors: Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, Eva Kanso

Abstract: Animals and robots exist in a physical world and must coordinate their bodies to achieve behavioral objectives. With recent developments in deep reinforcement learning, it is now possible for scientists and engineers to obtain sensorimotor strategies (policies) for specific tasks using physically simulated bodies and environments. However, the utility of these methods goes beyond the constraints of a specific task; they offer an exciting framework for understanding the organization of an animal sensorimotor system in connection to its morphology and physical interaction with the environment, as well as for deriving general design rules for sensing and actuation in robotic systems. Algorithms and code implementing both learning agents and environments are increasingly available, but the basic assumptions and choices that go into the formulation of an embodied feedback control problem using deep reinforcement learning may not be immediately apparent. Here, we present a concise exposition of the mathematical and algorithmic aspects of model-free reinforcement learning, specifically through the use of \textit{actor-critic} methods, as a tool for investigating the feedback control underlying animal and robotic behavior.

Submitted 19 May, 2024; originally announced May 2024.

Comments: 20 pages, 3 figures

arXiv:2405.11451 [pdf, ps, other]

Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method

Authors: Yuling Jiao, Yanming Lai, Yang Wang

Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations (PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method (DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent (PGD) to train the three-layer network and we establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for approximation error, generalization error, and optimization error. We present an error bound in terms of the sample size $n$ and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results.

Submitted 19 May, 2024; originally announced May 2024.

MSC Class: 65N12; 65N15; 68T07; 62G05; 35J25

arXiv:2405.11329 [pdf]

Risk-neutral valuation of options under arithmetic Brownian motions

Authors: Qiang Liu, Yuhan Jiao, Shuxin Guo

Abstract: On April 22, 2020, the CME Group switched to Bachelier pricing for a group of oil futures options. The Bachelier model, or more generally the arithmetic Brownian motion (ABM), is not so widely used in finance, though. This paper provides the first comprehensive survey of options pricing under ABM. Using the risk-neutral valuation, we derive formulas for European options for three underlying types, namely an underlying that does not pay dividends, an underlying that pays a continuous dividend yield, and futures. Further, we derive Black-Scholes-Merton-like partial differential equations, which can in principle be utilized to price American options numerically via finite difference.

Submitted 18 May, 2024; originally announced May 2024.

Comments: 19 pages, 4 figures

arXiv:2405.10573 [pdf, other]

A method for reversing the laser modulation in a Storage ring

Authors: Weihang Liu, Yu Zhao, Yi Jiao, Sheng Wang, Chao Feng

Abstract: The pursuit of coherent radiation generation remains a central focus in advancing storage ring light sources. Despite the promise of laser modulation in achieving this goal, it brings about a noticeable decline in beam quality. Efforts to mitigate this decline have resulted in the proposal of demodulation schemes. However, implementing modulation and demodulation within the storage ring presents significant challenges due to dynamical and spatial constraints within straight sections. In this study, we propose a straightforward and easily implementable method for achieving reversible laser modulation in a storage ring. Notably, our approach circumvents the need for special storage ring requirements, such as lengthy straight sections or a bypass section. Simulation results demonstrate a substantial restoration of beam quality following demodulation. This innovative scheme holds great promise for the realization of high repetition rate coherent storage ring light sources.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 7 pages, 7 figures

arXiv:2405.09665 [pdf]

Sign-Alternating Thermoelectric Quantum Oscillations and Insulating Landau Levels in Monolayer WTe2

Authors: Yue Tang, Tiancheng Song, Haosen Guan, Yanyu Jia, Guo Yu, Zhaoyi Joy Zheng, Ayelet J. Uzan, Michael Onyszczak, Ratnadwip Singha, Xin Gui, Kenji Watanabe, Takashi Taniguchi, Robert J. Cava, Leslie M. Schoop, N. P. Ong, Sanfeng Wu

Abstract: The detection of Landau-level-like energy structures near the chemical potential of an insulator is essential to the search for a class of correlated electronic matter hosting charge-neutral fermions and Fermi surfaces, a long-proposed concept that remains elusive experimentally. Here we introduce and demonstrate that the magneto-thermoelectric response of a quantum insulator can reveal critical information not available via other approaches. We report large quantum oscillations (QOs) in the Seebeck response of the hole-doped insulating state of monolayer tungsten ditelluride (WTe2) in magnetic fields. The QOs remarkably undergo sign-changes as the field is swept, mimicking those in metals with Landau quantization. The sign-change in the thermoelectric response directly implies the presence of a field-induced Landau-level-like structure at the chemical potential of the insulator. Our results reinforce WTe2 as a platform for investigating insulating Landau levels and mobile neutral fermions in two-dimensional insulators.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures

arXiv:2405.08851 [pdf]

Enhanced Terahertz Spectroscopy of a Monolayer Transition Metal Dichalcogenide

Authors: Xin Jin, Vincenzo Aglieri, Young-Gyun Jeong, Atiye Pezeshki, Lilian Skokan, Mostafa Shagar, Yuechen Jia, Pablo Bianucci, Andreas Ruediger, Emanuele Orgiu, Andrea Toma, Luca Razzari

Abstract: Two-dimensional materials, including transition metal dichalcogenides, are attractive for a variety of applications in electronics as well as photonics and have recently been envisioned as an appealing platform for phonon polaritonics. However, their direct characterization in the terahertz spectral region, of interest for retrieving, e.g., their phonon response, represents a major challenge, due to the limited sensitivity of typical terahertz spectroscopic tools and the weak interaction of such long-wavelength radiation with sub-nanometer systems. In this work, by exploiting an ad-hoc engineered metallic surface enabling a ten-thousand-fold local absorption boost, we perform enhanced terahertz spectroscopy of a monolayer transition metal dichalcogenide (tungsten diselenide) and extract its dipole-active phonon resonance features. In addition, we use these data to obtain the monolayer effective permittivity around its phonon resonance. Via the direct terahertz characterization of the phonon response of such two-dimensional systems, this method opens the path to the rational design of phonon polariton devices exploiting monolayer transition metal dichalcogenides.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.06093 [pdf, other]

Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection

Authors: Bhawesh Kumar, Jonathan Amar, Eric Yang, Nan Li, Yugang Jia

Abstract: Large Language Models (LLMs) have demonstrated their efficacy across a broad spectrum of tasks in healthcare applications. However, LLMs often need to be fine-tuned on task-specific expert-annotated data to achieve optimal performance, which can be expensive and time-consuming. In this study, we fine-tune PaLM-2 with parameter efficient fine-tuning (PEFT) using noisy labels obtained from gemini-pro 1.0 for the detection of Schedule-of-Event (SoE) tables, which specify the care plan in clinical trial protocols. We introduce a filtering mechanism to select high-confidence labels for this table classification task, thereby reducing the noise in the auto-generated labels. We show that PaLM-2 fine-tuned with those labels achieves performance that exceeds that of gemini-pro 1.0 and other LLMs. Furthermore, its performance is close to that of a PaLM-2 fine-tuned on labels obtained from non-expert annotators. Our results show that leveraging LLM-generated labels from powerful models like gemini-pro can potentially serve as a viable strategy for improving LLM performance through fine-tuning in specialized tasks, particularly in domains where expert annotations are scarce, expensive, or time-consuming to obtain.

Submitted 5 August, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 23 pages. Published in MLHC 2024

arXiv:2405.06014 [pdf, other]

Superconformal Monodromy Defects in $\mathcal{N}$=4 SYM and LS theory

Authors: Igal Arav, Jerome P. Gauntlett, Yusheng Jiao, Matthew M. Roberts, Christopher Rosen

Abstract: We study type IIB supergravity solutions that are dual to two-dimensional superconformal defects in $d=4$ SCFTs which preserve $\mathcal{N}=(0,2)$ supersymmetry. We consider solutions dual to defects in $\mathcal{N}=4$ SYM theory that have non-trivial monodromy for $U(1)^3\subset SO(6)$ global symmetry and we also allow for the possibility of conical singularities. In addition, we consider the addition of fermionic and bosonic mass terms that have non-trivial dependence on the spatial directions transverse to the defect, while preserving the superconformal symmetry of the defect. We compute various physical quantities including the central charges of the defect expressed as a function of the monodromy, the on-shell action as well as associated supersymmetric Renyi entropies. Analogous computations are carried out for superconformal defects in the $\mathcal{N}=1$, $d=4$ Leigh-Strassler SCFT. We also show that the defects of the two SCFTs are connected by a line of bulk marginal mass deformations and argue that they are also related by bulk RG flow.

Submitted 23 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 93 pages, 8 figures. References and figure added. Various discussions refined, including on the existence of defect solutions

Report number: APCTP Pre2024-005,CCTP-2024-9, ITCP-2024/9

arXiv:2405.05512 [pdf, other]

Characteristic Learning for Provable One Step Generation

Authors: Zhao Ding, Chenguang Duan, Yuling Jiao, Ruoxuan Li, Jerry Zhijian Yang, Pingwen Zhang

Abstract: We propose the characteristic generator, a novel one-step generative model that combines the efficiency of sampling in Generative Adversarial Networks (GANs) with the stable performance of flow-based models. Our model is driven by characteristics, along which the probability density transport can be described by ordinary differential equations (ODEs). Specifically, we estimate the velocity field through nonparametric regression and utilize the Euler method to solve the probability flow ODE, generating a series of discrete approximations to the characteristics. We then use a deep neural network to fit these characteristics, ensuring a one-step mapping that effectively pushes the prior distribution towards the target distribution. In the theoretical aspect, we analyze the errors in velocity matching, Euler discretization, and characteristic fitting to establish a non-asymptotic convergence rate for the characteristic generator in 2-Wasserstein distance. To the best of our knowledge, this is the first thorough analysis for simulation-free one-step generative models. Additionally, our analysis refines the error analysis of flow-based generative models in prior works. We apply our method on both synthetic and real datasets, and the results demonstrate that the characteristic generator achieves high generation quality with just a single evaluation of the neural network.

Submitted 16 July, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.03464 [pdf, ps, other]

Next-to-leading-order electroweak correction to $H\to Z^0γ$

Authors: Wen-Long Sang, Feng Feng, Yu Jia

Abstract: Inspired by the recent observation of the Higgs boson radiative decay into $Z^0$ by {\tt ATLAS} and {\tt CMS} Collaborations, we investigate the next-to-leading-order (NLO) electroweak correction to this rare decay process in the Standard Model (SM). Implementing the on-shell renormalization scheme, we find that the magnitude of the NLO electroweak correction may reach $7\%$ of the leading order (LO) prediction, much more significant than that of the NLO QCD correction, which is merely about $0.3\%$. After incorporating the ${\cal O}(α)$ correction, the predicted partial widths from various $α$ schemes tend to converge to each other. Including both NLO electroweak and QCD corrections, the SM prediction for the branching fraction shifts from the LO value of $(1.40-1.71)\times 10^{-3}$ to $(1.55\pm 0.06)\times 10^{-3}$, considerably lower than the measured value ${\cal B}_{\rm exp}[H\to Z^0γ]=(3.4\pm 1.1)\times 10^{-3}$. Resolving this alarming discrepancy clearly calls for further theoretical investigations, and, more importantly, experimental efforts from {\tt HL-LHC} and the prospective Higgs factories such as {\tt CEPC} and {\tt FCC-ee}.

Submitted 25 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

Comments: 7 pages, 2 figures, 1 table; References added, text improved

arXiv:2405.02688 [pdf, other]

Semi-supervised Symmetric Matrix Factorization with Low-Rank Tensor Representation

Authors: Yuheng Jia, Jia-Nan Li, Wenhui Wu, Ran Wang

Abstract: Semi-supervised symmetric non-negative matrix factorization (SNMF) utilizes the available supervisory information (usually in the form of pairwise constraints) to improve the clustering ability of SNMF. The previous methods introduce the pairwise constraints from the local perspective, i.e., they either directly refine the similarity matrix element-wise or restrain the distance of the decomposed vectors in pairs according to the pairwise constraints. They thus overlook the global perspective, i.e., in the ideal case, the pairwise constraint matrix and the ideal similarity matrix possess the same low-rank structure. To this end, we first propose a novel semi-supervised SNMF model by seeking a low-rank representation for the tensor synthesized by the pairwise constraint matrix and a similarity matrix obtained by the product of the embedding matrix and its transpose, which could strengthen those two matrices simultaneously from a global perspective. We then propose an enhanced SNMF model, making the embedding matrix tailored to the above tensor low-rank representation. We finally refine the similarity matrix by the strengthened pairwise constraints. We repeat the above steps to continuously boost the similarity matrix and pairwise constraint matrix, leading to a high-quality embedding matrix. Extensive experiments substantiate the superiority of our method. The code is available at https://github.com/JinaLeejnl/TSNMF.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2405.00515 [pdf, other]

GAD-Generative Learning for HD Map-Free Autonomous Driving

Authors: Weijian Sun, Yanbo Jia, Qi Zeng, Zihao Liu, Jiang Liao, Yue Li, Xianfeng Li

Abstract: Deep-learning-based techniques have been widely adopted for autonomous driving software stacks for mass production in recent years, focusing primarily on perception modules, with some work extending this method to prediction modules. However, the downstream planning and control modules are still designed with hefty handcrafted rules, dominated by optimization-based methods such as quadratic programming or model predictive control. This results in a performance bottleneck for autonomous driving systems in that corner cases simply cannot be solved by enumerating hand-crafted rules. We present a deep-learning-based approach that brings prediction, decision, and planning modules together in an attempt to overcome the rule-based methods' deficiency in real-world applications of autonomous driving, especially for urban scenes. The DNN model we propose is trained solely with 10 hours of human driver data, and it supports all mass-production ADAS features available on the market to date. This method is deployed onto a Jiyue test car with no modification to its factory-ready sensor set and compute platform. The feasibility, usability, and commercial potential are demonstrated in this article.

Submitted 31 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.19581 [pdf, ps, other]

Soft pattern of gravitational Rutherford scattering from heavy target mass expansion

Authors: Yu Jia, Jichen Pan, Jia-Yue Zhang

Abstract: We investigate the soft behavior of the tree-level Rutherford scattering processes mediated via $t$-channel one-graviton exchange. We consider two types of Rutherford scattering processes, {\it e.g.}, a low-energy massless structureless projectile (up to spin-$1$) hits a static massive composite particle carrying various spins (up to spin-$2$), and a slowly-moving light projectile hits a heavy static composite target. The unpolarized cross sections in the first type are found to exhibit universal forms at the first two orders in $1/M$ expansion, yet differ at the next-to-next-to-leading order, though some terms at this order still remain universal or depend on the target spin in a definite manner. The unpolarized cross sections in the second type are universal at the lowest order in projectile velocity expansion and through all orders in $1/M$, independent of the spins of both projectile and target. The universality partially breaks down at relative order-$v^2/M^2$, albeit some terms at this order still depend on the target spin in a specific manner.

Submitted 30 April, 2024; originally announced April 2024.

Comments: 12 pages, 1 figure

arXiv:2404.19527 [pdf, other]

Revealing the Two Sides of Data Augmentation: An Asymmetric Distillation-based Win-Win Solution for Open-Set Recognition

Authors: Yunbing Jia, Xiaoyu Kong, Fan Tang, Yixing Gao, Weiming Dong, Yi Yang

Abstract: In this paper, we reveal the two sides of data augmentation: enhancements in closed-set recognition correlate with a significant decrease in open-set recognition. Through empirical investigation, we find that multi-sample-based augmentations would contribute to reducing feature discrimination, thereby diminishing the open-set criteria. Although knowledge distillation could impair the feature via imitation, the mixed feature with ambiguous semantics hinders the distillation. To this end, we propose an asymmetric distillation framework that feeds the teacher model extra raw data to enlarge the benefit of the teacher. Moreover, a joint mutual information loss and a selective relabel strategy are utilized to alleviate the influence of hard mixed samples. Our method successfully mitigates the decline in open-set recognition and outperforms SOTAs by 2%~3% AUROC on the Tiny-ImageNet dataset, and experiments on the large-scale dataset ImageNet-21K demonstrate the generalization of our method.

Submitted 28 April, 2024; originally announced April 2024.

arXiv:2404.18526 [pdf, ps, other]

Optomechanically Induced Transparency on Exceptional Surfaces

Authors: Y. Pan, H. -L. Zhang, Y. -F. Jiao, D. -Y. Wang, S. -L. Su, H. Jing

Abstract: Exceptional points (EPs) are singularities in non-Hermitian systems, where the system transmission spectrum varies significantly at the phase transition point. Here, we propose a practical scheme to study the changes of the optomechanically induced transparency (OMIT) spectrum on the exceptional surface (ES), which is formed by designing the structure of the waveguide in a non-Hermitian cavity optomechanical system. By comparing the transmission spectra of the system at different normal points, EPs on the same or different ESs, and exceptional derived points, we find that the peak-valley conversion of the system transmission spectra is obtained at the phase transition point and the arbitrary manipulation of the system transmission spectrum can be realized by moving the system on the same or different ESs. Furthermore, the phenomena of conversion and enhancement of the fast-slow light in the system transmission spectra have also been discovered in our research. Unlike the isolated-EP case, our proposal allows the system properties to be examined at different EPs, yields a richer transmission spectrum, and provides more convenient options for experimental implementation, which paves the way for studying the nature of non-Hermitian systems in a higher dimension.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.13309 [pdf, ps, other]

Latent Schrödinger Bridge Diffusion Model for Generative Learning

Authors: Yuling Jiao, Lican Kang, Huazhen Lin, Jin Liu, Heng Zuo

Abstract: This paper aims to conduct a comprehensive theoretical analysis of current diffusion models. We introduce a novel generative learning methodology utilizing the Schrödinger bridge diffusion model in latent space as the framework for theoretical exploration in this domain. Our approach commences with the pre-training of an encoder-decoder architecture using data originating from a distribution that may diverge from the target distribution, thus facilitating the accommodation of a large sample size through the utilization of pre-existing large-scale models. Subsequently, we develop a diffusion model within the latent space utilizing the Schrödinger bridge framework. Our theoretical analysis encompasses the establishment of end-to-end error analysis for learning distributions via the latent Schrödinger bridge diffusion model. Specifically, we control the second-order Wasserstein distance between the generated distribution and the target distribution. Furthermore, our obtained convergence rates effectively mitigate the curse of dimensionality, offering robust theoretical support for prevailing diffusion models.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.12966 [pdf, other]

Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models

Authors: Yian Li, Wentao Tian, Yang Jiao, Jingjing Chen

Abstract: Counterfactual reasoning, as a crucial manifestation of human intelligence, refers to making presuppositions based on established facts and extrapolating potential outcomes. Existing multimodal large language models (MLLMs) have exhibited impressive cognitive and reasoning capabilities, which have been examined across a wide range of Visual Question Answering (VQA) benchmarks. Nevertheless, how will existing MLLMs perform when faced with counterfactual questions? To answer this question, we first curate a novel \textbf{C}ounter\textbf{F}actual \textbf{M}ulti\textbf{M}odal reasoning benchmark, abbreviated as \textbf{CFMM}, to systematically assess the counterfactual reasoning capabilities of MLLMs. Our CFMM comprises six challenging tasks, each including hundreds of carefully human-labeled and GPT-generated counterfactual questions, to evaluate MLLMs' counterfactual reasoning capabilities across diverse aspects. Through experiments, interestingly, we find that existing MLLMs prefer to believe what they see, but ignore the counterfactual presuppositions presented in the question, thereby leading to inaccurate responses. Furthermore, we evaluate a wide range of prevalent MLLMs on our proposed CFMM. The significant gap between their performance on our CFMM and that on several VQA benchmarks indicates that there is still considerable room for improvement in existing MLLMs toward approaching human-level intelligence. On the other hand, through boosting MLLMs' performance on our CFMM in the future, potential avenues toward developing MLLMs with advanced intelligence can be explored.

Submitted 30 August, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

Showing 51–100 of 1,148 results for author: Jia, Y