Search | arXiv e-print repository

PRG: Prompt-Based Distillation Without Annotation via Proxy Relational Graph

Authors: Yijin Xu, Jialun Liu, Hualiang Wei, Wenhui Li

Abstract: In this paper, we propose a new distillation method for extracting knowledge from Large Foundation Models (LFM) into lightweight models, introducing a novel supervision mode that does not require manually annotated data. While LFMs exhibit exceptional zero-shot classification abilities across datasets, relying solely on LFM-generated embeddings for distillation poses two main challenges: LFM's task-irrelevant knowledge and the high density of features. The transfer of task-irrelevant knowledge could compromise the student model's discriminative capabilities, and the high density of features within target domains obstructs the extraction of discriminative knowledge essential for the task. To address these challenges, we introduce the Proxy Relational Graph (PRG) method. We first extract task-relevant knowledge from LFMs by calculating a weighted average of logits obtained through text prompt embeddings. Then we construct sample-class proxy graphs for LFM and student models, respectively, to model the correlation between samples and class proxies. Finally, we achieve the distillation of selective knowledge by aligning the relational graphs produced by both the LFM and the student model. Specifically, the distillation from LFM to the student model is achieved through two types of alignment: 1) aligning the sample nodes produced by the student model with those produced by the LFM, and 2) aligning the edge relationships in the student model's graph with those in the LFM's graph. Our experimental results validate the effectiveness of PRG, demonstrating its ability to leverage the extensive knowledge base of LFMs while skillfully circumventing their inherent limitations in focused learning scenarios. Notably, in our annotation-free framework, PRG achieves an accuracy of 76.23% (T: 77.9%) on CIFAR-100 and 72.44% (T: 75.3%) on ImageNet-1K.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12171 [pdf, other]

Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey

Authors: Haixin Wang, Yadi Cao, Zijie Huang, Yuxuan Liu, Peiyan Hu, Xiao Luo, Zezheng Song, Wanjia Zhao, Jilin Liu, Jinan Sun, Shikun Zhang, Long Wei, Yue Wang, Tailin Wu, Zhi-Ming Ma, Yizhou Sun

Abstract: This paper explores the recent advancements in enhancing Computational Fluid Dynamics (CFD) tasks through Machine Learning (ML) techniques. We begin by introducing fundamental concepts, traditional methods, and benchmark datasets, then examine the various roles ML plays in improving CFD. We systematically review the literature of the past five years and introduce a novel classification for forward modeling: Data-driven Surrogates, Physics-Informed Surrogates, and ML-assisted Numerical Solutions. Furthermore, we review the latest ML methods in inverse design and control, offering a novel classification and providing an in-depth discussion. Then we highlight real-world applications of ML for CFD in critical scientific and engineering disciplines, including aerodynamics, combustion, atmosphere & ocean science, biological fluids, plasma, symbolic regression, and reduced order modeling. In addition, we identify key challenges and advocate for future research directions to address these challenges, such as multi-scale representation, physical knowledge encoding, scientific foundation models, and automatic scientific discovery. This review serves as a guide for the rapidly expanding ML for CFD community, aiming to inspire insights for future advancements. We draw the conclusion that ML is poised to significantly transform CFD research by enhancing simulation accuracy, reducing computational time, and enabling more complex analyses of fluid dynamics. The paper resources can be viewed at https://github.com/WillDreamer/Awesome-AI4CFD.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 22 pages, 6 figures

arXiv:2408.12152 [pdf, other]

Behavior Pattern Mining-based Multi-Behavior Recommendation

Authors: Haojie Li, Zhiyong Cheng, Xu Yu, Jinhuan Liu, Guanfeng Liu, Junwei Du

Abstract: Multi-behavior recommendation systems enhance effectiveness by leveraging auxiliary behaviors (such as page views and favorites) to address the limitations of traditional models that depend solely on sparse target behaviors like purchases. Existing approaches to multi-behavior recommendations typically follow one of two strategies: some derive initial node representations from individual behavior subgraphs before integrating them for a comprehensive profile, while others interpret multi-behavior data as a heterogeneous graph, applying graph neural networks to achieve a unified node representation. However, these methods do not adequately explore the intricate patterns of behavior among users and items. To bridge this gap, we introduce a novel algorithm called Behavior Pattern mining-based Multi-behavior Recommendation (BPMR). Our method extensively investigates the diverse interaction patterns between users and items, utilizing these patterns as features for making recommendations. We employ a Bayesian approach to streamline the recommendation process, effectively circumventing the challenges posed by graph neural network algorithms, such as the inability to accurately capture user preferences due to over-smoothing. Our experimental evaluation on three real-world datasets demonstrates that BPMR significantly outperforms existing state-of-the-art algorithms, showing an average improvement of 268.29% in Recall@10 and 248.02% in NDCG@10 metrics. The code of our BPMR is openly accessible for use and further research at https://github.com/rookitkitlee/BPMR.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.12104 [pdf, other]

Minute-Cadence Observations of the LAMOST Fields with the TMTS: IV -- Catalog of Cataclysmic Variables from the First 3-yr Survey

Authors: Qichun Liu, Jie Lin, Xiaofeng Wang, Zhibin Dai, Yongkang Sun, Gaobo Xi, Jun Mo, Jialian Liu, Shengyu Yan, Alexei V. Filippenko, Thomas G. Brink, Yi Yang, Kishore C. Patra, Yongzhi Cai, Zhihao Chen, Liyang Chen, Fangzhou Guo, Xiaojun Jiang, Gaici Li, Wenxiong Li, Weili Lin, Cheng Miao, Xiaoran Ma, Haowei Peng, Qiqi Xia , et al. (2 additional authors not shown)

Abstract: The Tsinghua University--Ma Huateng Telescopes for Survey (TMTS) started to monitor the LAMOST plates in 2020, leading to the discovery of numerous short-period eclipsing binaries, peculiar pulsators, flare stars, and other variable objects. Here, we present the uninterrupted light curves for a sample of 64 cataclysmic variables (CVs) observed/discovered using the TMTS during its first three years of observations, and we introduce new CVs and new light-variation periods (from known CVs) revealed through the TMTS observations. Thanks to the high-cadence observations of TMTS, diverse light variations, including superhumps, quasi-periodic oscillations, large-amplitude orbital modulations, and rotational modulations, can be detected in our CV samples, providing key observational clues for understanding the fast-developing physical processes in various CVs. All of these short-timescale light-curve features help further classify the subtypes of CV systems. We highlight the light-curve features observed in our CV sample and discuss further implications of minute-cadence light curves for CV identifications and classifications. Moreover, we examine the H$α$ emission lines in the spectra from our nonmagnetic CV samples (i.e., dwarf novae and nova-like subclasses) and find that the distribution of H$α$ emission strength shows significant differences between the sources with orbital periods above and below the period gap, which agrees with the trend seen from the SDSS nonmagnetic CV sample.

Submitted 21 August, 2024; originally announced August 2024.

Comments: 27 pages, 12 figures in main text, accepted for publication in Universe

arXiv:2408.12063 [pdf, other]

A Deconfounding Approach to Climate Model Bias Correction

Authors: Wentao Gao, Jiuyong Li, Debo Cheng, Lin Liu, Jixue Liu, Thuc Duy Le, Xiaojing Du, Xiongren Chen, Yanchang Zhao, Yun Chen

Abstract: Global Climate Models (GCMs) are crucial for predicting future climate changes by simulating the Earth systems. However, GCM outputs exhibit systematic biases due to model uncertainties, parameterization simplifications, and inadequate representation of complex climate phenomena. Traditional bias correction methods, which rely on historical observation data and statistical techniques, often neglect unobserved confounders, leading to biased results. This paper proposes a novel bias correction approach that utilizes both GCM and observational data to learn a factor model that captures multi-cause latent confounders. Inspired by recent advances in causality-based time series deconfounding, our method first constructs a factor model to learn latent confounders from historical data and then applies them to enhance the bias correction process using advanced time series forecasting models. The experimental results demonstrate significant improvements in the accuracy of precipitation outputs. By addressing unobserved confounders, our approach offers a robust and theoretically grounded solution for climate model bias correction.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11662 [pdf, other]

Optimizing Federated Graph Learning with Inherent Structural Knowledge and Dual-Densely Connected GNNs

Authors: Longwen Wang, Jianchun Liu, Zhi Liu, Jinyang Huang

Abstract: Federated Graph Learning (FGL) is an emerging technology that enables clients to collaboratively train powerful Graph Neural Networks (GNNs) in a distributed manner without exposing their private data. Nevertheless, FGL still faces the challenge of the severe non-Independent and Identically Distributed (non-IID) nature of graphs, which possess diverse node and edge structures, especially across varied domains. Thus, exploring the knowledge inherent in these structures becomes crucial. Existing methods, however, either overlook the inherent structural knowledge in graph data or capture it at the cost of significantly increased resource demands (e.g., FLOPs and communication bandwidth), which can be detrimental to distributed paradigms. Motivated by this, we propose FedDense, a novel FGL framework that optimizes the utilization efficiency of inherent structural knowledge. To better acquire knowledge of diverse and underexploited structures, FedDense first explicitly encodes the structural knowledge inherent within graph data itself alongside node features. In addition, FedDense introduces a Dual-Densely Connected (DDC) GNN architecture that exploits the multi-scale (i.e., one-hop to multi-hop) feature and structure insights embedded in the aggregated feature maps at each layer. Beyond the exploitation of inherent structures, we consider resource limitations in FGL, devising exceedingly narrow layers atop the DDC architecture and adopting a selective parameter sharing strategy to reduce resource costs substantially. We conduct extensive experiments using 15 datasets across 4 different domains, demonstrating that FedDense consistently surpasses baselines by a large margin in training performance, while demanding minimal resources.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11611 [pdf, other]

DTN: Deep Multiple Task-specific Feature Interactions Network for Multi-Task Recommendation

Authors: Yaowen Bi, Yuteng Lian, Jie Cui, Jun Liu, Peijian Wang, Guanghui Li, Xuejun Chen, Jinglin Zhao, Hao Wen, Jing Zhang, Zhaoqi Zhang, Wenzhuo Song, Yang Sun, Weiwei Zhang, Mingchen Cai, Guanxing Zhang

Abstract: Neural-based multi-task learning (MTL) has been successfully applied to many recommendation applications. However, these MTL models (e.g., MMoE, PLE) do not consider feature interaction during optimization, which is crucial for capturing complex high-order features and has been widely used in ranking models for real-world recommender systems. Moreover, through feature importance analysis across various tasks in MTL, we have observed an interesting divergence phenomenon: the same feature can have significantly different importance across different tasks. To address these issues, we propose Deep Multiple Task-specific Feature Interactions Network (DTN) with a novel model structure design. DTN introduces multiple diversified task-specific feature interaction methods and a task-sensitive network in MTL networks, enabling the model to learn task-specific diversified feature interaction representations, which improves the efficiency of joint representation learning in a general setup. We applied DTN to our company's real-world E-commerce recommendation dataset, which consisted of over 6.3 billion samples; the results demonstrated that DTN significantly outperformed state-of-the-art MTL models. Moreover, during online evaluation of DTN in a large-scale E-commerce recommender system, we observed a 3.28% increase in clicks, a 3.10% increase in orders and a 2.70% increase in GMV (Gross Merchandise Value) compared to the state-of-the-art MTL models. Finally, extensive offline experiments conducted on public benchmark datasets demonstrate that DTN can be applied to various scenarios beyond recommendations, enhancing the performance of ranking models.

Submitted 23 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11417 [pdf]

Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters

Authors: Harisankar Sadasivan, Muhammad Osama, Maksim Podkorytov, Carlus Huang, Jun Liu

Abstract: General matrix multiplication (GEMM) operations are crucial in various computational fields. As GPU architectures evolve, optimizing GEMM performance becomes increasingly important. This paper introduces Stream-K++, an enhancement to the promising Stream-K GEMM scheduling algorithm. We expand Stream-K's scheduling policies from three to seven and implement an efficient solution selection mechanism using Bloom filters. Our approach rapidly eliminates up to 95.8% of unsuitable configurations while maintaining a 100% true-negative rate. Implemented using the AMD Composable Kernel library and evaluated on AMD Instinct MI250X GPUs, Stream-K++ demonstrates significant performance gains (up to 43%) in select scenarios. It remains competitive (within 20% of optimal) for 60-97.6% of problem sizes. Our flexible framework, implemented in the Opensieve C++ library, allows for easy adaptation to new problem sizes, scheduling policies, or additional tuning parameters, paving the way for future optimizations in GPU-based GEMM operations.
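Editorial illustration (not part of the abstract): the selection mechanism above relies on a general property of Bloom filters, namely that a "not present" answer is always correct while a "present" answer may occasionally be a false positive. The following is a minimal Python sketch of a generic Bloom filter under stated assumptions; it is not the Opensieve API, and the class name, the parameters, and the (M, N, K) problem sizes are hypothetical.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter sketch; names and sizes are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical use: record (M, N, K) sizes for which some policy is suitable.
suitable = BloomFilter()
for size in [(1024, 1024, 1024), (4096, 4096, 512)]:
    suitable.add(size)
```

A caller would skip any configuration for which `might_contain` returns False and benchmark only the remaining candidates, accepting that a few of those may be false positives.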

Submitted 21 August, 2024; originally announced August 2024.

ACM Class: D.2; I.2

arXiv:2408.11365 [pdf]

Current Status and Trends in Image Anti-Forensics Research: A Bibliometric Analysis

Authors: Yihong Lu, Jianyi Liu, Ru Zhang

Abstract: Image anti-forensics is a critical topic in the field of image privacy and security research. With the increasing ease of manipulating or generating human faces in images, the potential misuse of such forged images is a growing concern. This study aims to comprehensively review the knowledge structure and research hotspots related to image anti-forensics by analyzing publications in the Web of Science Core Collection (WoSCC) database. The bibliometric analysis conducted using VOSViewer software has revealed the research trends, major research institutions, most influential publications, top publishing venues, and most active contributors in this field. This is the first comprehensive bibliometric study summarizing research trends and developments in image anti-forensics. The information highlights recent and primary research directions, serving as a reference for future research in image anti-forensics.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.11299 [pdf]

doi 10.1103/PhysRevApplied.21.024021

Substrate-induced spin-torque-like signal in spin-torque ferromagnetic resonance measurement

Authors: Dingsong Jiang, Hetian Chen, Guiping Ji, Yahong Chai, Chenye Zhang, Yuhan Liang, Jingchun Liu, Witold Skowroński, Pu Yu, Di Yi, Tianxiang Nan

Abstract: Oxide thin films and interfaces with strong spin-orbit coupling have recently shown exceptionally high charge-to-spin conversion, making them potential spin-source materials for spintronics. Epitaxial strain engineering using oxide substrates with different lattice constants and symmetries has emerged as a means to further enhance charge-to-spin conversion. However, high relative permittivity and dielectric loss of commonly used oxide substrates, such as SrTiO3, can cause significant current shunting in substrates at high frequency, which may strongly affect spin-torque measurement and potentially result in an inaccurate estimation of charge-to-spin conversion efficiency. In this study, we systematically evaluate the influence of various oxide substrates on the widely used spin-torque ferromagnetic resonance (ST-FMR) measurement. Surprisingly, we observed substantial spin-torque signals in samples comprising only ferromagnetic metal on oxide substrates with high relative permittivity (e.g., SrTiO3 and KTaO3), where negligible signal would initially be expected. Notably, this unexpected signal shows a strong correlation with the capacitive reactance of oxide substrates and the leakage radio frequency (RF) current within the substrate. By revising the conventional ST-FMR analysis model, we attribute this phenomenon to a 90-degree phase difference between the RF current flowing in the metal layer and in the substrate. We suggest that extra attention should be paid during ST-FMR measurements, as this artifact could dominate over the real spin-orbit torque signal from high-resistivity spin-source materials grown on substrates with high relative permittivity.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 36 pages, 22 figures

arXiv:2408.11295 [pdf, ps, other]

Channel Modeling Framework for Both Communications and Bistatic Sensing Under 3GPP Standard

Authors: Chenhao Luo, Aimin Tang, Fei Gao, Jianguo Liu, Xudong Wang

Abstract: Integrated sensing and communications (ISAC) is considered a promising technology in the B5G/6G networks. The channel model is essential for an ISAC system to evaluate the communication and sensing performance. Most existing channel modeling studies focus on the monostatic ISAC channel. In this paper, the channel modeling framework for bistatic ISAC is considered. The proposed channel modeling framework extends the current 3GPP channel modeling framework and ensures compatibility with the communication channel model. To support the bistatic sensing function, several key features for sensing are added. First, more clusters with weaker power are generated and retained to characterize the potential sensing targets. Second, the target model can be either deterministic or statistical, based on different sensing scenarios. Furthermore, for the statistical case, different reflection models are employed in the generation of rays, taking into account spatial coherence. The effectiveness of the proposed bistatic ISAC channel model framework is validated by both ray tracing simulations and experimental studies. The compatibility with the 3GPP communication channel model and how to use this framework for sensing evaluation are also demonstrated.

Submitted 20 August, 2024; originally announced August 2024.

Comments: Accepted by IEEE JOURNALS OF SELECTED AREAS IN SENSORS. Part of this work was presented at VTC-Spring 2024

arXiv:2408.11281 [pdf, other]

BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation

Authors: Haotian Peng, Jiawei Liu, Jinsong Du, Jie Gao, Wei Wang

Abstract: We propose a bearing health management framework leveraging large language models (BearLLM), a novel multimodal model that unifies multiple bearing-related tasks by processing user prompts and vibration signals. Specifically, we introduce a prior knowledge-enhanced unified vibration signal representation to handle various working conditions across multiple datasets. This involves adaptively sampling the vibration signals based on the sampling rate of the sensor, incorporating the frequency domain to unify input dimensions, and using a fault-free reference signal as an auxiliary input. To extract features from vibration signals, we first train a fault classification network, then convert and align the extracted features into word embedding, and finally concatenate these with text embedding as input to an LLM. To evaluate the performance of the proposed method, we constructed the first large-scale multimodal bearing health management (MBHM) dataset, including paired vibration signals and textual descriptions. With our unified vibration signal representation, BearLLM using one set of pre-trained weights achieves state-of-the-art performance on nine publicly available fault diagnosis benchmarks, outperforming specific methods designed for individual datasets. We provide a dataset, our model, and code to inspire future research on building more capable industrial multimodal models (https://github.com/hatton613/BearLLM).

Submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.11243 [pdf, other]

Do Neural Scaling Laws Exist on Graph Self-Supervised Learning?

Authors: Qian Ma, Haitao Mao, Jingzhe Liu, Zhehua Zhang, Chunlin Feng, Yu Song, Yihan Shao, Yao Ma

Abstract: Self-supervised learning (SSL) is essential to obtain foundation models in NLP and CV domains via effectively leveraging knowledge in large-scale unlabeled data. The reason for its success is that a suitable SSL design can help the model to follow the neural scaling law, i.e., the performance consistently improves with increasing model and dataset sizes. However, it remains a mystery whether existing SSL in the graph domain can follow the scaling behavior toward building Graph Foundation Models (GFMs) with large-scale pre-training. In this study, we examine whether existing graph SSL techniques can follow the neural scaling behavior with the potential to serve as the essential component for GFMs. Our benchmark includes comprehensive SSL technique implementations with analysis conducted on both the conventional SSL setting and many new settings adopted in other domains. Surprisingly, despite the SSL loss continuously decreasing, no existing graph SSL techniques follow the neural scaling behavior on the downstream performance. The model performance merely fluctuates across different data scales and model scales. Instead of the scales, the key factors influencing the performance are the choices of model architecture and pretext task design. This paper examines the feasibility of existing graph SSL techniques for developing GFMs and opens a new direction for graph SSL design with the new evaluation prototype. Our code implementation is available online to ease reproducibility at https://github.com/GraphSSLScaling/GraphSSLScaling.

Submitted 26 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.10809 [pdf, ps, other]

Diameter two orientability of mixed graphs

Authors: Hengzhe Li, Zhiwei Ding, Jianbing Liu, Hong-Jian Lai

Abstract: In 1967, Katona and Szemerédi showed that no undirected graph with $n$ vertices and fewer than $\frac{n}{2}\log_2\frac{n}{2}$ edges admits an orientation of diameter two. In 1978, Chvátal and Thomassen revealed the complexity of determining whether an undirected graph can be oriented to achieve a diameter of two, proving it to be NP-complete. This breakthrough has sparked ongoing interest in identifying sufficient conditions for graphs to be oriented with the smallest possible diameter of two -- critical for optimizing communication and network flow in larger structures. In 2019, Czabarka, Dankelmann, and Székely significantly advanced this field by establishing that the minimum degree threshold for achieving such an orientation in undirected graphs of order $n$ is $\frac{n}{2} + Θ(\ln n)$. In this paper, we extend this foundational result by determining the minimum degree threshold necessary for realizing an orientation with diameter two in mixed graphs, which contain both undirected and directed edges. Mixed graphs offer a versatile framework, representing an intermediate stage in the orientation process, making our findings a substantial generalization of previous results.
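Editorial illustration (not part of the abstract): the Katona and Szemerédi bound quoted above can be evaluated numerically. The short Python sketch below only computes the quantity $\frac{n}{2}\log_2\frac{n}{2}$ as stated in the abstract; the function name is a hypothetical choice, and the code does not test whether any particular graph admits a diameter-two orientation.

```python
import math


def katona_szemeredi_edge_bound(n):
    """Return (n/2) * log2(n/2), the edge count quoted in the abstract.

    Per the abstract, a graph on n vertices with fewer edges than this
    admits no orientation of diameter two. Function name is illustrative.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    half = n / 2
    return half * math.log2(half)


# Evaluate the bound for a few vertex counts.
for n in (4, 8, 16):
    print(n, katona_szemeredi_edge_bound(n))
```

For example, with n = 8 the expression evaluates to 4 * log2(4) = 8.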

Submitted 20 August, 2024; originally announced August 2024.

MSC Class: 05C07; 05C12; 05C20

arXiv:2408.10659 [pdf, other]

Productions of bottom and bottom-strange mesons in pion and kaon induced reactions

Authors: Jing Liu, Quan-Yun Guo, Qi Wu, Jun He, Dian-Yong Chen

Abstract: In the present work, we propose to explore the production of the bottom and bottom-strange mesons in high-energy pion- and kaon-induced reactions on a proton target. The cross sections are evaluated with an effective Lagrangian constructed by the heavy-quark limit and chiral symmetry. Our estimations show that at $P_π=80$ GeV, the cross sections for $B(5279)$, $B^\ast (5325)$, $B_0^\ast (5738)$, $B_1^\prime (5757)$, $B_1(5721)$ and $B_2^\ast (5747)$ production processes are estimated to be $3.19 \sim 86.26$, $1.86\sim 51.29$, $0.87 \sim 24.25$, $0.84 \sim 23.14$, $162.35 \sim 4477.66$, and $57.16 \sim 1604.43$ nb, respectively, where uncertainties arise from the model parameter. In addition, the cross sections for the corresponding bottom-strange meson production processes are very similar. Moreover, our estimations indicate that the ratios of these cross sections are almost independent of the model parameters. In particular, the cross-section ratios related to the states in the same doublets are of order one, which is consistent with the expectation of the heavy-quark limit. The cross sections related to the states in the $T$ doublets are about two orders of magnitude larger than those related to the states in the $S$ doublets.

Submitted 20 August, 2024; originally announced August 2024.

arXiv:2408.10541 [pdf, other]

The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution

Authors: Bin Cao, Yisi Zhang, Hanyi Wang, Xingjian He, Jing Liu

Abstract: Referring Video Object Segmentation is an emerging multi-modal task that aims to segment objects in the video given a natural language expression. In this work, we build two instance-centric models and fuse predicted results from frame-level and instance-level. First, we introduce instance mask into the DETR-based model for query initialization to achieve temporal enhancement and employ SAM for spatial refinement. Secondly, we build an instance retrieval model conducting binary instance mask classification whether the instance is referred. Finally, we fuse predicted results and our method achieved a score of 52.67 J&F in the validation phase and 60.36 J&F in the test phase, securing the final ranking of 3rd place in the 6-th LSVOS Challenge RVOS Track.

Submitted 20 August, 2024; originally announced August 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2406.13939

arXiv:2408.10468 [pdf, other]

Tracing Privacy Leakage of Language Models to Training Data via Adjusted Influence Functions

Authors: Jinxin Liu, Zao Yang

Abstract: The responses generated by Large Language Models (LLMs) can include sensitive information from individuals and organizations, leading to potential privacy leakage. This work implements Influence Functions (IFs) to trace privacy leakage back to the training data, thereby mitigating privacy concerns of Language Models (LMs). However, we notice that current IFs struggle to accurately estimate the influence of tokens with large gradient norms, potentially overestimating their influence. When tracing the most influential samples, this leads to frequently tracing back to samples with large gradient norm tokens, overshadowing the actual most influential samples even if their influences are well estimated. To address this issue, we propose Heuristically Adjusted IF (HAIF), which reduces the weight of tokens with large gradient norms, thereby significantly improving the accuracy of tracing the most influential samples. To establish easily obtained groundtruth for tracing privacy leakage, we construct two datasets, PII-E and PII-CR, representing two distinct scenarios: one with identical text in the model outputs and pre-training data, and the other where models leverage their reasoning abilities to generate text divergent from pre-training data. HAIF significantly improves tracing accuracy, enhancing it by 20.96% to 73.71% on the PII-E dataset and 3.21% to 45.93% on the PII-CR dataset, compared to the best SOTA IFs against various GPT-2 and QWen-1.5 models. HAIF also outperforms SOTA IFs on real-world pretraining data CLUECorpus2020, demonstrating strong robustness regardless of prompt and response lengths.

Submitted 26 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.10046 [pdf, other]

Exploiting Fine-Grained Prototype Distribution for Boosting Unsupervised Class Incremental Learning

Authors: Jiaming Liu, Hongyuan Liu, Zhili Qin, Wei Han, Yulu Fan, Qinli Yang, Junming Shao

Abstract: The dynamic nature of open-world scenarios has attracted more attention to class incremental learning (CIL). However, existing CIL methods typically presume the availability of complete ground-truth labels throughout the training process, an assumption rarely met in practical applications. Consequently, this paper explores a more challenging problem of unsupervised class incremental learning (UCIL). The essence of addressing this problem lies in effectively capturing comprehensive feature representations and discovering unknown novel classes. To achieve this, we first model the knowledge of class distribution by exploiting fine-grained prototypes. Subsequently, a granularity alignment technique is introduced to enhance the unsupervised class discovery. Additionally, we propose a strategy to minimize overlap between novel and existing classes, thereby preserving historical knowledge and mitigating the phenomenon of catastrophic forgetting. Extensive experiments on the five datasets demonstrate that our approach significantly outperforms current state-of-the-art methods, indicating the effectiveness of the proposed method.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.10039 [pdf, other]

MSDiagnosis: An EMR-based Dataset for Clinical Multi-Step Diagnosis

Authors: Ruihui Hou, Shencheng Chen, Yongqi Fan, Lifeng Zhu, Jing Sun, Jingping Liu, Tong Ruan

Abstract: Clinical diagnosis is critical in medical practice, typically requiring a continuous and evolving process that includes primary diagnosis, differential diagnosis, and final diagnosis. However, most existing clinical diagnostic tasks are single-step processes, which does not align with the complex multi-step diagnostic procedures found in real-world clinical settings. In this paper, we propose a multi-step diagnostic task and annotate a clinical diagnostic dataset (MSDiagnosis). This dataset includes primary diagnosis, differential diagnosis, and final diagnosis questions. Additionally, we propose a novel and effective framework. This framework combines forward inference, backward inference, reflection, and refinement, enabling the LLM to self-evaluate and adjust its diagnostic results. To assess the effectiveness of our proposed method, we design and conduct extensive experiments. The experimental results demonstrate the effectiveness of the proposed method. We also provide a comprehensive experimental analysis and suggest future research directions for this task.

Submitted 29 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.10032 [pdf]

Tunable interfacial Rashba spin-orbit coupling in asymmetric Al$_x$In$_{1-x}$Sb/InSb/CdTe quantum well heterostructures

Authors: Hanzhi Ruan, Zhenghang Zhi, Yuyang Wu, Jiuming Liu, Puyang Huang, Shan Yao, Xinqi Liu, Chenjia Tang, Qi Yao, Lu Sun, Yifan Zhang, Yujie Xiao, Renchao Che, Xufeng Kou

Abstract: The manipulation of Rashba-type spin-orbit coupling (SOC) in molecular beam epitaxy-grown Al$_x$In$_{1-x}$Sb/InSb/CdTe quantum well heterostructures is reported. The effective band bending provides robust two-dimensional quantum confinement, while the unidirectional built-in electric field from the asymmetric hetero-interfaces results in pronounced Rashba SOC strength. By tuning the Al concentration in the top Al$_x$In$_{1-x}$Sb barrier layer, the optimal structure with $x = 0.15$ shows the largest Rashba coefficient of 0.23 eV-Angstrom and the highest low-temperature electron mobility of 4400 cm$^2$/Vs. Quantitative investigations of the weak anti-localization effect further confirm the dominant D'yakonov-Perel (DP) spin relaxation mechanism during charge-to-spin conversion. These findings highlight the significance of quantum well engineering in shaping magneto-resistance responses, and narrow bandgap semiconductor-based heterostructures may offer a reliable platform for energy-efficient spintronic applications.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.09937 [pdf, other]

The curse of random quantum data

Authors: Kaining Zhang, Junyu Liu, Liu Liu, Liang Jiang, Min-Hsiu Hsieh, Dacheng Tao

Abstract: Quantum machine learning, which involves running machine learning algorithms on quantum devices, may be one of the most significant flagship applications for these devices. Unlike its classical counterparts, the role of data in quantum machine learning has not been fully understood. In this work, we quantify the performances of quantum machine learning in the landscape of quantum data. Provided that the encoding of quantum data is sufficiently random, we find that the training efficiency and generalization capabilities in quantum machine learning will be exponentially suppressed with the increase in the number of qubits, which we call "the curse of random quantum data". Our findings apply to both the quantum kernel method and the large-width limit of quantum neural networks. Conversely, we highlight that through meticulous design of quantum datasets, it is possible to avoid these curses, thereby achieving efficient convergence and robust generalization. Our conclusions are corroborated by extensive numerical simulations.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 40 pages, 8 figures

arXiv:2408.09856 [pdf, other]

TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition

Authors: Tianwei Lin, Jiang Liu, Wenqiao Zhang, Zhaocheng Li, Yang Dai, Haoyuan Li, Zhelun Yu, Wanggui He, Juncheng Li, Hao Jiang, Siliang Tang, Yueting Zhuang

Abstract: While Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA have effectively addressed GPU memory constraints during fine-tuning, their performance often falls short, especially in multidimensional task scenarios. To address this issue, one straightforward solution is to introduce task-specific LoRA modules as domain experts, leveraging the modeling of multiple experts' capabilities and thus enhancing the general capability of multi-task learning. Although promising, these additional components often add complexity to the training and inference process, contravening the efficiency that PEFT is designed for. Considering this, we introduce an innovative PEFT method, TeamLoRA, consisting of a collaboration and competition module for experts, and thus achieving the right balance of effectiveness and efficiency: (i) For collaboration, a novel knowledge-sharing and -organizing mechanism is devised to appropriately reduce the scale of matrix operations, thereby boosting the training and inference speed. (ii) For competition, we propose leveraging a game-theoretic interaction mechanism for experts, encouraging experts to transfer their domain-specific knowledge while facing diverse downstream tasks, and thus enhancing the performance. By doing so, TeamLoRA elegantly connects the experts as a "Team" with internal collaboration and competition, enabling a faster and more accurate PEFT paradigm for multi-task learning. To validate the superiority of TeamLoRA, we curate a comprehensive multi-task evaluation (CME) benchmark to thoroughly assess the capability of multi-task learning. Experiments conducted on our CME and other benchmarks indicate the effectiveness and efficiency of TeamLoRA. Our project is available at https://github.com/Lin-Tianwei/TeamLoRA.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.09752 [pdf, other]

A Unified Framework for Iris Anti-Spoofing: Introducing IrisGeneral Dataset and Masked-MoE Method

Authors: Hang Zou, Chenxi Du, Ajian Liu, Yuan Zhang, Jing Liu, Mingchuan Yang, Jun Wan, Hui Zhang

Abstract: Iris recognition is widely used in high-security scenarios due to its stability and distinctiveness. However, the acquisition of iris images typically requires near-infrared illumination and near-infrared band filters, leading to significant and consistent differences in imaging across devices. This underscores the importance of developing cross-domain capabilities in iris anti-spoofing methods. Despite this need, there is no dataset available that comprehensively evaluates the generalization ability of the iris anti-spoofing task. To address this gap, we propose the IrisGeneral dataset, which includes 10 subsets, belonging to 7 databases, published by 4 institutions, collected with 6 types of devices. IrisGeneral is designed with three protocols, aimed at evaluating average performance, cross-racial generalization, and cross-device generalization of iris anti-spoofing models. To tackle the challenge of integrating multiple sub-datasets in IrisGeneral, we employ multiple parameter sets to learn from the various subsets. Specifically, we utilize the Mixture of Experts (MoE) to fit complex data distributions using multiple sub-neural networks. To further enhance the generalization capabilities, we introduce a novel method Masked-MoE (MMoE). It randomly masks a portion of tokens for some experts and requires their outputs to be similar to the unmasked experts, which improves the generalization ability and effectively mitigates the overfitting issue produced by MoE. We selected ResNet50, VIT-B/16, CLIP, and FLIP as representative models and benchmarked them on the IrisGeneral dataset. Experimental results demonstrate that our proposed MMoE with CLIP achieves the best performance on IrisGeneral.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.09739 [pdf, other]

TraDiffusion: Trajectory-Based Training-Free Image Generation

Authors: Mingrui Wu, Oucheng Huang, Jiayi Ji, Jiale Li, Xinyue Cai, Huafeng Kuang, Jianzhuang Liu, Xiaoshuai Sun, Rongrong Ji

Abstract: In this work, we propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion. This novel method allows users to effortlessly guide image generation via mouse trajectories. To achieve precise control, we design a distance awareness energy function to effectively guide latent variables, ensuring that the focus of generation is within the areas defined by the trajectory. The energy function encompasses a control function to draw the generation closer to the specified trajectory and a movement function to diminish activity in areas distant from the trajectory. Through extensive experiments and qualitative assessments on the COCO dataset, the results reveal that TraDiffusion facilitates simpler, more natural image control. Moreover, it showcases the ability to manipulate salient regions, attributes, and relationships within the generated images, alongside visual input based on arbitrary or enhanced trajectories.

Submitted 19 August, 2024; originally announced August 2024.

Comments: The code: https://github.com/och-mac/TraDiffusion

arXiv:2408.09688 [pdf, other]

Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts

Authors: Jiaqing Liu, Chong Deng, Qinglin Zhang, Qian Chen, Hai Yu, Wen Wang

Abstract: Automatic Speech Recognition (ASR) transcripts exhibit recognition errors and various spoken language phenomena such as disfluencies, ungrammatical sentences, and incomplete sentences, hence suffering from poor readability. To improve readability, we propose a Contextualized Spoken-to-Written conversion (CoS2W) task to address ASR and grammar errors and also transfer the informal text into the formal style with content preserved, utilizing contexts and auxiliary information. This task naturally matches the in-context learning capabilities of Large Language Models (LLMs). To facilitate comprehensive comparisons of various LLMs, we construct a document-level Spoken-to-Written conversion of ASR Transcripts Benchmark (SWAB) dataset. Using SWAB, we study the impact of different granularity levels on the CoS2W performance, and propose methods to exploit contexts and auxiliary information to enhance the outputs. Experimental results reveal that LLMs have the potential to excel in the CoS2W task, particularly in grammaticality and formality, and that our methods achieve effective understanding of contexts and auxiliary information by LLMs. We further investigate the effectiveness of using LLMs as evaluators and find that LLM evaluators show strong correlations with human evaluations on rankings of faithfulness and formality, which validates the reliability of LLM evaluators for the CoS2W task.

Submitted 18 August, 2024; originally announced August 2024.

Comments: 7 pages, 3 figures

arXiv:2408.09468 [pdf, other]

Towards Safe and Robust Autonomous Vehicle Platooning: A Self-Organizing Cooperative Control Framework

Authors: Chengkai Xu, Zihao Deng, Jiaqi Liu, Chao Huang, Peng Hang

Abstract: In the emerging hybrid traffic flow environment, which includes both human-driven vehicles (HDVs) and autonomous vehicles (AVs), ensuring safe and robust decision-making and control is crucial for the effective operation of autonomous vehicle platooning. Current systems for cooperative adaptive cruise control and lane changing are inadequate in responding to real-world emergency situations, limiting the potential of autonomous vehicle platooning technology. To address the aforementioned challenges, we propose a Twin-World Safety-Enhanced Data-Model-Knowledge Hybrid-Driven autonomous vehicle platooning Cooperative Control Framework. Within this framework, a deep reinforcement learning formation decision model integrating traffic priors is designed, and a twin-world deduction model based on safety priority judgment is proposed. Subsequently, an optimal control-based multi-scenario decision-control right adaptive switching mechanism is designed to achieve adaptive switching between data-driven and model-driven methods. Through simulation experiments and hardware-in-loop tests, our algorithm has demonstrated excellent performance in terms of safety, robustness, and flexibility. A detailed account of the validation results for the model can be found at https://perfectxu88.github.io/towardssafeandrobust.github.io/.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.09417 [pdf]

Discovery of terahertz-frequency orbitally-coupled magnons in a kagome ferromagnet

Authors: Mengqian Che, Weizhao Chen, Maoyuan Wang, F. Michael Bartram, Liangyang Liu, Xuebin Dong, Jinjin Liu, Yidian Li, Hao Lin, Zhiwei Wang, Enke Liu, Yugui Yao, Zhe Yuan, Guang-Ming Zhang, Luyi Yang

Abstract: In ferromagnetic materials, magnons - quanta of spin waves - typically resonate in the gigahertz range. Beyond conventional magnons, while theoretical studies have predicted magnons associated with orbital magnetic moments, their direct observation has remained challenging. Here, we present the discovery of two distinct terahertz orbitally-coupled magnon resonances in the topological kagome ferromagnet Co3Sn2S2. Using time-resolved Kerr rotation spectroscopy, we pinpoint two magnon resonances at 0.61 and 0.49 THz at 6 K, surpassing all previously reported magnon resonances in ferromagnets due to strong magnetocrystalline anisotropy. These dual modes originate from the strong coupling of localized spin and orbital magnetic moments. These findings unveil a novel category of magnons stemming from orbital magnetic moments, and position Co3Sn2S2 as a promising candidate for high-speed terahertz spintronic applications.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.09174 [pdf, other]

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Authors: Xianjie Wu, Jian Yang, Linzheng Chai, Ge Zhang, Jiaheng Liu, Xinrun Du, Di Liang, Daixin Shu, Xianfu Cheng, Tianzhen Sun, Guanglin Niu, Tongliang Li, Zhoujun Li

Abstract: Recent advancements in Large Language Models (LLMs) have markedly enhanced the interpretation and processing of tabular data, introducing previously unimaginable capabilities. Despite these achievements, LLMs still encounter significant challenges when applied in industrial scenarios, particularly due to the increased complexity of reasoning required with real-world tabular data, underscoring a notable disparity between academic benchmarks and practical applications. To address this discrepancy, we conduct a detailed investigation into the application of tabular data in industrial scenarios and propose a comprehensive and complex benchmark TableBench, including 18 fields within four major categories of table question answering (TableQA) capabilities. Furthermore, we introduce TableLLM, trained on our meticulously constructed training set TableInstruct, achieving comparable performance with GPT-3.5. Massive experiments conducted on TableBench indicate that both open-source and proprietary LLMs still have significant room for improvement to meet real-world demands, where the most advanced model, GPT-4, achieves only a modest score compared to humans.

Submitted 17 August, 2024; originally announced August 2024.

Comments: 12 pages

arXiv:2408.09155 [pdf, other]

Learning Robust Treatment Rules for Censored Data

Authors: Yifan Cui, Junyi Liu, Tao Shen, Zhengling Qi, Xi Chen

Abstract: There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcome is frequently observed, in which case the restricted mean survival time and survival probability are of great interest. In this paper, we propose two robust criteria for learning optimal treatment rules with censored survival outcomes; the former targets an optimal treatment rule maximizing the restricted mean survival time, where the restriction is specified by a given quantile such as the median; the latter targets an optimal treatment rule maximizing buffered survival probabilities, where the predetermined threshold is adjusted to account for the restricted mean survival time. We provide theoretical justifications for the proposed optimal treatment rules and develop a sampling-based difference-of-convex algorithm for learning them. In simulation studies, our estimators show improved performance compared to existing methods. We also demonstrate the proposed method using AIDS clinical trial data.

Submitted 17 August, 2024; originally announced August 2024.

arXiv:2408.08826 [pdf, other]

Search for the rare decay $J/ψ\to γD^0+c.c.$ at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (642 additional authors not shown)

Abstract: Using $(10087\pm44)\times10^6J/ψ$ events collected with the BESIII detector, we search for the rare decay $J/ψ\to γD^0+c.c.$ for the first time. No obvious signal is observed and the upper limit on the branching fraction is determined to be ${\cal B}(J/ψ\to γD^{0}+c.c.)< 9.1 \times 10^{-8}$ at 90\% confidence level.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08754 [pdf, other]

SE-SGformer: A Self-Explainable Signed Graph Transformer for Link Sign Prediction

Authors: Lu Li, Jiale Liu, Xingyu Ji, Maojun Wang, Zeyu Zhang

Abstract: Signed Graph Neural Networks (SGNNs) have been shown to be effective in analyzing complex patterns in real-world situations where positive and negative links coexist. However, SGNN models suffer from poor explainability, which limits their adoption in critical scenarios that require understanding the rationale behind predictions. To the best of our knowledge, there is currently no research work on the explainability of the SGNN models. Our goal is to address the explainability of decision-making for the downstream task of link sign prediction specific to signed graph neural networks. Since post-hoc explanations are not derived directly from the models, they may be biased and misrepresent the true explanations. Therefore, in this paper we introduce a Self-Explainable Signed Graph transformer (SE-SGformer) framework, which outputs explainable information while ensuring high prediction accuracy. Specifically, we propose a new Transformer architecture for signed graphs and theoretically demonstrate that using positional encoding based on signed random walks has greater expressive power than current SGNN methods and other positional encoding graph Transformer-based approaches. We construct a novel explainable decision process by discovering the $K$-nearest (farthest) positive (negative) neighbors of a node to replace the neural network-based decoder for predicting edge signs. These $K$ positive (negative) neighbors represent crucial information about the formation of positive (negative) edges between nodes and thus can serve as important explanatory information in the decision-making process. We conducted experiments on several real-world datasets to validate the effectiveness of SE-SGformer, which outperforms the state-of-the-art methods, improving prediction accuracy by 2.2\% and explainability accuracy by 73.1\% in the best-case scenario.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08746 [pdf, other]

Accelerating Iteratively Linear Detectors in Multi-User (ELAA-)MIMO Systems with UW-SVD

Authors: Jiuyu Liu, Yi Ma, Jinfei Wang, Rahim Tafazolli

Abstract: Current iterative multiple-input multiple-output (MIMO) detectors suffer from slow convergence when the wireless channel is ill-conditioned. The ill-conditioning is mainly caused by spatial correlation between channel columns corresponding to the same user equipment, known as intra-user interference. In addition, in the emerging MIMO systems using an extremely large aperture array (ELAA), spatial non-stationarity can make the channel even more ill-conditioned. In this paper, user-wise singular value decomposition (UW-SVD) is proposed to accelerate the convergence of iterative MIMO detectors. Its basic principle is to perform SVD on each user's sub-channel matrix to eliminate intra-user interference. Then, the MIMO signal model is effectively transformed into an equivalent signal (e-signal) model, comprising an e-channel matrix and an e-signal vector. Existing iterative algorithms can be used to recover the e-signal vector, which undergoes post-processing to obtain the signal vector. It is proven that the e-channel matrix is better conditioned than the original MIMO channel for spatially correlated (ELAA-)MIMO channels. This implies that UW-SVD can accelerate current iterative algorithms, which is confirmed by our simulation results. Specifically, it can speed up convergence by up to 10 times in both uncoded and coded systems.

Submitted 16 August, 2024; originally announced August 2024.

Comments: This work has been accepted by IEEE Transactions on Wireless Communications. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2408.08671 [pdf, other]

Towards Physical World Backdoor Attacks against Skeleton Action Recognition

Authors: Qichen Zheng, Yi Yu, Siyuan Yang, Jun Liu, Kwok-Yan Lam, Alex Kot

Abstract: Skeleton Action Recognition (SAR) has attracted significant interest for its efficient representation of the human skeletal structure. Despite its advancements, recent studies have raised security concerns in SAR models, particularly their vulnerability to adversarial attacks. However, such strategies are limited to digital scenarios and ineffective in physical attacks, limiting their real-world applicability. To investigate the vulnerabilities of SAR in the physical world, we introduce the Physical Skeleton Backdoor Attacks (PSBA), the first exploration of physical backdoor attacks against SAR. Considering the practicalities of physical execution, we introduce a novel trigger implantation method that integrates infrequent and imperceivable actions as triggers into the original skeleton data. By incorporating a minimal amount of this manipulated data into the training set, PSBA enables the system to misclassify any skeleton sequences into the target class when the trigger action is present. We examine the resilience of PSBA in both poisoned and clean-label scenarios, demonstrating its efficacy across a range of datasets, poisoning ratios, and model architectures. Additionally, we introduce a trigger-enhancing strategy to strengthen attack performance in the clean-label setting. The robustness of PSBA is tested against three distinct backdoor defenses, and the stealthiness of PSBA is evaluated using two quantitative metrics. Furthermore, by employing a Kinect V2 camera, we compile a dataset of human actions from the real world to mimic physical attack situations, with our findings confirming the effectiveness of our proposed attacks. Our project website can be found at https://qichenzheng.github.io/psba-website.

Submitted 16 August, 2024; originally announced August 2024.

Comments: Accepted by ECCV 2024

arXiv:2408.08642 [pdf, other]

The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy

Authors: Jiating Ma, Yipeng Zhou, Qi Li, Quan Z. Sheng, Laizhong Cui, Jiangchuan Liu

Abstract: To preserve data privacy, the federated learning (FL) paradigm has emerged, in which clients only expose model gradients rather than original data for model training. To enhance the protection of model gradients in FL, differentially private federated learning (DPFL) has been proposed, which incorporates differentially private (DP) noise to obfuscate gradients before they are exposed. Yet, an essential but largely overlooked problem in DPFL is the heterogeneity of clients' privacy requirements, which can vary significantly between clients and greatly complicates the client selection problem in DPFL. In other words, both the data quality and the influence of DP noise should be taken into account when selecting clients. To address this problem, we conduct convergence analysis of DPFL under heterogeneous privacy, a generic client selection strategy, popular DP mechanisms and convex loss. Based on convergence analysis, we formulate the client selection problem to minimize the value of loss function in DPFL with heterogeneous privacy, which is a convex optimization problem and can be solved efficiently. Accordingly, we propose the DPFL-BCS (biased client selection) algorithm. The extensive experiment results with real datasets under both convex and non-convex loss functions indicate that DPFL-BCS can remarkably improve model utility compared with the SOTA baselines.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08600 [pdf, other]

MM-UNet: A Mixed MLP Architecture for Improved Ophthalmic Image Segmentation

Authors: Zunjie Xiao, Xiaoqing Zhang, Risa Higashita, Jiang Liu

Abstract: Ophthalmic image segmentation serves as a critical foundation for ocular disease diagnosis. Although fully convolutional neural networks (CNNs) are commonly employed for segmentation, they are constrained by inductive biases and face challenges in establishing long-range dependencies. Transformer-based models address these limitations but introduce substantial computational overhead. Recently, a simple yet efficient Multilayer Perceptron (MLP) architecture was proposed for image classification, achieving competitive performance relative to advanced transformers. However, its effectiveness for ophthalmic image segmentation remains unexplored. In this paper, we introduce MM-UNet, an efficient Mixed MLP model tailored for ophthalmic image segmentation. Within MM-UNet, we propose a multi-scale MLP (MMLP) module that facilitates the interaction of features at various depths through a grouping strategy, enabling simultaneous capture of global and local information. We conducted extensive experiments on both a private anterior segment optical coherence tomography (AS-OCT) image dataset and a public fundus image dataset. The results demonstrated the superiority of our MM-UNet model in comparison to state-of-the-art deep segmentation networks.

Submitted 16 August, 2024; originally announced August 2024.

Comments: OMIA2024

arXiv:2408.08575 [pdf, other]

Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs

Authors: Jinming Liu, Yuntao Wei, Junyan Lin, Shengyang Zhao, Heming Sun, Zhibo Chen, Wenjun Zeng, Xin Jin

Abstract: We present a new image compression paradigm to achieve "intelligently coding for machine" by cleverly leveraging the common sense of Large Multimodal Models (LMMs). We are motivated by the evidence that large language/multimodal models are powerful general-purpose semantics predictors for understanding the real world. Different from traditional image compression typically optimized for human eyes, the image coding for machines (ICM) framework we focus on requires the compressed bitstream to comply better with different downstream intelligent analysis tasks. To this end, we employ LMM to tell codec what to compress: 1) first utilize the powerful semantic understanding capability of LMMs w.r.t. object grounding, identification, and importance ranking via prompts, to disentangle image content before compression, 2) and then based on these semantic priors we accordingly encode and transmit objects of the image in order with a structured bitstream. In this way, diverse vision benchmarks including image classification, object detection, instance segmentation, etc., can be well supported with such a semantically structured bitstream. We dub our method "SDComp" for "Semantically Disentangled Compression", and compare it with state-of-the-art codecs on a wide variety of different vision tasks. SDComp codec leads to more flexible reconstruction results, promised decoded visual quality, and a more generic/satisfactory intelligent task-supporting ability.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08567 [pdf, other]

doi 10.1109/JSTSP.2024.3446173

S$^3$Attention: Improving Long Sequence Attention with Smoothed Skeleton Sketching

Authors: Xue Wang, Tian Zhou, Jianqing Zhu, Jialin Liu, Kun Yuan, Tao Yao, Wotao Yin, Rong Jin, HanQin Cai

Abstract: Attention based models have achieved many remarkable breakthroughs in numerous applications. However, the quadratic complexity of Attention makes the vanilla Attention based models hard to apply to long sequence tasks. Various improved Attention structures are proposed to reduce the computation cost by inducing low rankness and approximating the whole sequence by sub-sequences. The most challenging part of those approaches is maintaining the proper balance between information preservation and computation reduction: the longer the sub-sequences used, the better information is preserved, but at the price of introducing more noise and computational costs. In this paper, we propose a smoothed skeleton sketching based Attention structure, coined S$^3$Attention, which significantly improves upon the previous attempts to negotiate this trade-off. S$^3$Attention has two mechanisms to effectively minimize the impact of noise while keeping the linear complexity to the sequence length: a smoothing block to mix information over long sequences and a matrix sketching method that simultaneously selects columns and rows from the input matrix. We verify the effectiveness of S$^3$Attention both theoretically and empirically. Extensive studies over Long Range Arena (LRA) datasets and six time-series forecasting show that S$^3$Attention significantly outperforms both vanilla Attention and other state-of-the-art variants of Attention structures.

Submitted 23 August, 2024; v1 submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08519 [pdf, other]

An inexact golden ratio primal-dual algorithm with linesearch step for a saddle point problem

Authors: Changjie Fang, Jinxiu Liu, Jingtao Qiu, Shenglan Chen

Abstract: In this paper, we propose an inexact golden ratio primal-dual algorithm with linesearch step (IP-GRPDAL) for solving saddle point problems, where two subproblems can be approximately solved by applying the notions of inexact extended proximal operators with matrix norm. Our proposed IP-GRPDAL method allows for larger stepsizes by replacing the extrapolation step with a convex combination step. Each iteration of the linesearch requires updating only the dual variable, and hence it is quite cheap. In addition, we prove convergence of the proposed algorithm and show an O(1/N) ergodic convergence rate for our algorithm, where N represents the number of iterations. When one of the component functions is strongly convex, the accelerated O(1/N^2) convergence rate results are established by adaptively choosing some algorithmic parameters. Furthermore, when both component functions are strongly convex, the linear convergence rate results are achieved. Numerical simulation results on the sparse recovery and image deblurring problems illustrate the feasibility and efficiency of our inexact algorithms.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2408.08505 [pdf, ps, other]

Fluctuations in Wasserstein dynamics on Graphs

Authors: Yuan Gao, Wuchen Li, Jian-Guo Liu

Abstract: In this paper, we propose a drift-diffusion process on the probability simplex to study stochastic fluctuations in probability spaces. We construct a counting process for linear detailed balanced chemical reactions with finite species such that its thermodynamic limit is a system of ordinary differential equations (ODE) on the probability simplex. This ODE can be formulated as a gradient flow with an Onsager response matrix that induces a Riemannian metric on the probability simplex. After incorporating the induced Riemannian structure, we propose a diffusion approximation of the rescaled counting process for molecular species in the chemical reactions, which leads to Langevin dynamics on the probability simplex with a degenerate Brownian motion constructed from the eigen-decomposition of Onsager's response matrix. The corresponding Fokker-Planck equation on the simplex can be regarded as the usual drift-diffusion equation with the extrinsic representation of the divergence and Laplace-Beltrami operator. The invariant measure is the Gibbs measure, which is the product of the original macroscopic free energy and a volume element. When the drift term vanishes, the Fokker-Planck equation reduces to the heat equation with the Laplace-Beltrami operator, which we refer to as canonical Wasserstein diffusions on graphs. In the case of a two-point probability simplex, the constructed diffusion process is converted to a one-dimensional Wright-Fisher diffusion process, which leads to a natural boundary condition ensuring that the process remains within the probability simplex.

Submitted 15 August, 2024; originally announced August 2024.

Comments: 24 pages

arXiv:2408.08442 [pdf, other]

A semi-centralized multi-agent RL framework for efficient irrigation scheduling

Authors: Bernard T. Agyeman, Benjamin Decard-Nelson, Jinfeng Liu, Sirish L. Shah

Abstract: This paper proposes a Semi-Centralized Multi-Agent Reinforcement Learning (SCMARL) approach for irrigation scheduling in spatially variable agricultural fields, where management zones address spatial variability. The SCMARL framework is hierarchical in nature, with a centralized coordinator agent at the top level and decentralized local agents at the second level. The coordinator agent makes daily binary irrigation decisions based on field-wide conditions, which are communicated to the local agents. Local agents determine appropriate irrigation amounts for specific management zones using local conditions. The framework employs a state augmentation approach to handle non-stationarity in the local agents' environments. An extensive evaluation on a large-scale field in Lethbridge, Canada, compares the SCMARL approach with a learning-based multi-agent model predictive control scheduling approach, highlighting its enhanced performance, resulting in water conservation and improved Irrigation Water Use Efficiency (IWUE). Notably, the proposed approach achieved a 4.0% savings in irrigation water while enhancing the IWUE by 6.3%.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.08143 [pdf, other]

Unlearnable Examples Detection via Iterative Filtering

Authors: Yi Yu, Qichen Zheng, Siyuan Yang, Wenhan Yang, Jun Liu, Shijian Lu, Yap-Peng Tan, Kwok-Yan Lam, Alex Kot

Abstract: Deep neural networks are proven to be vulnerable to data poisoning attacks. Recently, a specific type of data poisoning attack known as availability attacks has led to the failure of data utilization for model learning by adding imperceptible perturbations to images. Consequently, it is quite beneficial and challenging to detect poisoned samples, also known as Unlearnable Examples (UEs), from a mixed dataset. In response, we propose an Iterative Filtering approach for UEs identification. This method leverages the distinction between the inherent semantic mapping rules and shortcuts, without the need for any additional information. We verify that when training a classifier on a mixed dataset containing both UEs and clean data, the model tends to quickly adapt to the UEs compared to the clean data. Due to the accuracy gaps between training with clean/poisoned samples, we employ a model to misclassify clean samples while correctly identifying the poisoned ones. The incorporation of additional classes and iterative refinement enhances the model's ability to differentiate between clean and poisoned samples. Extensive experiments demonstrate the superiority of our method over state-of-the-art detection approaches across various attacks, datasets, and poison ratios, significantly reducing the Half Total Error Rate (HTER) compared to existing methods.

Submitted 15 August, 2024; originally announced August 2024.

Comments: Accepted by ICANN 2024

arXiv:2408.08072 [pdf, other]

I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm

Authors: Yiming Liang, Ge Zhang, Xingwei Qu, Tianyu Zheng, Jiawei Guo, Xinrun Du, Zhenzhu Yang, Jiaheng Liu, Chenghua Lin, Lei Ma, Wenhao Huang, Jiajun Zhang

Abstract: Large Language Models (LLMs) have achieved significant advancements; however, the common learning paradigm treats LLMs as passive information repositories, neglecting their potential for active learning and alignment. Some approaches train LLMs using their own generated synthetic data, exploring the possibility of active alignment. However, there is still a huge gap between these one-time alignment methods and the continuous automatic alignment of humans. In this paper, we introduce I-SHEEP, an Iterative Self-EnHancEmEnt Paradigm. This human-like paradigm enables LLMs to continuously self-align from scratch with nothing. Compared to the one-time alignment method Dromedary (Sun et al., 2023), which refers to the first iteration in this paper, I-SHEEP can significantly enhance capacities on both Qwen and Llama models. I-SHEEP achieves a maximum relative improvement of 78.2% in the Alpaca Eval, 24.0% in the MT Bench, and an absolute increase of 8.88% in the IFEval accuracy over subsequent iterations in Qwen-1.5 72B model. Additionally, I-SHEEP surpasses the base model in various standard benchmark generation tasks, achieving an average improvement of 24.77% in code generation tasks, 12.04% in TrivialQA, and 20.29% in SQuAD. We also provide new insights based on the experiment results. Our codes, datasets, and models are available at https://anonymous.4open.science/r/I-SHEEP.

Submitted 27 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.08011 [pdf, other]

Intensity correlations in measurement-device-independent quantum key distribution

Authors: Junxuan Liu, Tianyi Xing, Ruiyin Liu, Zihao Chen, Hao Tan, Anqi Huang

Abstract: The intensity correlations due to imperfect modulation during the quantum-state preparation in a measurement-device-independent quantum key distribution (MDI QKD) system compromise its security performance. Therefore, it is crucial to assess the impact of intensity correlations on the practical security of MDI QKD systems. In this work, we propose a theoretical model that quantitatively analyzes the secure key rate of MDI QKD systems under intensity correlations. Furthermore, we apply the theoretical model to a practical MDI QKD system with measured intensity correlations, which shows that the system struggles to generate keys efficiently under this model. We also explore the boundary conditions of intensity correlations to generate secret keys. This study extends the security analysis of intensity correlations to MDI QKD protocols, providing a methodology to evaluate the practical security of MDI QKD systems.

Submitted 18 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07960 [pdf, other]

Characterization of Intensity Correlation via Single-photon Detection in Quantum Key Distribution

Authors: Tianyi Xing, Junxuan Liu, Likang Zhang, Min-Yan Wang, Yu-Huai Li, Ruiyin Liu, Qingquan Peng, Dongyang Wang, Yaxuan Wang, Hongwei Liu, Wei Li, Yuan Cao, Anqi Huang

Abstract: One of the most significant vulnerabilities in the source unit of quantum key distribution (QKD) is the correlation between quantum states after modulation, which shall be characterized and evaluated for its practical security performance. In this work, we propose a methodology to characterize the intensity correlation according to the single-photon detection results in the measurement unit without modifying the configuration of the QKD system. In contrast to previous research that employs an extra classical optical detector to measure the correlation, our method can directly analyse the detection data generated during the raw key exchange, making it possible to characterize the correlation during real-time system operation. The basic method is applied to a BB84 QKD system and the characterized correlation decreases the secure key rate shown by the security proof. Furthermore, the method is extended and applied to characterize the correlation from the result of Bell-state measurement, which demonstrates its applicability to a running full-scheme MDI QKD system. This study provides an approach for standard certification of a QKD system.

Submitted 18 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07954 [pdf, ps, other]

A Complex Scaling Method for Efficient and Accurate Scattering Emulation in Nuclear Reactions

Authors: Junzhe Liu, Jin Lei, Zhongzhou Ren

Abstract: We present a novel scattering emulator utilizing the complex scaling method to enhance nuclear reaction analysis. This approach leverages a single set of reduced bases, allowing for efficient and simultaneous emulation across multiple channels and potential parameters, significantly reducing computational storage and accelerating calculations. Demonstrated through $n$+$^{40}$Ca and $^{11}$Be+$^{64}$Zn elastic scattering, our method achieves high accuracy and efficiency. This emulator exhibits stable and reliable performance without anomalies inherent in other techniques, showcasing its robustness.

Submitted 15 August, 2024; originally announced August 2024.

Comments: 5 pages, 3 figures

arXiv:2408.07852 [pdf, other]

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability

Authors: Jiri Hron, Laura Culp, Gamaleldin Elsayed, Rosanne Liu, Ben Adlam, Maxwell Bileschi, Bernd Bohnet, JD Co-Reyes, Noah Fiedel, C. Daniel Freeman, Izzeddin Gur, Kathleen Kenealy, Jaehoon Lee, Peter J. Liu, Gaurav Mishra, Igor Mordatch, Azade Nova, Roman Novak, Aaron Parisi, Jeffrey Pennington, Alex Rizkowsky, Isabelle Simpson, Hanie Sedghi, Jascha Sohl-dickstein, Kevin Swersky, et al. (6 additional authors not shown)

Abstract: While many capabilities of language models (LMs) improve with increased training budget, the influence of scale on hallucinations is not yet fully understood. Hallucinations come in many forms, and there is no universally accepted definition. We thus focus on studying only those hallucinations where a correct answer appears verbatim in the training set. To fully control the training data content, we construct a knowledge graph (KG)-based dataset, and use it to train a set of increasingly large LMs. We find that for a fixed dataset, larger and longer-trained LMs hallucinate less. However, hallucinating on $\leq5$% of the training data requires an order of magnitude larger model, and thus an order of magnitude more compute, than Hoffmann et al. (2022) reported was optimal. Given this costliness, we study how hallucination detectors depend on scale. While we see detector size improves performance on fixed LM's outputs, we find an inverse relationship between the scale of the LM and the detectability of its hallucinations.

Submitted 14 August, 2024; originally announced August 2024.

Comments: Published at COLM 2024. 16 pages, 11 figures

arXiv:2408.07759 [pdf, other]

SWaT: Statistical Modeling of Video Watch Time through User Behavior Analysis

Authors: Shentao Yang, Haichuan Yang, Linna Du, Adithya Ganesh, Bo Peng, Boying Liu, Serena Li, Ji Liu

Abstract: The significance of estimating video watch time has been highlighted by the rising importance of (short) video recommendation, which has become a core product of mainstream social media platforms. Modeling video watch time, however, has been challenged by the complexity of user-video interaction, such as different user behavior modes in watching the recommended videos and varying watching probabilities over the video horizon. Despite the importance and challenges, existing literature on modeling video watch time mostly focuses on relatively black-box mechanical enhancement of the classical regression/classification losses, without factoring in user behavior in a principled manner. In this paper, we for the first time take a user-centric perspective to model video watch time, from which we propose a white-box statistical framework that directly translates various user behavior assumptions in watching (short) videos into statistical watch time models. These behavior assumptions are informed by our domain knowledge of users' behavior modes in video watching. We further employ bucketization to cope with users' non-stationary watching probability over the video horizon, which additionally helps to respect the constraint of video length and facilitate the practical compatibility between the continuous regression event of watch time and other binary classification events. We test our models extensively on two public datasets, a large-scale offline industrial dataset, and an online A/B test on a short video platform with hundreds of millions of daily-active users. In all experiments, our models perform competitively against strong relevant baselines, demonstrating the efficacy of our user-centric perspective and proposed framework.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.07719 [pdf, other]

Operator Feature Neural Network for Symbolic Regression

Authors: Yusong Deng, Min Wu, Lina Yu, Jingyi Liu, Shu Wei, Yanjie Li, Weijun Li

Abstract: Symbolic regression is a task aimed at identifying patterns in data and representing them through mathematical expressions, generally involving skeleton prediction and constant optimization. Many methods have achieved some success; however, they treat variables and symbols merely as characters of natural language without considering their mathematical essence. This paper introduces the operator feature neural network (OF-Net), which employs operator representation for expressions and proposes an implicit feature encoding method for the intrinsic mathematical operational logic of operators. By substituting operator features for numeric loss, we can predict the combination of operators of target expressions. We evaluate the model on public datasets, and the results demonstrate that the model achieves superior recovery rates and high $R^2$ scores. In discussing the results, we analyze the merits and demerits of OF-Net and propose optimizing schemes.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 12 pages

arXiv:2408.07641 [pdf, other]

Exploring New Physics with PandaX-4T Low Energy Electronic Recoil Data

Authors: PandaX Collaboration, Xinning Zeng, Zihao Bo, Wei Chen, Xun Chen, Yunhua Chen, Zhaokan Cheng, Xiangyi Cui, Yingjie Fan, Deqing Fang, Zhixing Gao, Lisheng Geng, Karl Giboni, Xunan Guo, Xuyuan Guo, Zichao Guo, Chencheng Han, Ke Han, Changda He, Jinrong He, Di Huang, Houqi Huang, Junting Huang, Ruquan Hou, Yu Hou, Xiangdong Ji, et al. (76 additional authors not shown)

Abstract: New particles beyond the Standard Model of particle physics, such as axions, can be effectively searched through their interactions with electrons. We use the large liquid xenon detector PandaX-4T to search for novel electronic recoil signals induced by solar axions, neutrinos with anomalous magnetic moment, axion-like particles, dark photons, and light fermionic dark matter. A detailed background model is established with the latest datasets with 1.54 $\rm tonne \cdot year$ exposure. No significant excess above the background has been observed, and we have obtained competitive constraints for axion couplings, neutrino magnetic moment, and fermionic dark matter interactions.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2408.07367 [pdf, other]

Risk Occupancy: A New and Efficient Paradigm through Vehicle-Road-Cloud Collaboration

Authors: Jiaxing Chen, Wei Zhong, Bolin Gao, Yifei Liu, Hengduo Zou, Jiaxi Liu, Yanbo Lu, Jin Huang, Zhihua Zhong

Abstract: This study introduces the 4D Risk Occupancy within a vehicle-road-cloud architecture, integrating the road surface spatial, risk, and temporal dimensions, and endowing the algorithm with beyond-line-of-sight, all-angles, and efficient abilities. The algorithm simplifies risk modeling by focusing on directly observable information and key factors, drawing on the concept of Occupancy Grid Maps (OGM), and incorporating temporal prediction to effectively map current and future risk occupancy. Compared to conventional driving risk fields and grid occupancy maps, this algorithm can map global risks more efficiently, simply, and reliably. It can integrate future risk information, adapting to dynamic traffic environments. The 4D Risk Occupancy also unifies the expression of BEV detection and lane line detection results, enhancing the intuitiveness and unity of environmental perception. Using DAIR-V2X data, this paper validates the 4D Risk Occupancy algorithm and develops a local path planning model based on it. Qualitative experiments under various road conditions demonstrate the practicality and robustness of this local path planning model. Quantitative analysis shows that the path planning based on risk occupation significantly improves trajectory planning performance, increasing safety redundancy by 12.5% and reducing average deceleration by 5.41% at an initial braking speed of 8 m/s, thereby improving safety and comfort.
This work provides a new global perception method and local path planning method through Vehicle-Road-Cloud architecture, offering a new perceptual paradigm for achieving safer and more efficient autonomous driving.

Submitted 17 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

Comments: 13 pages, 9 figures

Showing 51–100 of 10,486 results for author: Liu, J