Showing 1–50 of 871 results for author: Gao, X

Searching in archive cs.
  1. arXiv:2407.13252  [pdf, other]

    cs.CV

    Unveiling Structural Memorization: Structural Membership Inference Attack for Text-to-Image Diffusion Models

    Authors: Qiao Li, Xiaomeng Fu, Xi Wang, Jin Liu, Xingyu Gao, Jiao Dai, Jizhong Han

    Abstract: With the rapid advancements of large-scale text-to-image diffusion models, various practical applications have emerged, bringing significant convenience to society. However, model developers may misuse the unauthorized data to train diffusion models. These data are at risk of being memorized by the models, thus potentially violating citizens' privacy rights. Therefore, in order to judge whether a…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.12768  [pdf, other]

    quant-ph cs.CC cs.IT math-ph physics.atom-ph

    A polynomial-time classical algorithm for noisy quantum circuits

    Authors: Thomas Schuster, Chao Yin, Xun Gao, Norman Y. Yao

    Abstract: We provide a polynomial-time classical algorithm for noisy quantum circuits. The algorithm computes the expectation value of any observable for any circuit, with a small average error over input states drawn from an ensemble (e.g. the computational basis). Our approach is based upon the intuition that noise exponentially damps non-local correlations relative to local correlations. This enables one…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures + 22 page Supplementary Information

  3. arXiv:2407.12317  [pdf, other]

    cs.CV

    Out of Length Text Recognition with Sub-String Matching

    Authors: Yongkun Du, Zhineng Chen, Caiyan Jia, Xieping Gao, Yu-Gang Jiang

    Abstract: Scene Text Recognition (STR) methods have demonstrated robust performance in word-level text recognition. However, in applications the text image is sometimes long due to detected with multiple horizontal words. It triggers the requirement to build long text recognition models from readily available short word-level text datasets, which has been less studied previously. In this paper, we term this…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: preprint, 15 pages

  4. arXiv:2407.11011  [pdf, other]

    cs.CR cs.CV cs.LG

    Toward Availability Attacks in 3D Point Clouds

    Authors: Yifan Zhu, Yibo Miao, Yinpeng Dong, Xiao-Shan Gao

    Abstract: Despite the great progress of 3D vision, data privacy and security issues in 3D deep learning are not explored systematically. In the domain of 2D images, many availability attacks have been proposed to prevent data from being illicitly learned by unauthorized deep models. However, unlike images represented on a fixed dimensional grid, point clouds are characterized as unordered and unstructured s…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: ICML 2024, 21 pages

  5. arXiv:2407.10848  [pdf, other]

    cs.IT eess.SP

    On the Spectral Efficiency of Multi-user Holographic MIMO Uplink Transmission

    Authors: Mengyu Qian, Li You, Xiang-Gen Xia, Xiqi Gao

    Abstract: With antenna spacing much less than half a wavelength in confined space, holographic multiple-input multiple-output (HMIMO) technology presents a promising frontier in next-generation mobile communication. We delve into the research of the multi-user uplink transmission with both the base station and the users equipped with holographic planar arrays. To begin, we construct an HMIMO channel model u…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 14 pages, 7 figures, to appear in IEEE Transactions on Wireless Communications

  6. arXiv:2407.10439  [pdf, other]

    cs.CV

    PolyRoom: Room-aware Transformer for Floorplan Reconstruction

    Authors: Yuzhou Liu, Lingjie Zhu, Xiaodong Ma, Hanqiao Ye, Xiang Gao, Xianwei Zheng, Shuhan Shen

    Abstract: Reconstructing geometry and topology structures from raw unstructured data has always been an important research topic in indoor mapping research. In this paper, we aim to reconstruct the floorplan with a vectorized representation from point clouds. Despite significant advancements achieved in recent years, current methods still encounter several challenges, such as missing corners or edges, inacc…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  7. arXiv:2407.10281  [pdf, other]

    cs.CV

    Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

    Authors: Xinyuan Gao, Songlin Dong, Yuhang He, Qiang Wang, Yihong Gong

    Abstract: The problem of Rehearsal-Free Continual Learning (RFCL) aims to continually learn new knowledge while preventing forgetting of the old knowledge, without storing any old samples and prototypes. The latest methods leverage large-scale pre-trained models as the backbone and use key-query matching to generate trainable prompts to learn new knowledge. However, the domain gap between the pre-training d…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  8. arXiv:2407.10172  [pdf, other]

    cs.CV

    Restoring Images in Adverse Weather Conditions via Histogram Transformer

    Authors: Shangquan Sun, Wenqi Ren, Xinwei Gao, Rui Wang, Xiaochun Cao

    Abstract: Transformer-based image restoration methods in adverse weather have achieved significant progress. Most of them use self-attention along the channel dimension or within spatially fixed-range blocks to reduce computational load. However, such a compromise results in limitations in capturing long-range spatial features. Inspired by the observation that the weather-induced degradation factors mainly…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 19 pages, 7 figures, 10MB

  9. arXiv:2407.07805  [pdf, other]

    cs.CV

    SUMix: Mixup with Semantic and Uncertain Information

    Authors: Huafeng Qin, Xin Jin, Hongyu Zhu, Hongchao Liao, Mounîm A. El-Yacoubi, Xinbo Gao

    Abstract: Mixup data augmentation approaches have been applied for various tasks of deep learning to improve the generalization ability of deep neural networks. Some existing approaches CutMix, SaliencyMix, etc. randomly replace a patch in one image with patches from another to generate the mixed image. Similarly, the corresponding labels are linearly combined by a fixed ratio $λ$ by l. The objects in two i…

    Submitted 17 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024 [Camera Ready] (19 pages, 7 figures) with the source code at https://github.com/JinXins/SUMix

  10. arXiv:2407.07520  [pdf, other]

    cs.CV

    IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection

    Authors: Mingjin Zhang, Yuchun Wang, Jie Guo, Yunsong Li, Xinbo Gao, Jing Zhang

    Abstract: The recent Segment Anything Model (SAM) is a significant advancement in natural image segmentation, exhibiting potent zero-shot performance suitable for various downstream image segmentation tasks. However, directly utilizing the pretrained SAM for Infrared Small Target Detection (IRSTD) task falls short in achieving satisfying performance due to a notable domain gap between natural and infrared i…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 18 pages, 8 figures, to be published in ECCV2024

  11. arXiv:2407.07372  [pdf, other]

    eess.IV cs.CV

    Trustworthy Contrast-enhanced Brain MRI Synthesis

    Authors: Jiyao Liu, Yuxin Li, Shangqi Gao, Yuncheng Zhou, Xin Gao, Ningsheng Xu, Xiao-Yong Zhang, Xiahai Zhuang

    Abstract: Contrast-enhanced brain MRI (CE-MRI) is a valuable diagnostic technique but may pose health risks and incur high costs. To create safer alternatives, multi-modality medical image translation aims to synthesize CE-MRI images from other available modalities. Although existing methods can generate promising predictions, they still face two challenges, i.e., exhibiting over-confidence and lacking inte…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

  12. arXiv:2407.05965  [pdf, other]

    cs.CV cs.AI cs.CL cs.CR cs.LG

    T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

    Authors: Yibo Miao, Yifan Zhu, Yinpeng Dong, Lijia Yu, Jun Zhu, Xiao-Shan Gao

    Abstract: The recent development of Sora leads to a new era in text-to-video (T2V) generation. Along with this comes the rising concern about its security risks. The generated videos may contain illegal or unethical content, and there is a lack of comprehensive quantitative understanding of their safety, posing a challenge to their reliability and practical deployment. Previous evaluations primarily focus o…

    Submitted 8 July, 2024; originally announced July 2024.

  13. Ubiquitous Integrated Sensing and Communications for Massive MIMO LEO Satellite Systems

    Authors: Li You, Yongxiang Zhu, Xiaoyu Qiang, Christos G. Tsinos, Wenjin Wang, Xiqi Gao, Björn Ottersten

    Abstract: The next sixth generation (6G) networks are envisioned to integrate sensing and communications in a single system, thus greatly improving spectrum utilization and reducing hardware costs. Low earth orbit (LEO) satellite communications combined with massive multiple-input multiple-output (MIMO) technology holds significant promise in offering ubiquitous and seamless connectivity with high data rate…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 6 pages, 4 figures

    Journal ref: IEEE Internet of Things Magazine, vol. 7, no. 4, pp. 30-35, Jul. 2024

  14. Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation

    Authors: Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu

    Abstract: Recently, large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing open-domain image translation via user-provided text prompts. This paper proposes frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework that contributes a novel solution to text-guided I2I from a frequency-domain perspective…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024)

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(3), 1824-1832

  15. arXiv:2407.02805  [pdf, other]

    cs.SE cs.AI

    Efficient DNN-Powered Software with Fair Sparse Models

    Authors: Xuanqi Gao, Weipeng Jiang, Juan Zhai, Shiqing Ma, Xiaoyu Zhang, Chao Shen

    Abstract: With the emergence of the Software 3.0 era, there is a growing trend of compressing and integrating large models into software systems, with significant societal implications. Regrettably, in numerous instances, model compression techniques impact the fairness performance of these models and thus the ethical behavior of DNN-powered software. One of the most notable examples is the Lottery Ticket Hy…

    Submitted 3 July, 2024; originally announced July 2024.

  16. arXiv:2407.01219  [pdf, other]

    cs.CL

    Searching for Best Practices in Retrieval-Augmented Generation

    Authors: Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang

    Abstract: Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolong…

    Submitted 1 July, 2024; originally announced July 2024.

  17. arXiv:2407.00752  [pdf, other]

    cs.CV cs.AI

    Chest-Diffusion: A Light-Weight Text-to-Image Model for Report-to-CXR Generation

    Authors: Peng Huang, Xue Gao, Lihong Huang, Jing Jiao, Xiaokang Li, Yuanyuan Wang, Yi Guo

    Abstract: Text-to-image generation has important implications for generation of diverse and controllable images. Several attempts have been made to adapt Stable Diffusion (SD) to the medical domain. However, the large distribution difference between medical reports and natural texts, as well as high computational complexity in common stable diffusion limit the authenticity and feasibility of the generated m…

    Submitted 30 June, 2024; originally announced July 2024.

  18. Personalized Federated Continual Learning via Multi-granularity Prompt

    Authors: Hao Yu, Xin Yang, Xin Gao, Yan Kang, Hao Wang, Junbo Zhang, Tianrui Li

    Abstract: Personalized Federated Continual Learning (PFCL) is a new practical scenario that poses greater challenges in sharing and personalizing knowledge. PFCL not only relies on knowledge fusion for server aggregation at the global spatial-temporal perspective but also needs model improvement for each client according to the local requirements. Existing methods, whether in Personalized Federated Learning…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024 Research Track

  19. arXiv:2406.19583  [pdf, other]

    cs.IT

    Interference Cancellation Information Geometry Approach for Massive MIMO Channel Estimation

    Authors: An-An Lu, Bingyan Liu, Xiqi Gao

    Abstract: In this paper, the interference cancellation information geometry approaches (IC-IGAs) for massive MIMO channel estimation are proposed. The proposed algorithms are low-complexity approximations of the minimum mean square error (MMSE) estimation. To illustrate the proposed algorithms, a unified framework of the information geometry approach for channel estimation and its geometric explanation are…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 38 pages, 9 figures

  20. arXiv:2406.19130  [pdf, other]

    cs.CV

    Evidential Concept Embedding Models: Towards Reliable Concept Explanations for Skin Disease Diagnosis

    Authors: Yibo Gao, Zheyao Gao, Xin Gao, Yuanye Liu, Bomin Wang, Xiahai Zhuang

    Abstract: Due to the high stakes in medical decision-making, there is a compelling demand for interpretable deep learning methods in medical image analysis. Concept Bottleneck Models (CBM) have emerged as an active interpretable framework incorporating human-interpretable concepts into decision-making. However, their concept predictions may lack reliability when applied to clinical diagnosis, impeding conce…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: accepted by MICCAI 2024

  21. arXiv:2406.18938  [pdf, other]

    cs.IR

    Towards Personalized Federated Multi-scenario Multi-task Recommendation

    Authors: Yue Ding, Yanbiao Ji, Xun Cai, Xin Xin, Xiaofeng Gao, Hongtao Lu

    Abstract: In modern recommender system applications, such as e-commerce, predicting multiple targets like click-through rate (CTR) and post-view click-through & conversion rate (CTCVR) is common. Multi-task recommender systems are gaining traction in research and practical use. Existing multi-task recommender systems tackle diverse business scenarios, merging and modeling these scenarios unlocks shared kno…

    Submitted 27 June, 2024; originally announced June 2024.

  22. arXiv:2406.18873  [pdf, other]

    cs.AR

    LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design

    Authors: Bingyang Liu, Haoyi Zhang, Xiaohan Gao, Zichen Kong, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

    Abstract: Analog layout design heavily involves interactive processes between humans and design tools. The tools are usually designed to use scripting commands or visualized buttons for manipulation, especially for those interactive automation functionalities, which have a steep learning curve and cumbersome user experience, making a notable barrier to their adoption by designers. Aiming to address such a u…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 8 pages, 8 figures

  23. arXiv:2406.18588  [pdf, other]

    cs.CV cs.LG

    Varying Manifolds in Diffusion: From Time-varying Geometries to Visual Saliency

    Authors: Junhao Chen, Manyi Li, Zherong Pan, Xifeng Gao, Changhe Tu

    Abstract: Deep generative models learn the data distribution, which is concentrated on a low-dimensional manifold. The geometric analysis of distribution transformation provides a better understanding of data structure and enables a variety of applications. In this paper, we study the geometric properties of the diffusion model, whose forward diffusion process and reverse generation process construct a seri…

    Submitted 7 June, 2024; originally announced June 2024.

  24. arXiv:2406.18539  [pdf, other]

    cs.CV cs.GR

    TexPainter: Generative Mesh Texturing with Multi-view Consistency

    Authors: Hongkun Zhang, Zherong Pan, Congyi Zhang, Lifeng Zhu, Xifeng Gao

    Abstract: The recent success of pre-trained diffusion models unlocks the possibility of the automatic generation of textures for arbitrary 3D meshes in the wild. However, these models are trained in the screen space, while converting them to a multi-view consistent texture image poses a major obstacle to the output quality. In this paper, we propose a novel method to enforce multi-view consistency. Our meth…

    Submitted 17 May, 2024; originally announced June 2024.

    Comments: accepted by Siggraph 2024

  25. arXiv:2406.17260  [pdf, other]

    cs.CL

    Mitigating Hallucination in Fictional Character Role-Play

    Authors: Nafis Sadeq, Zhouhang Xie, Byungkyu Kang, Prarit Lamba, Xiang Gao, Julian McAuley

    Abstract: Role-playing has wide-ranging applications in customer support, embodied agents, computational social science, etc. The influence of parametric world knowledge of large language models (LLMs) often causes role-playing characters to act out of character and hallucinate about things outside the scope of their knowledge. In this work, we focus on the evaluation and mitigation of hallucination in fict…

    Submitted 24 June, 2024; originally announced June 2024.

  26. arXiv:2406.16038  [pdf, other]

    cs.CV

    LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control

    Authors: Delin Qu, Qizhi Chen, Pingrui Zhang, Xianqiang Gao, Bin Zhao, Dong Wang, Xuelong Li

    Abstract: This paper aims to advance the progress of physical world interactive scene reconstruction by extending the interactive object reconstruction from single object level to complex scene level. To this end, we first construct one simulated and one real scene-level physical interaction dataset containing 28 scenes with multiple interactive objects per scene. Furthermore, to accurately model the intera…

    Submitted 23 June, 2024; originally announced June 2024.

  27. arXiv:2406.14963  [pdf, other]

    cs.LG

    Optimised Grouped-Query Attention Mechanism for Transformers

    Authors: Yuang Chen, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Grouped-query attention (GQA) has been widely adopted in LLMs to mitigate the complexity of multi-head attention (MHA). To transform an MHA to a GQA, neighbour queries in MHA are evenly split into groups where each group shares the value and key layers. In this work, we propose AsymGQA, an activation-informed approach to asymmetrically grouping an MHA to a GQA for better model performance. Our Asy…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  28. arXiv:2406.14956  [pdf, other]

    cs.LG cs.CL

    Unlocking the Global Synergies in Low-Rank Adapters

    Authors: Zixi Zhang, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Low-rank Adaption (LoRA) has been the de-facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a light-weight search algorithm that leverages zero-cost proxies to allocate the limited LoRA trainable parameters across the model for better fine-tuned performance. In addition to the allocation for the standard LoRA-adapted models, we also demonstrate the ef…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  29. arXiv:2406.13200  [pdf, other]

    cs.LG

    RobGC: Towards Robust Graph Condensation

    Authors: Xinyi Gao, Hongzhi Yin, Tong Chen, Guanhua Ye, Wentao Zhang, Bin Cui

    Abstract: Graph neural networks (GNNs) have attracted widespread attention for their impressive capability of graph representation learning. However, the increasing prevalence of large-scale graphs presents a significant challenge for GNN training due to their computational demands, limiting the applicability of GNNs in various scenarios. In response to this challenge, graph condensation (GC) is proposed as…

    Submitted 19 June, 2024; originally announced June 2024.

  30. arXiv:2406.11145  [pdf, other]

    cs.CV

    Federated Face Forgery Detection Learning with Personalized Representation

    Authors: Decheng Liu, Zhan Dang, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

    Abstract: Deep generator technology can produce high-quality fake videos that are indistinguishable, posing a serious social threat. Traditional forgery detection methods directly centralized training on data and lacked consideration of information sharing in non-public video data scenarios and data privacy. Naturally, the federated learning strategy can be applied for privacy protection, which aggregates m…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  31. arXiv:2406.10933  [pdf, other]

    cs.CV

    Improving Adversarial Robustness via Decoupled Visual Representation Masking

    Authors: Decheng Liu, Tao Chen, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

    Abstract: Deep neural networks are proven to be vulnerable to fine-designed adversarial examples, and adversarial defense algorithms draw more and more attention nowadays. Pre-processing based defense is a major strategy, as well as learning robust feature representation has been proven an effective way to boost generalization. However, existing defense works lack considering different depth-level visual fe…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  32. arXiv:2406.10887  [pdf, other]

    cs.CV

    Imperceptible Face Forgery Attack via Adversarial Semantic Mask

    Authors: Decheng Liu, Qixuan Su, Chunlei Peng, Nannan Wang, Xinbo Gao

    Abstract: With the great development of generative model techniques, face forgery detection draws more and more attention in the related field. Researchers find that existing face forgery models are still vulnerable to adversarial examples with generated pixel perturbations in the global image. These generated adversarial samples still can't achieve satisfactory performance because of the high detectability…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: The code is publicly available

  33. arXiv:2406.09822  [pdf, other]

    cs.IT cs.CV cs.LG eess.IV eess.SP

    An I2I Inpainting Approach for Efficient Channel Knowledge Map Construction

    Authors: Zhenzhou Jin, Li You, Jue Wang, Xiang-Gen Xia, Xiqi Gao

    Abstract: Channel knowledge map (CKM) has received widespread attention as an emerging enabling technology for environment-aware wireless communications. It involves the construction of databases containing location-specific channel knowledge, which are then leveraged to facilitate channel state information (CSI) acquisition and transceiver design. In this context, a fundamental challenge lies in efficientl…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

  34. arXiv:2406.09198  [pdf, other]

    cs.CV

    CLIP-Driven Cloth-Agnostic Feature Learning for Cloth-Changing Person Re-Identification

    Authors: Shuang Li, Jiaxu Leng, Guozhang Li, Ji Gan, Haosheng Chen, Xinbo Gao

    Abstract: Contrastive Language-Image Pre-Training (CLIP) has shown impressive performance in short-term Person Re-Identification (ReID) due to its ability to extract high-level semantic features of pedestrians, yet its direct application to Cloth-Changing Person Re-Identification (CC-ReID) faces challenges due to CLIP's image encoder overly focusing on clothes clues. To address this, we propose a novel fram…

    Submitted 13 June, 2024; originally announced June 2024.

  35. arXiv:2406.08098  [pdf, other]

    cs.SE

    Scalable Defect Detection via Traversal on Code Graph

    Authors: Zhengyao Liu, Xitong Zhong, Xingjing Deng, Shuo Hong, Xiang Gao, Hailong Sun

    Abstract: Detecting defects and vulnerabilities in the early stage has long been a challenge in software engineering. Static analysis, a technique that inspects code without execution, has emerged as a key strategy to address this challenge. Among recent advancements, the use of graph-based representations, particularly Code Property Graph (CPG), has gained traction due to its comprehensive depiction of cod…

    Submitted 12 June, 2024; originally announced June 2024.

  36. arXiv:2406.06031  [pdf, other]

    cs.IR

    A WT-ResNet based fault diagnosis model for the urban rail train transmission system

    Authors: Zuyu Cheng, Zhengcai Zhao, Yixiao Wang, Wentao Guo, Yufei Wang, Xiang Gao

    Abstract: This study presents a novel fault diagnosis model for urban rail transit systems based on Wavelet Transform Residual Neural Network (WT-ResNet). The model integrates the advantages of wavelet transform for feature extraction and ResNet for pattern recognition, offering enhanced diagnostic accuracy and robustness. Experimental results demonstrate the effectiveness of the proposed model in identifyi…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 10 figures

  37. arXiv:2406.05366  [pdf, other]

    cs.LG math.OC

    Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator

    Authors: Wenhao Xu, Xuefeng Gao, Xuedong He

    Abstract: Risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of risk-sensitive linear quadratic regulator in the finite horizon episodic setting. We propose a simple least-squares greedy algorithm and show that it achieves $\widetilde{\mathcal{O}}(\log N)$ regret under a specific identifiability…

    Submitted 8 June, 2024; originally announced June 2024.

  38. arXiv:2406.05361  [pdf, other]

    cs.CL

    Write Summary Step-by-Step: A Pilot Study of Stepwise Summarization

    Authors: Xiuying Chen, Shen Gao, Mingzhe Li, Qingqing Zhu, Xin Gao, Xiangliang Zhang

    Abstract: Nowadays, neural text generation has made tremendous progress in abstractive summarization tasks. However, most of the existing summarization models take in the whole document all at once, which sometimes cannot meet the needs in practice. Practically, social text streams such as news events and tweets keep growing from time to time, and can only be fed to the summarization system step by step. He…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures, published in TASLP

  39. arXiv:2406.05360  [pdf, other]

    cs.CL

    Flexible and Adaptable Summarization via Expertise Separation

    Authors: Xiuying Chen, Mingzhe Li, Shen Gao, Xin Cheng, Qingqing Zhu, Rui Yan, Xin Gao, Xiangliang Zhang

    Abstract: A proficient summarization model should exhibit both flexibility -- the capacity to handle a range of in-domain summarization tasks, and adaptability -- the competence to acquire new knowledge and adjust to unseen out-of-domain tasks. Unlike large language models (LLMs) that achieve this through parameter scaling, we propose a more parameter-efficient approach in this study. Our motivation rests o…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures, published in SIGIR 2024

  40. arXiv:2406.05358  [pdf, other]

    cs.LG math.OC

    Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management

    Authors: Huiling Meng, Ningyuan Chen, Xuefeng Gao

    Abstract: Intensity control is a type of continuous-time dynamic optimization problems with many important applications in Operations Research including queueing and revenue management. In this study, we adapt the reinforcement learning framework to intensity control using choice-based network revenue management as a case study, which is a classical problem in revenue management that features a large state…

    Submitted 8 June, 2024; originally announced June 2024.

  41. arXiv:2406.03603  [pdf, other]

    cs.LG

    Alignment Calibration: Machine Unlearning for Contrastive Learning under Auditing

    Authors: Yihan Wang, Yiwei Lu, Guojun Zhang, Franziska Boenisch, Adam Dziedzic, Yaoliang Yu, Xiao-Shan Gao

    Abstract: Machine unlearning provides viable solutions to revoke the effect of certain training data on pre-trained model parameters. Existing approaches provide unlearning recipes for classification and generative models. However, a category of important machine learning models, i.e., contrastive learning (CL) methods, is overlooked. In this paper, we fill this gap by first proposing the framework of Machi…

    Submitted 5 June, 2024; originally announced June 2024.

  42. arXiv:2406.01302  [pdf]

    cs.CV

    Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data

    Authors: Zhusi Zhong, Helen Zhang, Fayez H. Fayad, Andrew C. Lancaster, John Sollee, Shreyas Kulkarni, Cheng Ting Lin, Jie Li, Xinbo Gao, Scott Collins, Colin Greineder, Sun H. Ahn, Harrison X. Bai, Zhicheng Jiao, Michael K. Atalay

    Abstract: Purpose: Pulmonary embolism (PE) is a significant cause of mortality in the United States. The objective of this study is to implement deep learning (DL) models using Computed Tomography Pulmonary Angiography (CTPA), clinical data, and PE Severity Index (PESI) scores to predict PE mortality. Materials and Methods: 918 patients (median age 64 years, range 13-99 years, 52% female) with 3,978 CTPAs w…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  43. arXiv:2406.00639  [pdf, other]

    cs.CV

    An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition

    Authors: Haojun Xu, Yan Gao, Jie Li, Xinbo Gao

    Abstract: Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training. Previous research has focused on aligning sequences' visual and semantic spatial distributions. However, these methods extract semantic features simply. They ignore that proper prompt design for rich and fine-grained action cues can provide robust repr…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures init commit

  44. arXiv:2406.00602  [pdf, other]

    cs.SE cs.PL

    From Effectiveness to Efficiency: Comparative Evaluation of Code Generated by LCGMs for Bilingual Programming Questions

    Authors: Weipeng Jiang, Xuanqi Gao, Juan Zhai, Shiqing Ma, Xiaoyu Zhang, Chao Shen

    Abstract: Large Code Generation Models (LCGMs) have garnered significant attention and achieved promising results across various programming tasks. However, concerns arise regarding performance when using non-English prompts, as these models are primarily trained on English-centric corpora, and most programming language tokens resemble English. Existing benchmarks often rely on English programming questions…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 10 and a quarter pages, 6 figures

  45. arXiv:2406.00588  [pdf, other]

    cs.LG cs.CR math.ST

    Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

    Authors: Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang

    Abstract: The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods and there exist vast literatures on generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poison attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set and the purpo…

    Submitted 1 June, 2024; originally announced June 2024.

  46. arXiv:2405.19098  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

    Authors: Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu

    Abstract: This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model by only using output feedback of the model to input queries. Some previous methods improve the query efficiency by incorporating the gradient of a surrogate white-box model into query-based attacks due to the adversarial transferability. However, the localized gradie…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  47. arXiv:2405.19027  [pdf, other]

    cs.DC

    A Dual-functional Blockchain Framework for Solving Distributed Optimization

    Authors: Weihang Cao, Xintong Ling, Jiaheng Wang, Xiqi Gao, Zhi Ding

    Abstract: Proof of Work (PoW) has been extensively utilized as the foundation of blockchain's security, consistency, and tamper-resistance. However, long has it been criticized for its tremendous and inefficient utilization of computational power and energy. In this work, we design a dual-functional blockchain framework that uses solving optimization problems to reach consensus as an alternative to PoW, cha…

    Submitted 29 May, 2024; originally announced May 2024.

  48. arXiv:2405.18812  [pdf, other]

    cs.CV

    MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

    Authors: Ziqi Ren, Jie Li, Xuetong Xue, Xin Li, Fan Yang, Zhicheng Jiao, Xinbo Gao

    Abstract: Deciphering the human visual experience through brain activities captured by fMRI represents a compelling and cutting-edge challenge in the field of neuroscience research. Compared to merely predicting the viewed image itself, decoding brain activity into meaningful captions provides a higher-level interpretation and summarization of visual information, which naturally enhances the application fle…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  49. arXiv:2405.18700  [pdf, other]

    cs.CV

    Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction

    Authors: Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi

    Abstract: Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one's intention. While many fruitful efforts have been made to human motion prediction, most approaches focus on pose-driven prediction and inferring human motion in isolation from the contextual environment, thus leaving the body location movement in the scene behind. However, real-…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE Transactions on Image Processing

  50. arXiv:2405.18380  [pdf, other]

    cs.LG cs.AI cs.CL

    OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning

    Authors: Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu

    Abstract: The rapid advancements in Large Language Models (LLMs) have revolutionized various natural language processing tasks. However, the substantial size of LLMs presents significant challenges in training or fine-tuning. While parameter-efficient approaches such as low-rank adaptation (LoRA) have gained popularity, they often compromise performance compared to full-rank fine-tuning. In this paper, we p…

    Submitted 28 May, 2024; originally announced May 2024.