Showing 1–50 of 382 results for author: Xu, F

Searching in archive cs.
  1. arXiv:2407.12648  [pdf, ps, other]

    cs.IT eess.SP

    Blind Beamforming for Coverage Enhancement with Intelligent Reflecting Surface

    Authors: Fan Xu, Jiawei Yao, Wenhai Lai, Kaiming Shen, Xin Li, Xin Chen, Zhi-Quan Luo

    Abstract: Conventional policy for configuring an intelligent reflecting surface (IRS) typically requires channel state information (CSI), thus incurring substantial overhead costs and facing incompatibility with the current network protocols. This paper proposes a blind beamforming strategy in the absence of CSI, aiming to boost the minimum signal-to-noise ratio (SNR) among all the receiver positions, namel…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 17 pages

  2. arXiv:2407.12217  [pdf, other]

    cs.CV

    AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs

    Authors: Yunling Zheng, Zeyi Xu, Fanghui Xue, Biao Yang, Jiancheng Lyu, Shuai Zhang, Yingyong Qi, Jack Xin

    Abstract: We propose and demonstrate an alternating Fourier and image domain filtering approach for feature extraction as an efficient alternative to build a vision backbone without using the computationally intensive attention. The performance among the lightweight models reaches the state-of-the-art level on ImageNet-1K classification, and improves downstream tasks on object detection and segmentation con…

    Submitted 16 July, 2024; originally announced July 2024.

  3. arXiv:2407.10836  [pdf, other]

    cs.LG math.NA

    Data-Guided Physics-Informed Neural Networks for Solving Inverse Problems in Partial Differential Equations

    Authors: Wei Zhou, Y. F. Xu

    Abstract: Physics-informed neural networks (PINNs) represent a significant advancement in scientific machine learning by integrating fundamental physical laws into their architecture through loss functions. PINNs have been successfully applied to solve various forward and inverse problems in partial differential equations (PDEs). However, a notable challenge can emerge during the early training stages when…

    Submitted 15 July, 2024; originally announced July 2024.

  4. arXiv:2407.07124  [pdf, other]

    cs.DC cs.AI cs.LG

    FedClust: Tackling Data Heterogeneity in Federated Learning through Weight-Driven Client Clustering

    Authors: Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

    Abstract: Federated learning (FL) is an emerging distributed machine learning paradigm that enables collaborative training of machine learning models over decentralized devices without exposing their local data. One of the major challenges in FL is the presence of uneven data distributions across client devices, violating the well-known assumption of independent-and-identically-distributed (IID) training sa…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.06095  [pdf, other]

    cs.CV eess.IV

    Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation

    Authors: Xinyu Bai, Feng Xu

    Abstract: Synthetic Aperture Radar (SAR) provides all-weather, high-resolution imaging capabilities, but its unique imaging mechanism often requires expert interpretation, limiting its widespread applicability. Translating SAR images into more easily recognizable optical images using diffusion models helps address this challenge. However, diffusion models suffer from high latency due to numerous iterative i…

    Submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:2407.05505  [pdf, other]

    eess.IV cs.CV

    Dynamic Position Transformation and Boundary Refinement Network for Left Atrial Segmentation

    Authors: Fangqiang Xu, Wenxuan Tu, Fan Feng, Malitha Gunawardhana, Jiayuan Yang, Yun Gu, Jichao Zhao

    Abstract: Left atrial (LA) segmentation is a crucial technique for irregular heartbeat (i.e., atrial fibrillation) diagnosis. Most current methods for LA segmentation strictly assume that the input data is acquired using object-oriented center cropping, while this assumption may not always hold in practice due to the high cost of manual object annotation. Random cropping is a straightforward data pre-proces…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024 conference

  7. Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference

    Authors: Kai Shen, Lingfei Wu, Siliang Tang, Fangli Xu, Bo Long, Yueting Zhuang, Jian Pei

    Abstract: The visual question generation (VQG) task aims to generate human-like questions from an image and potentially other side information (e.g. answer type). Previous works on VQG fall in two aspects: i) They suffer from one image to many questions mapping problem, which leads to the failure of generating referential and meaningful questions from an image. ii) They fail to model complex implicit relati…

    Submitted 6 July, 2024; originally announced July 2024.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2024

  8. arXiv:2407.04230  [pdf, other]

    cs.CV

    A Physical Model-Guided Framework for Underwater Image Enhancement and Depth Estimation

    Authors: Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu, Fuchun Sun

    Abstract: Due to the selective absorption and scattering of light by diverse aquatic media, underwater images usually suffer from various visual degradations. Existing underwater image enhancement (UIE) approaches that combine underwater physical imaging models with neural networks often fail to accurately estimate imaging model parameters such as depth and veiling light, resulting in poor performance in ce…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  9. Temporal Prototype-Aware Learning for Active Voltage Control on Power Distribution Networks

    Authors: Feiyang Xu, Shunyu Liu, Yunpeng Qing, Yihe Zhou, Yuwen Wang, Mingli Song

    Abstract: Active Voltage Control (AVC) on the Power Distribution Networks (PDNs) aims to stabilize the voltage levels to ensure efficient and reliable operation of power systems. With the increasing integration of distributed energy resources, recent efforts have explored employing multi-agent reinforcement learning (MARL) techniques to realize effective AVC. Existing methods mainly focus on the acquisition…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 12 pages, 8 figures

  10. arXiv:2406.16505  [pdf, other]

    q-fin.CP cs.AI

    $\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning

    Authors: Feng Xu, Yan Yin, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Zongzhang Zhang

    Abstract: Alphas are pivotal in providing signals for quantitative trading. The industry highly values the discovery of formulaic alphas for their interpretability and ease of analysis, compared with the expressive yet overfitting-prone black-box alphas. In this work, we focus on discovering formulaic alphas. Prior studies on automatically generating a collection of formulaic alphas were mostly based on gen…

    Submitted 26 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  11. arXiv:2406.14497  [pdf, other]

    cs.SE cs.CL

    CodeRAG-Bench: Can Retrieval Augment Code Generation?

    Authors: Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried

    Abstract: While language models (LMs) have proven remarkably adept at generating code, many programs are challenging for LMs to generate using their parametric knowledge alone. Providing external contexts such as library documentation can facilitate generating accurate and functional code. Despite the success of retrieval-augmented generation (RAG) in various text-oriented tasks, its potential for improving…

    Submitted 20 June, 2024; originally announced June 2024.

  12. arXiv:2406.11736  [pdf, other]

    cs.CL cs.AI

    Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models

    Authors: Fangzhi Xu, Qiushi Sun, Kanzhi Cheng, Jun Liu, Yu Qiao, Zhiyong Wu

    Abstract: One of the primary driving forces contributing to the superior performance of Large Language Models (LLMs) is the extensive availability of human-annotated natural language data, which is used for alignment fine-tuning. This inspired researchers to investigate self-training methods to mitigate the extensive reliance on human annotations. However, the current success of self-training has been prima…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 18 pages, 6 figures

  13. arXiv:2406.11501  [pdf, other]

    cs.LG cs.AI stat.ME

    Teleporter Theory: A General and Simple Approach for Modeling Cross-World Counterfactual Causality

    Authors: Jiangmeng Li, Bin Qin, Qirui Ji, Yi Li, Wenwen Qiang, Jianwen Cao, Fanjiang Xu

    Abstract: Leveraging the development of structural causal model (SCM), researchers can establish graphical models for exploring the causal mechanisms behind machine learning techniques. As the complexity of machine learning applications rises, single-world interventionism causal analysis encounters theoretical adaptation limitations. Accordingly, cross-world counterfactual approach extends our understanding…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  14. arXiv:2406.11136  [pdf, other]

    cs.RO cs.HC

    Robots in Family Routines: Development of and Initial Insights from the Family-Robot Routines Inventory

    Authors: Michael F. Xu, Bengisu Cagiltay, Joseph Michaelis, Sarah Sebo, Bilge Mutlu

    Abstract: Despite advances in areas such as the personalization of robots, sustaining adoption of robots for long-term use in families remains a challenge. Recent studies have identified integrating robots into families' routines and rituals as a promising approach to support long-term adoption. However, few studies explored the integration of robots into family routines and there is a gap in systematic mea…

    Submitted 16 June, 2024; originally announced June 2024.

  15. arXiv:2406.06565  [pdf, other]

    cs.CL cs.AI cs.LG

    MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures

    Authors: Jinjie Ni, Fuzhao Xue, Xiang Yue, Yuntian Deng, Mahir Shah, Kabir Jain, Graham Neubig, Yang You

    Abstract: Evaluating large language models (LLMs) is challenging. Traditional ground-truth-based benchmarks fail to capture the comprehensiveness and nuance of real-world queries, while LLM-as-judge benchmarks suffer from grading biases and limited query quantity. Both of them may also become contaminated over time. User-facing evaluation, such as Chatbot Arena, provides reliable signals but is costly and s…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2406.02385  [pdf, other]

    cs.CV

    Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images

    Authors: Xinyang Pu, Feng Xu

    Abstract: Deep learning models in satellite onboard enable real-time interpretation of remote sensing images, reducing the need for data transmission to the ground and conserving communication resources. As satellite numbers and observation frequencies increase, the demand for satellite onboard real-time image interpretation grows, highlighting the expanding importance and development of this technology. Ho…

    Submitted 4 June, 2024; originally announced June 2024.

  17. arXiv:2406.00734  [pdf, other]

    cs.LG

    GLADformer: A Mixed Perspective for Graph-level Anomaly Detection

    Authors: Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Dalin Zhang, Siyang Lu, Binyong Li, Wei Gong, Hai Wan, Xibin Zhao

    Abstract: Graph-Level Anomaly Detection (GLAD) aims to distinguish anomalous graphs within a graph dataset. However, current methods are constrained by their receptive fields, struggling to learn global features within the graphs. Moreover, most contemporary methods are based on spatial domain and lack exploration of spectral characteristics. In this paper, we propose a multi-perspective hybrid graph-level…

    Submitted 3 July, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  18. arXiv:2405.19109  [pdf, other]

    cs.CL

    PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering

    Authors: Fangzhi Xu, Qika Lin, Tianzhe Zhao, Jiawei Han, Jun Liu

    Abstract: Logical reasoning task has attracted great interest since it was proposed. Faced with such a task, current competitive models, even large language models (e.g., ChatGPT and PaLM 2), still perform badly. Previous promising LMs struggle in logical consistency modeling and logical structure perception. To this end, we model the logical reasoning task by transforming each logical sample into reasoning…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024

  19. arXiv:2405.12725  [pdf, other]

    cs.CR cs.CV

    Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

    Authors: Boheng Li, Yishuo Cai, Haowei Li, Feng Xue, Zhifeng Li, Yiming Li

    Abstract: Model quantization is widely used to compress and accelerate deep neural networks. However, recent studies have revealed the feasibility of weaponizing model quantization via implanting quantization-conditioned backdoors (QCBs). These special backdoors stay dormant on released full-precision models but will come into effect after standard quantization. Due to the peculiarity of QCBs, existing defe…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024. 19 pages, 9 figures

  20. arXiv:2405.07974  [pdf, other]

    cs.CV

    SignAvatar: Sign Language 3D Motion Reconstruction and Generation

    Authors: Lu Dong, Lipisha Chaudhary, Fei Xu, Xiao Wang, Mason Lary, Ifeoma Nwogu

    Abstract: Achieving expressive 3D motion reconstruction and automatic generation for isolated sign words can be challenging, due to the lack of real-world 3D sign-word data, the complex nuances of signing motions, and the cross-modal understanding of sign language semantics. To address these challenges, we introduce SignAvatar, a framework capable of both word-level sign language reconstruction and generati…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by FG2024

  21. arXiv:2405.05633  [pdf, other]

    cs.DC

    HarmonyBatch: Batching multi-SLO DNN Inference with Heterogeneous Serverless Functions

    Authors: Jiabin Chen, Fei Xu, Yikun Gu, Li Chen, Fangming Liu, Zhi Zhou

    Abstract: Deep Neural Network (DNN) inference on serverless functions is gaining prominence due to its potential for substantial budget savings. Existing works on serverless DNN inference solely optimize batching requests from one application with a single Service Level Objective (SLO) on CPU functions. However, production serverless DNN inference traces indicate that the request arrival rate of application…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 10 pages, 14 figures, accepted by IWQOS24

  22. arXiv:2405.02676  [pdf, other]

    cs.CV cs.GR

    Hand-Object Interaction Controller (HOIC): Deep Reinforcement Learning for Reconstructing Interactions with Physics

    Authors: Haoyu Hu, Xinyu Yi, Zhe Cao, Jun-Hai Yong, Feng Xu

    Abstract: Hand manipulating objects is an important interaction motion in our daily activities. We faithfully reconstruct this motion with a single RGBD camera by a novel deep reinforcement learning method to leverage physics. Firstly, we propose object compensation control which establishes direct object control to make the network training more stable. Meanwhile, by leveraging the compensation force and t…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024 Conference Track

    ACM Class: I.5.4

  23. arXiv:2404.19619  [pdf, other]

    cs.GR

    Physical Non-inertial Poser (PNP): Modeling Non-inertial Effects in Sparse-inertial Human Motion Capture

    Authors: Xinyu Yi, Yuxiao Zhou, Feng Xu

    Abstract: Existing inertial motion capture techniques use the human root coordinate frame to estimate local poses and treat it as an inertial frame by default. We argue that when the root has linear acceleration or rotation, the root frame should be considered non-inertial theoretically. In this paper, we model the fictitious forces that are non-neglectable in a non-inertial frame by an auto-regressive esti…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGGRAPH 2024. Project Page: https://xinyu-yi.github.io/PNP/

  24. arXiv:2404.15284  [pdf, other]

    eess.SP cs.AI

    Global 4D Ionospheric STEC Prediction based on DeepONet for GNSS Rays

    Authors: Dijia Cai, Zenghui Shi, Haiyang Fu, Huan Liu, Hongyi Qian, Yun Sui, Feng Xu, Ya-Qiu Jin

    Abstract: The ionosphere is a vitally dynamic charged particle region in the Earth's upper atmosphere, playing a crucial role in applications such as radio communication and satellite navigation. The Slant Total Electron Contents (STEC) is an important parameter for characterizing wave propagation, representing the integrated electron density along the ray of radio signals passing through the ionosphere. Th…

    Submitted 12 March, 2024; originally announced April 2024.

  25. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  26. arXiv:2404.10337  [pdf, other]

    cs.AI

    Intriguing Properties of Positional Encoding in Time Series Forecasting

    Authors: Jianqi Zhang, Jingyao Wang, Wenwen Qiang, Fanjiang Xu, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: Transformer-based methods have made significant progress in time series forecasting (TSF). They primarily handle two types of tokens, i.e., temporal tokens that contain all variables of the same timestamp, and variable tokens that contain all input time points for a specific variable. Transformer-based methods rely on positional encoding (PE) to mark tokens' positions, facilitating the model to pe…

    Submitted 16 April, 2024; originally announced April 2024.

  27. arXiv:2404.09271  [pdf, other]

    cs.CV cs.RO

    VRS-NeRF: Visual Relocalization with Sparse Neural Radiance Field

    Authors: Fei Xue, Ignas Budvytis, Daniel Olmeda Reino, Roberto Cipolla

    Abstract: Visual relocalization is a key technique to autonomous driving, robotics, and virtual/augmented reality. After decades of explorations, absolute pose regression (APR), scene coordinate regression (SCR), and hierarchical methods (HMs) have become the most popular frameworks. However, in spite of high efficiency, APRs and SCRs have limited accuracy especially in large-scale outdoor scenes; HMs are a…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: source code https://github.com/feixue94/vrs-nerf

  28. arXiv:2404.07785  [pdf, other]

    cs.CV cs.RO

    PRAM: Place Recognition Anywhere Model for Efficient Visual Localization

    Authors: Fei Xue, Ignas Budvytis, Roberto Cipolla

    Abstract: Humans localize themselves efficiently in known environments by first recognizing landmarks defined on certain objects and their spatial relationships, and then verifying the location by aligning detailed structures of recognized objects with those in the memory. Inspired by this, we propose the place recognition anywhere model (PRAM) to perform visual localization as efficiently as humans do. PRA…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: project page: https://feixue94.github.io/pram-project/

  29. arXiv:2404.05960  [pdf, other]

    cs.CV

    EasyTrack: Efficient and Compact One-stream 3D Point Clouds Tracker

    Authors: Baojie Fan, Wuyang Zhou, Kai Wang, Shijun Zhou, Fengyu Xu, Jiandong Tian

    Abstract: Most of 3D single object trackers (SOT) in point clouds follow the two-stream multi-stage 3D Siamese or motion tracking paradigms, which process the template and search area point clouds with two parallel branches, built on supervised point cloud backbones. In this work, beyond typical 3D Siamese or motion tracking, we propose a neat and compact one-stream transformer 3D SOT paradigm from the nove…

    Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  30. arXiv:2404.05656  [pdf]

    cs.CL cs.AI cs.LG

    Causality Extraction from Nuclear Licensee Event Reports Using a Hybrid Framework

    Authors: Shahidur Rahoman Sohag, Sai Zhang, Min Xian, Shoukun Sun, Fei Xu, Zhegang Ma

    Abstract: Industry-wide nuclear power plant operating experience is a critical source of raw data for performing parameter estimations in reliability and risk models. Much operating experience information pertains to failure events and is stored as reports containing unstructured data, such as narratives. Event reports are essential for understanding how failures are initiated and propagated, including the…

    Submitted 22 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  31. arXiv:2404.03219  [pdf, other]

    cs.CV cs.GR

    iSeg: Interactive 3D Segmentation via Interactive Attention

    Authors: Itai Lang, Fei Xu, Dale Decatur, Sudarshan Babu, Rana Hanocka

    Abstract: We present iSeg, a new interactive technique for segmenting 3D shapes. Previous works have focused mainly on leveraging pre-trained 2D foundation models for 3D segmentation based on text. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is challenging since occluded areas of the same se…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Project page: https://threedle.github.io/iSeg/

  32. arXiv:2404.03162  [pdf, other]

    cs.CR

    LTRDetector: Exploring Long-Term Relationship for Advanced Persistent Threats Detection

    Authors: Xiaoxiao Liu, Fan Xu, Nan Wang, Qinxin Zhao, Dalin Zhang, Xibin Zhao, Jiqiang Liu

    Abstract: Advanced Persistent Threat (APT) is challenging to detect due to prolonged duration, infrequent occurrence, and adept concealment techniques. Existing approaches primarily concentrate on the observable traits of attack behaviors, neglecting the intricate relationships formed throughout the persistent attack lifecycle. Thus, we present an innovative APT detection framework named LTRDetector, implem…

    Submitted 3 April, 2024; originally announced April 2024.

  33. arXiv:2403.19936  [pdf, other]

    cs.CL

    SLFNet: Generating Semantic Logic Forms from Natural Language Using Semantic Probability Graphs

    Authors: Hao Wu, Fan Xu

    Abstract: Building natural language interfaces typically uses a semantic parser to parse the user's natural language and convert it into structured Semantic Logic Forms (SLFs). The mainstream approach is to adopt a sequence-to-sequence framework, which requires that natural language commands and SLFs must be represented serially. Since a single natural language may have multiple S…

    Submitted 28 March, 2024; originally announced March 2024.

  34. arXiv:2403.14734  [pdf, other]

    cs.SE cs.AI cs.CL cs.PL

    A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

    Authors: Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, Xiaoli Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

    Abstract: Neural Code Intelligence -- leveraging deep learning to understand, generate, and optimize code -- holds immense potential for transformative impacts on the whole society. Bridging the gap between Natural Language and Programming Language, this domain has drawn significant attention from researchers in both research communities over the past few years. This survey presents a systematic and chronol…

    Submitted 23 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: 64 pages, 6 figures, 10 tables, 692 references

  35. arXiv:2403.13850  [pdf, other]

    cs.LG cs.AI physics.flu-dyn

    Spatio-Temporal Fluid Dynamics Modeling via Physical-Awareness and Parameter Diffusion Guidance

    Authors: Hao Wu, Fan Xu, Yifan Duan, Ziwei Niu, Weiyan Wang, Gaofeng Lu, Kun Wang, Yuxuan Liang, Yang Wang

    Abstract: This paper proposes a two-stage framework named ST-PAD for spatio-temporal fluid dynamics modeling in the field of earth sciences, aiming to achieve high-precision simulation and prediction of fluid dynamics through spatio-temporal physics awareness and parameter diffusion guidance. In the upstream stage, we design a vector quantization reconstruction module with temporal evolution characteristics…

    Submitted 18 March, 2024; originally announced March 2024.

  36. arXiv:2403.13337  [pdf, other]

    cs.CV cs.AI

    Learning Novel View Synthesis from Heterogeneous Low-light Captures

    Authors: Quan Zheng, Hao Sun, Huiyao Xu, Fanjiang Xu

    Abstract: Neural radiance field has achieved fundamental success in novel view synthesis from input views with the same brightness level captured under fixed normal lighting. Unfortunately, synthesizing novel views remains to be a challenge for input views with heterogeneous brightness level captured under low-light condition. The condition is pretty common in the real world. It causes low-contrast images w…

    Submitted 20 March, 2024; originally announced March 2024.

  37. arXiv:2403.11506  [pdf, other]

    cs.CV cs.AI

    End-To-End Underwater Video Enhancement: Dataset and Model

    Authors: Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu

    Abstract: Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos, which has significant implications for marine research and exploration. However, existing methods primarily focus on developing image enhancement algorithms to enhance each frame independently. There is a lack of supervised datasets and models specifically tailored for UVE tasks. To fill this…

    Submitted 18 March, 2024; originally announced March 2024.

  38. arXiv:2403.10079  [pdf, other]

    cs.CV cs.AI

    Learning Physical Dynamics for Object-centric Visual Prediction

    Authors: Huilin Xu, Tao Chen, Feng Xu

    Abstract: The ability to model the underlying dynamics of visual scenes and reason about the future is central to human intelligence. Many attempts have been made to empower intelligent systems with such physical understanding and prediction abilities. However, most existing methods focus on pixel-to-pixel prediction, which suffers from heavy computational costs while lacking a deep understanding of the phy…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 13 pages, 10 figures

  39. arXiv:2403.09721  [pdf, other]

    cs.CL cs.AI

    A Semantic Mention Graph Augmented Model for Document-Level Event Argument Extraction

    Authors: Jian Zhang, Changlin Yang, Haiping Zhu, Qika Lin, Fangzhi Xu, Jun Liu

    Abstract: Document-level Event Argument Extraction (DEAE) aims to identify arguments and their specific roles from an unstructured document. The advanced approaches on DEAE utilize prompt-based methods to guide pre-trained language models (PLMs) in extracting arguments from input documents. They mainly concentrate on establishing relations between triggers and entity mentions within documents, leaving two u…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted by COLING 2024

  40. arXiv:2403.08556  [pdf, other]

    cs.CV cs.AI

    SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model

    Authors: Yihao Liu, Feng Xue, Anlong Ming

    Abstract: The generalization of monocular metric depth estimation (MMDE) has been a longstanding challenge. Recent methods made progress by combining relative and metric depth or aligning input image focal length. However, they are still beset by challenges in camera, scene, and data levels: (1) Sensitivity to different cameras; (2) Inconsistent accuracy across scenes; (3) Reliance on massive training data.…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Project Page: xuefeng-cvr.github.io/SM4Depth

  41. arXiv:2403.07028  [pdf, other]

    cs.LG cs.AI math.OC

    An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem

    Authors: Runze Guo, Feng Xue, Anlong Ming, Nicu Sebe

    Abstract: Recently, neural networks (NN) have made great strides in combinatorial optimization. However, they face challenges when solving the capacitated arc routing problem (CARP) which is to find the minimum-cost tour covering all required edges on a graph, while within capacity constraints. In tackling CARP, NN-based approaches tend to lag behind advanced metaheuristics, since they lack directed arc mod…

    Submitted 10 March, 2024; originally announced March 2024.

  42. arXiv:2403.04144  [pdf, other]

    cs.DC cs.LG

    FedClust: Optimizing Federated Learning on Non-IID Data through Weight-Driven Client Clustering

    Authors: Md Sirajul Islam, Simin Javaherian, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

    Abstract: Federated learning (FL) is an emerging distributed machine learning paradigm enabling collaborative model training on decentralized devices without exposing their local data. A key challenge in FL is the uneven data distribution across client devices, violating the well-known assumption of independent-and-identically-distributed (IID) training samples in conventional machine learning. Clustered fe…

    Submitted 6 March, 2024; originally announced March 2024.

  43. arXiv:2403.03866  [pdf, other]

    cs.CL

    KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions

    Authors: Fangyuan Xu, Kyle Lo, Luca Soldaini, Bailey Kuehl, Eunsol Choi, David Wadden

    Abstract: Large language models (LLMs) adapted to follow user instructions are now widely deployed as conversational agents. In this work, we examine one increasingly common instruction-following task: providing writing assistance to compose a long-form answer. To evaluate the capabilities of current LLMs on this task, we construct KIWI, a dataset of knowledge-intensive writing instructions in the scientifi…

    Submitted 6 March, 2024; originally announced March 2024.

  44. arXiv:2403.02583

    cs.SE

    Generative Software Engineering

    Authors: Yuan Huang, Yinan Chen, Xiangping Chen, Junqi Chen, Rui Peng, Zhicao Tang, Jinbo Huang, Furen Xu, Zibin Zheng

    Abstract: The rapid development of deep learning techniques, improved computational power, and the availability of vast training data have led to significant advancements in pre-trained models and large language models (LLMs). Pre-trained models based on architectures such as BERT and Transformer, as well as LLMs like ChatGPT, have demonstrated remarkable language capabilities and found applications in Soft…

    Submitted 3 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  45. arXiv:2403.01118

    cs.CV cs.AI

    Adversarial Testing for Visual Grounding via Image-Aware Property Reduction

    Authors: Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Boyu Wu, Fanjiang Xu, Qing Wang

    Abstract: Due to the advantages of fusing information from various modalities, multimodal learning is gaining increasing attention. Being a fundamental task of multimodal learning, Visual Grounding (VG), aims to locate objects in images through natural language expressions. Ensuring the quality of VG models presents significant challenges due to the complex nature of the task. In the black box scenario, exi…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 14 pages, 6 figures

  46. arXiv:2402.14899

    cs.CV cs.AI cs.CR cs.LG

    Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images

    Authors: Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip Torr, Jindong Gu

    Abstract: Recently, Multimodal LLMs (MLLMs) have shown a great ability to understand images. However, like traditional vision models, they are still vulnerable to adversarial images. Meanwhile, Chain-of-Thought (CoT) reasoning has been widely explored on MLLMs, which not only improves model's performance, but also enhances model's explainability by giving intermediate reasoning steps. Nevertheless, there is…

    Submitted 18 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  47. Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network

    Authors: Lin Chen, Fengli Xu, Nian Li, Zhenyu Han, Meng Wang, Yong Li, Pan Hui

    Abstract: Heterogeneous information networks (HIN) have gained increasing popularity in recent years for capturing complex relations between diverse types of nodes. Meta-structures are proposed as a useful tool to identify the important patterns in HINs, but hand-crafted meta-structures pose significant challenges for scaling up, drawing wide research attention towards developing automatic search algorithms…

    Submitted 22 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  48. arXiv:2402.10876

    cs.DC

    Accelerating Sparse DNNs Based on Tiled GEMM

    Authors: Cong Guo, Fengchen Xue, Jingwen Leng, Yuxian Qiu, Yue Guan, Weihao Cui, Quan Chen, Minyi Guo

    Abstract: Network pruning can reduce the computation cost of deep neural network (DNN) models. However, sparse models often produce randomly-distributed weights to maintain accuracy, leading to irregular computations. Consequently, unstructured sparse models cannot achieve meaningful speedup on commodity hardware built for dense matrix computations. Accelerators are usually modified or designed with structu…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Computers. arXiv admin note: substantial text overlap with arXiv:2008.13006

  49. arXiv:2402.09836

    cs.AI

    Chain-of-Planned-Behaviour Workflow Elicits Few-Shot Mobility Generation in LLMs

    Authors: Chenyang Shao, Fengli Xu, Bingbing Fan, Jingtao Ding, Yuan Yuan, Meng Wang, Yong Li

    Abstract: The powerful reasoning capabilities of large language models (LLMs) have brought revolutionary changes to many fields, but their performance in human behaviour generation has not yet been extensively explored. This gap likely emerges because the internal processes governing behavioral intentions cannot be solely explained by abstract reasoning. Instead, they are also influenced by a multitude of f…

    Submitted 5 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  50. arXiv:2402.04829

    cs.CV cs.GR

    NeRF as a Non-Distant Environment Emitter in Physics-based Inverse Rendering

    Authors: Jingwang Ling, Ruihan Yu, Feng Xu, Chun Du, Shuang Zhao

    Abstract: Physics-based inverse rendering enables joint optimization of shape, material, and lighting based on captured 2D images. To ensure accurate reconstruction, using a light model that closely resembles the captured environment is essential. Although the widely adopted distant environmental lighting model is adequate in many cases, we demonstrate that its inability to capture spatially varying illumin…

    Submitted 1 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: SIGGRAPH 2024. Project page and video: https://nerfemitterpbir.github.io/