Showing 1–50 of 349 results for author: Gao, F

  1. arXiv:2407.13151  [pdf, other]

    eess.IV cs.CV

    Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection

    Authors: Jiangwei Xie, Feng Gao, Xiaowei Zhou, Junyu Dong

    Abstract: Synthetic aperture radar (SAR) image change detection is critical in remote sensing image analysis. Recently, the attention mechanism has been widely used in change detection tasks. However, existing attention mechanisms often employ down-sampling operations such as average pooling on the Key and Value components to enhance computational efficiency. These irreversible operations result in the loss…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: IEEE GRSL 2024

  2. arXiv:2407.10101  [pdf, other]

    cs.RO

    WING: Wheel-Inertial Neural Odometry with Ground Manifold Constraints

    Authors: Chenxing Jiang, Kunyi Zhang, Sheng Yang, Shaojie Shen, Chao Xu, Fei Gao

    Abstract: In this paper, we propose an interoceptive-only odometry system for ground robots with neural network processing and soft constraints based on the assumption of a globally continuous ground manifold. Exteroceptive sensors such as cameras, GPS and LiDAR may encounter difficulties in scenarios with poor illumination, indoor environments, dusty areas and straight tunnels. Therefore, improving the pos…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Intelligent Vehicles

  3. arXiv:2407.06691  [pdf, other]

    cs.IT eess.SP

    OFDM Achieves the Lowest Ranging Sidelobe Under Random ISAC Signaling

    Authors: Fan Liu, Ying Zhang, Yifeng Xiong, Shuangyang Li, Weijie Yuan, Feifei Gao, Shi Jin, Giuseppe Caire

    Abstract: This paper aims to answer a fundamental question in the area of Integrated Sensing and Communications (ISAC): What is the optimal communication-centric ISAC waveform for ranging? Towards that end, we first established a generic framework to analyze the sensing performance of communication-centric ISAC waveforms built upon orthonormal signaling bases and random data symbols. Then, we evaluated thei…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 14 pages, 12 figures, submitted to IEEE for possible publication

  4. arXiv:2407.03663  [pdf, other]

    cs.CV

    Limited-View Photoacoustic Imaging Reconstruction Via High-quality Self-supervised Neural Representation

    Authors: Youshen Xiao, Yuting Shen, Bowei Yao, Xiran Cai, Yuyao Zhang, Fei Gao

    Abstract: In practical applications within the human body, it is often challenging to fully encompass the target tissue or organ, necessitating the use of limited-view arrays, which can lead to the loss of crucial information. Addressing the reconstruction of photoacoustic sensor signals in limited-view detection spaces has become a focal point of current research. In this study, we introduce a self-supervi…

    Submitted 4 July, 2024; originally announced July 2024.

  5. arXiv:2407.02272  [pdf, other]

    cs.CV cs.GR

    Aligning Human Motion Generation with Human Perceptions

    Authors: Haoru Wang, Wentao Zhu, Luyi Miao, Yishu Xu, Feng Gao, Qi Tian, Yizhou Wang

    Abstract: Human motion generation is a critical task with a wide range of applications. Achieving high realism in generated motions requires naturalness, smoothness, and plausibility. Despite rapid advancements in the field, current generation methods often fall short of these goals. Furthermore, existing evaluation metrics typically rely on ground-truth-based errors, simple heuristics, or distribution dist…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project page: https://motioncritic.github.io/

  6. arXiv:2407.01292  [pdf, other]

    cs.RO

    Preserving Relative Localization of FoV-Limited Drone Swarm via Active Mutual Observation

    Authors: Lianjie Guo, Zaitian Gongye, Ziyi Xu, Yingjian Wang, Xin Zhou, Jinni Zhou, Fei Gao

    Abstract: Relative state estimation is crucial for vision-based swarms to estimate and compensate for the unavoidable drift of visual odometry. For autonomous drones equipped with the most compact sensor setting -- a stereo camera that provides a limited field of view (FoV), the demand for mutual observation for relative state estimation conflicts with the demand for environment observation. To balance the…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by IROS 2024, 8 pages, 10 figures

  7. arXiv:2407.00578  [pdf, other]

    cs.RO

    UniQuad: A Unified and Versatile Quadrotor Platform Series for UAV Research and Application

    Authors: Yichen Zhang, Xinyi Chen, Peize Liu, Junzhe Wang, Hetai Zou, Neng Pan, Fei Gao, Shaojie Shen

    Abstract: As quadrotors take on an increasingly diverse range of roles, researchers often need to develop new hardware platforms tailored for specific tasks, introducing significant engineering overhead. In this article, we introduce the UniQuad series, a unified and versatile quadrotor platform series that offers high flexibility to adapt to a wide range of common tasks, excellent customizability for advan…

    Submitted 4 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

    Comments: Submitted to 40th Anniversary of the IEEE Conference on Robotics and Automation (ICRA-X40)

  8. arXiv:2406.18045  [pdf, other]

    cs.CL cs.AI

    PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

    Authors: Linqing Chen, Weilei Wang, Zilong Bai, Peng Xu, Yan Fang, Jie Fang, Wentao Wu, Lizhi Zhou, Ruiji Zhang, Yubin Xia, Chaobo Xu, Ran Hu, Licong Xu, Qijun Cai, Haoran Hua, Jing Sun, Jin Liu, Tian Qiu, Haowen Liu, Meng Hu, Xiuwen Li, Fei Gao, Yufu Wang, Lin Tie, Chaochao Wang, et al. (11 additional authors not shown)

    Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision areas where general purpo…

    Submitted 9 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  9. arXiv:2406.16422  [pdf, other]

    cs.CV cs.AI

    Exploring Cross-Domain Few-Shot Classification via Frequency-Aware Prompting

    Authors: Tiange Zhang, Qing Cai, Feng Gao, Lin Qi, Junyu Dong

    Abstract: Cross-Domain Few-Shot Learning has witnessed great stride with the development of meta-learning. However, most existing methods pay more attention to learning domain-adaptive inductive bias (meta-knowledge) through feature-wise manipulation or task diversity improvement while neglecting the phenomenon that deep networks tend to rely more on high-frequency cues to make the classification decision,…

    Submitted 24 June, 2024; originally announced June 2024.

  10. arXiv:2406.13954  [pdf]

    cs.AI

    Research on Flight Accidents Prediction based Back Propagation Neural Network

    Authors: Haoxing Liu, Fangzhou Shen, Haoshen Qin, Fanru Gao

    Abstract: With the rapid development of civil aviation and the significant improvement of people's living standards, taking an airplane has become a common and efficient way of travel. However, due to the flight characteristics of the aircraft and the sophistication of the fuselage structure, flight delays and flight accidents occur from time to time. In addition, the life risk factor brought by aircraft…

    Submitted 19 June, 2024; originally announced June 2024.

  11. arXiv:2406.05687  [pdf, other]

    cs.RO

    FlightBench: A Comprehensive Benchmark of Spatial Planning Methods for Quadrotors

    Authors: Shu-Ang Yu, Chao Yu, Feng Gao, Yi Wu, Yu Wang

    Abstract: Spatial planning in cluttered environments is crucial for mobile systems, particularly agile quadrotors. Existing methods, both optimization-based and learning-based, often focus only on success rates in specific environments and lack a unified platform with tasks of varying difficulty. To address this, we introduce FlightBench, the first comprehensive open-source benchmark for 3D spatial planning…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: The first three authors contribute equally

  12. arXiv:2406.01054  [pdf, other]

    cs.LG cs.CV

    Confidence-Based Task Prediction in Continual Disease Classification Using Probability Distribution

    Authors: Tanvi Verma, Lukas Schwemer, Mingrui Tan, Fei Gao, Yong Liu, Huazhu Fu

    Abstract: Deep learning models are widely recognized for their effectiveness in identifying medical image findings in disease classification. However, their limitations become apparent in the dynamic and ever-changing clinical environment, characterized by the continuous influx of newly annotated medical data from diverse sources. In this context, the need for continual learning becomes particularly paramou…

    Submitted 3 June, 2024; originally announced June 2024.

  13. arXiv:2406.00947  [pdf, other]

    cs.CV

    Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation

    Authors: Fei Gao, Siwen Wang, Fandong Zhang, Hong-Yu Zhou, Yizhou Wang, Churan Wang, Gang Yu, Yizhou Yu

    Abstract: Medical image analysis suffers from a shortage of data, whether annotated or not. This becomes even more pronounced when it comes to 3D medical images. Self-Supervised Learning (SSL) can partially ease this situation by using unlabeled data. However, most existing SSL methods can only make use of data in a single dimensionality (e.g. 2D or 3D), and are incapable of enlarging the training dataset b…

    Submitted 4 July, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted at MICCAI 2024

  14. arXiv:2405.20883  [pdf, other]

    cs.RO

    Scalable Distance-based Multi-Agent Relative State Estimation via Block Multiconvex Optimization

    Authors: Tianyue Wu, Gongye Zaitian, Qianhao Wang, Fei Gao

    Abstract: This paper explores the distance-based relative state estimation problem in large-scale systems, which is hard to solve effectively due to its high-dimensionality and non-convexity. In this paper, we alleviate this inherent hardness to simultaneously achieve scalability and robustness of inference on this problem. Our idea is launched from a universal geometric formulation, called generalize…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: To appear in Robotics: Science and Systems 2024

  15. arXiv:2405.18816  [pdf, other]

    cs.CV cs.LG

    Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

    Authors: Yasi Zhang, Peiyu Yu, Yaxuan Zhu, Yingshan Chang, Feng Gao, Ying Nian Wu, Oscar Leong

    Abstract: Generative models based on flow matching have attracted significant attention for their simplicity and superior performance in high-resolution image synthesis. By leveraging the instantaneous change-of-variables formula, one can directly compute image likelihoods from a learned flow, making them enticing candidates as priors for downstream tasks such as inverse problems. In particular, a natural a…

    Submitted 29 May, 2024; originally announced May 2024.

  16. arXiv:2405.18515  [pdf, other]

    cs.LG

    Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

    Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

    Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embod…

    Submitted 28 May, 2024; originally announced May 2024.

  17. arXiv:2405.18224  [pdf, other]

    cs.CV

    SSLChange: A Self-supervised Change Detection Framework Based on Domain Adaptation

    Authors: Yitao Zhao, Turgay Celik, Nanqing Liu, Feng Gao, Heng-Chao Li

    Abstract: In conventional remote sensing change detection (RS CD) procedures, extensive manual labeling for bi-temporal images is first required to maintain the performance of subsequent fully supervised training. However, pixel-level labeling for CD tasks is very complex and time-consuming. In this paper, we explore a novel self-supervised contrastive framework applicable to the RS CD task, which promotes…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: This manuscript has been submitted to IEEE TGRS and is under review

  18. arXiv:2405.17769  [pdf, other]

    cs.RO cs.CV

    Microsaccade-inspired Event Camera for Robotics

    Authors: Botao He, Ze Wang, Yuan Zhou, Jingxi Chen, Chahat Deep Singh, Haojia Li, Yuman Gao, Shaojie Shen, Kaiwei Wang, Yanjun Cao, Chao Xu, Yiannis Aloimonos, Fei Gao, Cornelia Fermuller

    Abstract: Neuromorphic vision sensors or event cameras have made the visual perception of extremely low reaction time possible, opening new avenues for high-dynamic robotics applications. These event cameras' output is dependent on both motion and texture. However, the event camera fails to capture object edges that are parallel to the camera motion. This is a problem intrinsic to the sensor and therefore c…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Published in the June 2024 issue of Science Robotics

  19. arXiv:2405.12420  [pdf, other]

    cs.CV

    GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details

    Authors: Boqian Li, Xuan Li, Ying Jiang, Tianyi Xie, Feng Gao, Huamin Wang, Yin Yang, Chenfanfu Jiang

    Abstract: Traditional 3D garment creation is labor-intensive, involving sketching, modeling, UV mapping, and texturing, which are time-consuming and costly. Recent advances in diffusion-based generative models have enabled new possibilities for 3D garment generation from text prompts, images, and videos. However, existing methods either suffer from inconsistencies among multi-view images or require addition…

    Submitted 20 May, 2024; originally announced May 2024.

  20. arXiv:2405.10142  [pdf, other]

    cs.RO

    GS-Planner: A Gaussian-Splatting-based Planning Framework for Active High-Fidelity Reconstruction

    Authors: Rui Jin, Yuman Gao, Yingjian Wang, Haojian Lu, Fei Gao

    Abstract: Active reconstruction technique enables robots to autonomously collect scene data for full coverage, relieving users from tedious and time-consuming data capturing process. However, designed based on unsuitable scene representations, existing methods show unrealistic reconstruction results or the inability of online quality evaluation. Due to the recent advancements in explicit radiance field tech…

    Submitted 24 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  21. arXiv:2405.07736  [pdf, other]

    cs.RO

    Learning to Plan Maneuverable and Agile Flight Trajectory with Optimization Embedded Networks

    Authors: Zhichao Han, Long Xu, Fei Gao

    Abstract: In recent times, an increasing number of researchers have been devoted to utilizing deep neural networks for end-to-end flight navigation. This approach has gained traction due to its ability to bridge the gap between perception and planning that exists in traditional methods, thereby eliminating delays between modules. However, the practice of replacing original modules with neural networks in a…

    Submitted 7 June, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Some statements in the introduction may be controversial

  22. arXiv:2405.05993  [pdf]

    cs.LG cs.AI

    Precision Rehabilitation for Patients Post-Stroke based on Electronic Health Records and Machine Learning

    Authors: Fengyi Gao, Xingyu Zhang, Sonish Sivarajkumar, Parker Denny, Bayan Aldhahwani, Shyam Visweswaran, Ryan Shi, William Hogan, Allyn Bove, Yanshan Wang

    Abstract: In this study, we utilized statistical analysis and machine learning methods to examine whether rehabilitation exercises can improve patients post-stroke functional abilities, as well as forecast the improvement in functional abilities. Our dataset is patients' rehabilitation exercises and demographic information recorded in the unstructured electronic health records (EHRs) data and free-text reha…

    Submitted 9 May, 2024; originally announced May 2024.

  23. arXiv:2405.00362  [pdf, other]

    cs.RO cs.CG cs.GR

    Implicit Swept Volume SDF: Enabling Continuous Collision-Free Trajectory Generation for Arbitrary Shapes

    Authors: Jingping Wang, Tingrui Zhang, Qixuan Zhang, Chuxiao Zeng, Jingyi Yu, Chao Xu, Lan Xu, Fei Gao

    Abstract: In the field of trajectory generation for objects, ensuring continuous collision-free motion remains a huge challenge, especially for non-convex geometries and complex environments. Previous methods either oversimplify object shapes, which results in a sacrifice of feasible space or rely on discrete sampling, which suffers from the "tunnel effect". To address these limitations, we propose a novel…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGGRAPH 2024 & TOG. Joint first authors: Jingping Wang, Tingrui Zhang. Joint corresponding authors: Fei Gao, Lan Xu

  24. arXiv:2404.02986  [pdf, other]

    cs.LG stat.ML

    Universal Functional Regression with Neural Operator Flows

    Authors: Yaozhong Shi, Angela F. Gao, Zachary E. Ross, Kamyar Azizzadenesheli

    Abstract: Regression on function spaces is typically limited to models with Gaussian process priors. We introduce the notion of universal functional regression, in which we aim to learn a prior distribution over non-Gaussian function spaces that remains mathematically tractable for functional regression. To do this, we develop Neural Operator Flows (OpFlow), an infinite-dimensional extension of normalizing…

    Submitted 3 April, 2024; originally announced April 2024.

  25. arXiv:2404.00885  [pdf, other]

    cs.LG

    Modeling Output-Level Task Relatedness in Multi-Task Learning with Feedback Mechanism

    Authors: Xiangming Xi, Feng Gao, Jun Xu, Fangtai Guo, Tianlei Jin

    Abstract: Multi-task learning (MTL) is a paradigm that simultaneously learns multiple tasks by sharing information at different levels, enhancing the performance of each individual task. While previous research has primarily focused on feature-level or parameter-level task relatedness, and proposed various model architectures and learning algorithms to improve learning performance, we aim to explore output-…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: submitted to CDC2024

  26. arXiv:2404.00589   

    cs.LG cs.CL

    Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing

    Authors: Zhenyu Qian, Yiming Qian, Yuting Song, Fei Gao, Hai Jin, Chen Yu, Xia Xie

    Abstract: Handling graph data is one of the most difficult tasks. Traditional techniques, such as those based on geometry and matrix factorization, rely on assumptions about the data relations that become inadequate when handling large and complex graph data. On the other hand, deep learning approaches demonstrate promising results in handling large graph data, but they often fall short of providing interpr…

    Submitted 12 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: Because my organization does not allow members to privately upload papers to arXiv, I am requesting a withdrawal of my submission

  27. arXiv:2403.17353  [pdf, other]

    cs.RO cs.LG

    Multi-Objective Trajectory Planning with Dual-Encoder

    Authors: Beibei Zhang, Tian Xiang, Chentao Mao, Yuhua Zheng, Shuai Li, Haoyi Niu, Xiangming Xi, Wenyuan Bai, Feng Gao

    Abstract: Time-jerk optimal trajectory planning is crucial in advancing robotic arms' performance in dynamic tasks. Traditional methods rely on solving complex nonlinear programming problems, bringing significant delays in generating optimized trajectories. In this paper, we propose a two-stage approach to accelerate time-jerk optimal trajectory planning. Firstly, we introduce a dual-encoder based transform…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 6 pages, 7 figures, conference

  28. arXiv:2403.17288  [pdf, other]

    cs.RO

    Sparse-Graph-Enabled Formation Planning for Large-Scale Aerial Swarms

    Authors: Yuan Zhou, Lun Quan, Chao Xu, Guangtong Xu, Fei Gao

    Abstract: The formation trajectory planning using complete graphs to model collaborative constraints becomes computationally intractable as the number of drones increases due to the curse of dimensionality. To tackle this issue, this paper presents a sparse graph construction method for formation planning to realize better efficiency-performance trade-off. Firstly, a sparsification mechanism for complete gr…

    Submitted 25 March, 2024; originally announced March 2024.

  29. arXiv:2403.16394  [pdf, other]

    cs.CL cs.AI

    Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

    Authors: Yingshan Chang, Yasi Zhang, Zhiyuan Fang, Yingnian Wu, Yonatan Bisk, Feng Gao

    Abstract: The literature on text-to-image generation is plagued by issues of faithfully composing entities with relations. But there lacks a formal understanding of how entity-relation compositions can be effectively learned. Moreover, the underlying phenomenon space that meaningfully reflects the problem structure is not well-defined, leading to an arms race for larger quantities of data in the hope that g…

    Submitted 24 March, 2024; originally announced March 2024.

  30. arXiv:2403.16080  [pdf, other]

    cs.CV

    PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

    Authors: Xiaoyun Zheng, Liwei Liao, Xufeng Li, Jianbo Jiao, Rongjie Wang, Feng Gao, Shiqi Wang, Ronggang Wang

    Abstract: High-quality human reconstruction and photo-realistic rendering of a dynamic scene is a long-standing problem in computer vision and graphics. Despite considerable efforts invested in developing various capture systems and reconstruction algorithms, recent advancements still struggle with loose or oversized clothing and overly complex poses. In part, this is due to the challenges of acquiring high…

    Submitted 2 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 (accepted). Project page: https://pku-dymvhumans.github.io

  31. arXiv:2403.14350  [pdf, other]

    cs.CV

    Annotation-Efficient Polyp Segmentation via Active Learning

    Authors: Duojun Huang, Xinyu Xiong, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li

    Abstract: Deep learning-based techniques have proven effective in polyp segmentation tasks when provided with sufficient pixel-wise labeled data. However, the high cost of manual annotation has created a bottleneck for model generalization. To minimize annotation costs, we propose a deep active learning framework for annotation-efficient polyp segmentation. In practice, we measure the uncertainty of each sa…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 2024 IEEE 21st International Symposium on Biomedical Imaging (ISBI)

  32. arXiv:2403.13455  [pdf, other]

    cs.RO

    FACT: Fast and Active Coordinate Initialization for Vision-based Drone Swarms

    Authors: Yuan Li, Anke Zhao, Yingjian Wang, Ziyi Xu, Xin Zhou, Jinni Zhou, Chao Xu, Fei Gao

    Abstract: Swarm robots have sparked remarkable developments across a range of fields. While it is necessary for various applications in swarm robots, a fast and robust coordinate initialization in vision-based drone swarms remains elusive. To this end, our paper proposes a complete system to recover a swarm's initial relative pose on platforms with size, weight, and power (SWaP) constraints. To overcome lim…

    Submitted 20 March, 2024; originally announced March 2024.

  33. arXiv:2403.10805  [pdf, other]

    cs.SD cs.AI cs.CV cs.GR cs.HC eess.AS

    Speech-driven Personalized Gesture Synthetics: Harnessing Automatic Fuzzy Feature Inference

    Authors: Fan Zhang, Zhaohan Wang, Xin Lyu, Siyuan Zhao, Mengjian Li, Weidong Geng, Naye Ji, Hui Du, Fuxing Gao, Hao Wu, Shunman Li

    Abstract: Speech-driven gesture generation is an emerging field within virtual human creation. However, a significant challenge lies in accurately determining and processing the multitude of input features (such as acoustic, semantic, emotional, personality, and even subtle unknown features). Traditional approaches, reliant on various explicit feature inputs and complex multimodal processing, constrain the…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 12 pages

  34. arXiv:2403.10067  [pdf, other]

    eess.IV cs.CV

    Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

    Authors: Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

    Abstract: Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data. However, simultaneously modeling global and local features is rarely explored to enhance HSI denoising. In this letter, we propose a hybrid convolution and attention network (HCANet), which leverages both the strengths of convolution neural networks (CNNs) and Transformers. To enhan…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: IEEE GRSL 2024

  35. arXiv:2403.09318  [pdf, other]

    quant-ph cs.CV cs.LG

    A Hierarchical Fused Quantum Fuzzy Neural Network for Image Classification

    Authors: Sheng-Yao Wu, Run-Ze Li, Yan-Qi Song, Su-Juan Qin, Qiao-Yan Wen, Fei Gao

    Abstract: Neural network is a powerful learning paradigm for data feature learning in the era of big data. However, most neural network models are deterministic models that ignore the uncertainty of data. Fuzzy neural networks are proposed to address this problem. FDNN is a hierarchical deep neural network that derives information from both fuzzy and neural representations, the representations are then fuse…

    Submitted 14 March, 2024; originally announced March 2024.

  36. arXiv:2403.08460  [pdf, other]

    cs.CV cs.RO

    Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model

    Authors: Ruibin Zhang, Donglai Xue, Yuhan Wang, Ruixu Geng, Fei Gao

    Abstract: Millimeter wave (mmWave) radars have attracted significant attention from both academia and industry due to their capability to operate in extreme weather conditions. However, they face challenges in terms of sparsity and noise interference, which hinder their application in the field of micro aerial vehicle (MAV) autonomous navigation. To this end, this paper proposes a novel approach to dense an…

    Submitted 19 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 8 pages, 6 figures, submitted to RA-L

  37. arXiv:2403.04586  [pdf, other]

    cs.RO cs.LG

    Learning Speed Adaptation for Flight in Clutter

    Authors: Guangyu Zhao, Tianyue Wu, Yeke Chen, Fei Gao

    Abstract: Animals learn to adapt speed of their movements to their capabilities and the environment they observe. Mobile robots should also demonstrate this ability to trade-off aggressiveness and safety for efficiently accomplishing tasks. The aim of this work is to endow flight vehicles with the ability of speed adaptation in prior unknown and partially observable cluttered environments. We propose a hier…

    Submitted 10 July, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Published in Robotics and Automation Letters (RA-L). 8 pages, 10 figures. The first two authors contribute equally to this work. Project page: https://learning-agility-adaptation.github.io/

  38. arXiv:2403.02977  [pdf, other]

    cs.RO

    Fast Iterative Region Inflation for Computing Large 2-D/3-D Convex Regions of Obstacle-Free Space

    Authors: Qianhao Wang, Zhepei Wang, Mingyang Wang, Jialin Ji, Zhichao Han, Tianyue Wu, Rui Jin, Yuman Gao, Chao Xu, Fei Gao

    Abstract: Convex polytopes have compact representations and exhibit convexity, which makes them suitable for abstracting obstacle-free spaces from various environments. Existing methods for generating convex polytopes always struggle to strike a balance between two requirements, producing high-quality polytope and efficiency. Moreover, another crucial requirement for convex polytopes to accurately contain c…

    Submitted 6 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  39. arXiv:2403.01991  [pdf, other]

    cs.RO

    Skater: A Novel Bi-modal Bi-copter Robot for Adaptive Locomotion in Air and Diverse Terrain

    Authors: Junxiao Lin, Ruibin Zhang, Neng Pan, Chao Xu, Fei Gao

    Abstract: In this letter, we present a novel bi-modal bi-copter robot called Skater, which is adaptable to air and various ground surfaces. Skater consists of a bi-copter moving along its longitudinal direction with two passive wheels on both sides. Using a longitudinally arranged bi-copter as the unified actuation system for both aerial and ground modes, this robot not only keeps a concise and lightweight…

    Submitted 26 May, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE Robotics and Automation Letters (RA-L)

  40. arXiv:2403.00322  [pdf, other]

    cs.RO

    Model-Based Planning and Control for Terrestrial-Aerial Bimodal Vehicles with Passive Wheels

    Authors: Ruibin Zhang, Junxiao Lin, Yuze Wu, Yuman Gao, Chi Wang, Chao Xu, Yanjun Cao, Fei Gao

    Abstract: Terrestrial and aerial bimodal vehicles have gained widespread attention due to their cross-domain maneuverability. Nevertheless, their bimodal dynamics significantly increase the complexity of motion planning and control, thus hindering robust and efficient autonomous navigation in unknown environments. To resolve this issue, we develop a model-based planning and control framework for terrestrial…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted at IROS 2023

  41. arXiv:2402.17098  [pdf, other]

    cs.CV

    In Defense and Revival of Bayesian Filtering for Thermal Infrared Object Tracking

    Authors: Peng Gao, Shi-Min Li, Feng Gao, Fei Wang, Ru-Yue Yuan, Hamido Fujita

    Abstract: Deep learning-based methods monopolize the latest research in the field of thermal infrared (TIR) object tracking. However, relying solely on deep learning models to obtain better tracking results requires carefully selecting feature information that is beneficial to representing the target object and designing a reasonable template update strategy, which undoubtedly increases the difficulty of mo…

    Submitted 26 February, 2024; originally announced February 2024.

  42. Star-Searcher: A Complete and Efficient Aerial System for Autonomous Target Search in Complex Unknown Environments

    Authors: Yiming Luo, Zixuan Zhuang, Neng Pan, Chen Feng, Shaojie Shen, Fei Gao, Hui Cheng, Boyu Zhou

    Abstract: This paper tackles the challenge of autonomous target search using unmanned aerial vehicles (UAVs) in complex unknown environments. To fill the gap in systematic approaches for this task, we introduce Star-Searcher, an aerial system featuring specialized sensor suites, mapping, and planning modules to optimize searching. Path planning challenges due to increased inspection requirements are address…

    Submitted 21 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted to IEEE RA-L. Code: https://github.com/SYSU-STAR/STAR-Searcher. Video: https://www.youtube.com/watch?v=08ll_oo_DtU

  43. arXiv:2402.14304   

    cs.RO cs.AI cs.CV

    Vision-Language Navigation with Embodied Intelligence: A Survey

    Authors: Peng Gao, Peng Wang, Feng Gao, Fei Wang, Ruyue Yuan

    Abstract: As a long-term vision in the field of artificial intelligence, the core goal of embodied intelligence is to improve the perception, understanding, and interaction capabilities of agents and the environment. Vision-language navigation (VLN), as a critical research path to achieve embodied intelligence, focuses on exploring how agents use natural language to communicate effectively with humans, rece…

    Submitted 15 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: The pictures in Figures 2, 4, and 5 are used without authorization, and the literatures in Table 1 have been cited improperly

  44. arXiv:2401.16663  [pdf, other]

    cs.HC cs.CV

    VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality

    Authors: Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, Chenfanfu Jiang

    Abstract: As consumer Virtual Reality (VR) and Mixed Reality (MR) technologies gain momentum, there's a growing focus on the development of engagements with 3D virtual content. Unfortunately, traditional techniques for content creation, editing, and interaction within these virtual spaces are fraught with difficulties. They tend to be not only engineering-intensive but also require extensive expertise, whic…

    Submitted 4 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  45. arXiv:2401.07784  [pdf, other]

    cs.RO

    Certifiable Mutual Localization and Trajectory Planning for Bearing-Based Robot Swarm

    Authors: Yingjian Wang, Xiangyong Wen, Fei Gao

    Abstract: Bearing measurements, as the most common modality in nature, have recently gained traction in multi-robot systems to enhance mutual localization and swarm collaboration. Despite their advantages, challenges such as sensory noise, obstacle occlusion, and uncoordinated swarm motion persist in real-world scenarios, potentially leading to erroneous state estimation and undermining the system's flexibil…

    Submitted 15 January, 2024; originally announced January 2024.

  46. arXiv:2312.15023  [pdf, other]

    cs.LG stat.ML

    Federated Q-Learning: Linear Regret Speedup with Low Communication Cost

    Authors: Zhong Zheng, Fengyu Gao, Lingzhou Xue, Jing Yang

    Abstract: In this paper, we consider federated reinforcement learning for tabular episodic Markov Decision Processes (MDP) where, under the coordination of a central server, multiple agents collaboratively explore the environment and learn an optimal policy without sharing their raw data. While linear speedup in the number of agents has been achieved for some metrics, such as convergence rate and sample com…

    Submitted 7 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 51 pages

  47. Adaptive Tracking and Perching for Quadrotor in Dynamic Scenarios

    Authors: Yuman Gao, Jialin Ji, Qianhao Wang, Rui Jin, Yi Lin, Zhimeng Shang, Yanjun Cao, Shaojie Shen, Chao Xu, Fei Gao

    Abstract: Perching on moving platforms is a promising solution to enhance the endurance and operational range of quadrotors, which could benefit the efficiency of a variety of air-ground cooperative tasks. To ensure robust perching, tracking with a steady relative state and reliable perception is a prerequisite. This paper presents an adaptive dynamic tracking and perching scheme for autonomous quadroto…

    Submitted 17 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  48. arXiv:2312.04062  [pdf, other]

    cs.IT cs.AI eess.SP

    A Low-Overhead Incorporation-Extrapolation based Few-Shot CSI Feedback Framework for Massive MIMO Systems

    Authors: Binggui Zhou, Xi Yang, Jintao Wang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: Accurate channel state information (CSI) is essential for downlink precoding in frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems with orthogonal frequency-division multiplexing (OFDM). However, obtaining CSI through feedback from the user equipment (UE) becomes challenging with the increasing scale of antennas and subcarriers and leads to extremely high CSI…

    Submitted 21 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 16 pages, 12 figures, 5 tables. Accepted by IEEE Transactions on Wireless Communications

  49. arXiv:2312.04025  [pdf, other]

    cs.DC cs.AI

    Moirai: Towards Optimal Placement for Distributed Inference on Heterogeneous Devices

    Authors: Beibei Zhang, Hongwei Zhu, Feng Gao, Zhihui Yang, Sean Xiaoyang Wang

    Abstract: The escalating size of Deep Neural Networks (DNNs) has spurred a growing research interest in hosting and serving DNN models across multiple devices. A number of studies have been reported to partition a DNN model across devices, providing device placement solutions. The methods that have appeared in the literature, however, either suffer from poor placement performance due to the exponential search space o…

    Submitted 26 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

  50. arXiv:2312.01097  [pdf, other]

    cs.CV cs.LG cs.RO

    Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty

    Authors: Cheng-Fu Yang, Haoyang Xu, Te-Lin Wu, Xiaofeng Gao, Kai-Wei Chang, Feng Gao

    Abstract: Task planning for embodied AI has been one of the most challenging problems, one where the community has not reached a consensus on formulation. In this paper, we aim to tackle this problem with a unified framework consisting of an end-to-end trainable method and a planning algorithm. Particularly, we propose a task-agnostic method named 'planning as in-painting'. In this method, we use a Denois…

    Submitted 2 December, 2023; originally announced December 2023.