Showing 1–50 of 83 results for author: Pu, J

  1. arXiv:2407.13254  [pdf, other]

    cs.CV

    Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation

    Authors: Shoumeng Qiu, Jie Chen, Xinrun Li, Ru Wan, Xiangyang Xue, Jian Pu

    Abstract: In this paper, we introduce a novel knowledge distillation approach for the semantic segmentation task. Unlike previous methods that rely on power-trained teachers or other modalities to provide additional knowledge, our approach does not require complex teacher models or information from extra sensors. Specifically, for the teacher model training, we propose to noise the label and then incorporat… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

    Journal ref: ECCV 2024

  2. arXiv:2407.10534  [pdf, other]

    cs.CV

    Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs

    Authors: Rong Ma, Jie Chen, Xiangyang Xue, Jian Pu

    Abstract: Deep supervised models possess significant capability to assimilate extensive training data, thereby presenting an opportunity to enhance model performance through training on multiple datasets. However, conflicts arising from different label spaces among datasets may adversely affect model performance. In this paper, we propose a novel approach to automatically construct a unified label space acr… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  3. arXiv:2407.10347  [pdf]

    cs.CL

    MambaForGCN: Enhancing Long-Range Dependency with State Space Model and Kolmogorov-Arnold Networks for Aspect-Based Sentiment Analysis

    Authors: Adamu Lawan, Juhua Pu, Haruna Yunusa, Aliyu Umar, Muhammad Lawan

    Abstract: Aspect-based sentiment analysis (ABSA) identifies and evaluates sentiments toward specific aspects of entities within text, providing detailed insights beyond overall sentiment. However, attention mechanisms and neural network models struggle with syntactic constraints, and the quadratic complexity of attention mechanisms hinders their adoption for capturing long-range dependencies between aspect… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 25 pages, 3 figures and 3 tables. arXiv admin note: text overlap with arXiv:2405.13013

  4. arXiv:2407.07479  [pdf, other]

    cs.CV

    How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?

    Authors: Yuxin Chen, Zongyang Ma, Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Bing Li, Junfu Pu, Ying Shan, Xiaojuan Qi, Weiming Hu

    Abstract: Dominant dual-encoder models enable efficient image-text retrieval but suffer from limited accuracy while the cross-encoder models offer higher accuracy at the expense of efficiency. Distilling cross-modality matching knowledge from cross-encoder to dual-encoder provides a natural approach to harness their strengths. Thus we investigate the following valuable question: how to make cross-encoder a… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by CVPR 2024

  5. arXiv:2407.05376  [pdf, other]

    cs.RO

    Rethinking Closed-loop Planning Framework for Imitation-based Model Integrating Prediction and Planning

    Authors: Jiayu Guo, Mingyue Feng, Pengfei Zhu, Chengjun Li, Jian Pu

    Abstract: In recent years, the integration of prediction and planning through neural networks has received substantial attention. Despite extensive studies on it, there is a noticeable gap in understanding the operation of such models within a closed-loop planning setting. To bridge this gap, we propose a novel closed-loop planning framework compatible with neural networks engaged in joint prediction and pl… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 7 pages, 5 figures

  6. arXiv:2406.17297  [pdf, other]

    cs.CV cs.AI

    Towards Open-set Camera 3D Object Detection

    Authors: Zhuolin He, Xinrun Li, Heng Gao, Jiachen Tang, Shoumeng Qiu, Wenfu Wang, Lvjian Lu, Xuchong Qiu, Xiangyang Xue, Jian Pu

    Abstract: Traditional camera 3D object detectors are typically trained to recognize a predefined set of known object classes. In real-world scenarios, these detectors may encounter unknown objects outside the training categories and fail to identify them correctly. To address this gap, we present OS-Det3D (Open-set Camera 3D Object Detection), a two-stage training framework enhancing the ability of camera 3… ▽ More

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  7. arXiv:2406.16525  [pdf, other]

    stat.ML cs.LG

    OAML: Outlier Aware Metric Learning for OOD Detection Enhancement

    Authors: Heng Gao, Zhuolin He, Shoumeng Qiu, Jian Pu

    Abstract: Out-of-distribution (OOD) detection methods have been developed to identify objects that a model has not seen during training. The Outlier Exposure (OE) methods use auxiliary datasets to train OOD detectors directly. However, the collection and learning of representative OOD samples may pose challenges. To tackle these issues, we propose the Outlier Aware Metric Learning (OAML) framework. The main… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  8. arXiv:2406.11683  [pdf, other]

    cs.CL

    HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

    Authors: Jing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Rongsheng Zhang, Yujiu Yang, Tian Feng

    Abstract: Generative AI has demonstrated unprecedented creativity in the field of computer vision, yet such phenomena have not been observed in natural language processing. In particular, large language models (LLMs) can hardly produce written works at the level of human experts due to the extremely high complexity of literature writing. In this paper, we present HoLLMwood, an automated framework for unleas… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2405.13013  [pdf]

    cs.CL cs.AI

    Amplifying Aspect-Sentence Awareness: A Novel Approach for Aspect-Based Sentiment Analysis

    Authors: Adamu Lawan, Juhua Pu, Haruna Yunusa, Jawad Muhammad, Aliyu Umar

    Abstract: Aspect-Based Sentiment Analysis (ABSA) is increasingly crucial in Natural Language Processing (NLP) for applications such as customer feedback analysis and product recommendation systems. ABSA goes beyond traditional sentiment analysis by extracting sentiments related to specific aspects mentioned in the text; existing attention-based models often need help to effectively connect aspects with cont… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 24 pages, 4 figures, 4 tables

  10. arXiv:2404.13346  [pdf, other]

    cs.RO

    EC-SLAM: Real-time Dense Neural RGB-D SLAM System with Effectively Constrained Global Bundle Adjustment

    Authors: Guanghao Li, Qi Chen, YuXiang Yan, Jian Pu

    Abstract: We introduce EC-SLAM, a real-time dense RGB-D simultaneous localization and mapping (SLAM) system utilizing Neural Radiance Fields (NeRF). Although recent NeRF-based SLAM systems have demonstrated encouraging outcomes, they have yet to completely leverage NeRF's capability to constrain pose optimization. By employing an effectively constrained global bundle adjustment (BA) strategy, our system mak… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

  11. arXiv:2403.10051  [pdf, other]

    cs.DB

    Accelerating Regular Path Queries over Graph Database with Processing-in-Memory

    Authors: Ruoyan Ma, Shengan Zheng, Guifeng Wang, Jin Pu, Yifan Hua, Wentao Wang, Linpeng Huang

    Abstract: Regular path queries (RPQs) in graph databases are bottlenecked by the memory wall. Emerging processing-in-memory (PIM) technologies offer a promising solution to dispatch and execute path matching tasks in parallel within PIM modules. We present Moctopus, a PIM-based data management system for graph databases that supports efficient batch RPQs and graph updates. Moctopus employs a PIM-friendly dy… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

  12. arXiv:2403.02710  [pdf, other]

    cs.CV cs.RO

    FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View

    Authors: Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, Xiangyang Xue, Jian Pu

    Abstract: In autonomous driving, 3D occupancy prediction outputs voxel-wise status and semantic labels for more comprehensive understandings of 3D scenes compared with traditional perception tasks, such as 3D object detection and bird's-eye view (BEV) semantic segmentation. Recent researchers have extensively explored various aspects of this task, including view transformation techniques, ground-truth label… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA 2024

  13. arXiv:2402.09954  [pdf, other]

    cs.CL cs.LG

    Crafting a Good Prompt or Providing Exemplary Dialogues? A Study of In-Context Learning for Persona-based Dialogue Generation

    Authors: Jiashu Pu, Yajing Wan, Yuru Zhang, Jing Chen, Ling Cheng, Qian Shao, Yongzhu Chang, Tangjie Lv, Rongsheng Zhang

    Abstract: Previous in-context learning (ICL) research has focused on tasks such as classification, machine translation, text2table, etc., while studies on whether ICL can improve human-like dialogue generation are scarce. Our work fills this gap by systematically investigating the ICL capabilities of large language models (LLMs) in persona-based dialogue generation, conducting extensive experiments on high-… ▽ More

    Submitted 17 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  14. arXiv:2402.05940  [pdf]

    cs.LG cs.AI stat.ME

    Causal Relationship Network of Risk Factors Impacting Workday Loss in Underground Coal Mines

    Authors: Shangsi Ren, Cameron A. Beeche, Zhiyi Shi, Maria Acevedo Garcia, Katherine Zychowski, Shuguang Leng, Pedram Roghanchi, Jiantao Pu

    Abstract: This study aims to establish the causal relationship network between various factors leading to workday loss in underground coal mines using a novel causal artificial intelligence (AI) method. The analysis utilizes data obtained from the National Institute for Occupational Safety and Health (NIOSH). A total of 101,010 injury records from 3,982 unique underground coal mines spanning the years from… ▽ More

    Submitted 24 January, 2024; originally announced February 2024.

    Comments: 5 figures 5 tables

  15. arXiv:2401.13854  [pdf, other]

    cs.LG cs.CR

    Embedding Attack Project (Work Report)

    Authors: Jiameng Pu, Zafar Takhirov

    Abstract: This report summarizes all the MIA experiments (Membership Inference Attacks) of the Embedding Attack Project, including threat models, experimental setup, experimental results, findings and discussion. Current results cover the evaluation of two main MIA strategies (loss-based and embedding-based MIAs) on 6 AI models ranging from Computer Vision to Language Modelling. There are two ongoing experi… ▽ More

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures

  16. arXiv:2310.11886  [pdf, other]

    cs.SI cs.DB cs.DS

    Sampling Algorithms for Butterfly Counting on Temporal Bipartite Graphs

    Authors: Jiaxi Pu, Yanhao Wang, Yuchen Li, Xuan Zhou

    Abstract: Temporal bipartite graphs are widely used to denote time-evolving relationships between two disjoint sets of nodes, such as customer-product interactions in E-commerce and user-group memberships in social networks. Temporal butterflies, $(2,2)$-bicliques that occur within a short period and in a prescribed order, are essential in modeling the structural and sequential patterns of such graphs. Coun… ▽ More

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 10 pages, 10 figures; under review

  17. arXiv:2310.11532  [pdf, other]

    cs.CL eess.AS

    Multi-stage Large Language Model Correction for Speech Recognition

    Authors: Jie Pu, Thai-Son Nguyen, Sebastian Stüker

    Abstract: In this paper, we investigate the usage of large language models (LLMs) to improve the performance of competitive speech recognition systems. Different from previous LLM-based ASR error correction methods, we propose a novel multi-stage approach that utilizes uncertainty estimation of ASR outputs and reasoning capability of LLMs. Specifically, the proposed approach has two stages: the first stage… ▽ More

    Submitted 17 June, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

  18. arXiv:2310.08826  [pdf, other]

    cs.CV

    Revisiting Multi-modal 3D Semantic Segmentation in Real-world Autonomous Driving

    Authors: Feng Jiang, Chaoping Tu, Gang Zhang, Jun Li, Hanqing Huang, Junyu Lin, Di Feng, Jian Pu

    Abstract: LiDAR and camera are two critical sensors for multi-modal 3D semantic segmentation and are supposed to be fused efficiently and robustly to promise safety in various real-world scenarios. However, existing multi-modal methods face two key challenges: 1) difficulty with efficient deployment and real-time execution; and 2) drastic performance degradation under weak calibration between LiDAR and came… ▽ More

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 7 pages, 3 figures

  19. arXiv:2309.14744  [pdf, other]

    cs.CV cs.RO

    ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation

    Authors: Zizhang Wu, Zhuozheng Li, Zhi-Gang Fan, Yunzhe Wu, Xiaoquan Wang, Rui Tang, Jian Pu

    Abstract: Monocular depth estimation is challenging due to its inherent ambiguity and ill-posed nature, yet it is quite important to many applications. While recent works achieve limited accuracy by designing increasingly complicated networks to extract features with limited spatial geometric cues from a single RGB image, we intend to introduce spatial cues by training a teacher network that leverages left-… ▽ More

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: accepted by CoRL 2023

  20. arXiv:2309.13989  [pdf, other]

    cs.LG

    A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective

    Authors: Chenhang Cui, Yazhou Ren, Jingyu Pu, Jiawei Li, Xiaorong Pu, Tianyi Wu, Yutao Shi, Lifang He

    Abstract: Multi-view clustering (MVC) is a popular technique for improving clustering performance using various data sources. However, existing methods primarily focus on acquiring consistent information while often neglecting the issue of redundancy across multiple views. This study presents a new approach called Sufficient Multi-View Clustering (SUMVC) that examines the multi-view clustering framework fro… ▽ More

    Submitted 25 September, 2023; originally announced September 2023.

  21. arXiv:2309.12708  [pdf, other]

    cs.CV cs.AI cs.LG

    PointSSC: A Cooperative Vehicle-Infrastructure Point Cloud Benchmark for Semantic Scene Completion

    Authors: Yuxiang Yan, Boda Liu, Jianfei Ai, Qinbu Li, Ru Wan, Jian Pu

    Abstract: Semantic Scene Completion (SSC) aims to jointly generate space occupancies and semantic labels for complex 3D scenes. Most existing SSC models focus on volumetric representations, which are memory-inefficient for large outdoor spaces. Point clouds provide a lightweight alternative but existing benchmarks lack outdoor point cloud scenes with semantic labels. To address this, we introduce PointSSC,… ▽ More

    Submitted 6 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: ICRA2024, oral & poster

  22. arXiv:2309.11002  [pdf, other]

    cs.CV

    PPD: A New Valet Parking Pedestrian Fisheye Dataset for Autonomous Driving

    Authors: Zizhang Wu, Xinyuan Chen, Fan Song, Yuanzhu Gan, Tianhao Xu, Jian Pu, Rui Tang

    Abstract: Pedestrian detection under valet parking scenarios is fundamental for autonomous driving. However, the presence of pedestrians can be manifested in a variety of ways and postures under imperfect ambient conditions, which can adversely affect detection performance. Furthermore, models trained on public datasets that include pedestrians generally provide suboptimal outcomes for these valet parking sc… ▽ More

    Submitted 24 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 9 pages, 6 figures

  23. arXiv:2309.10475  [pdf, other]

    cs.CV

    LineMarkNet: Line Landmark Detection for Valet Parking

    Authors: Zizhang Wu, Yuanzhu Gan, Tianhao Xu, Rui Tang, Jian Pu

    Abstract: We aim for accurate and efficient line landmark detection for valet parking, which is a long-standing yet unsolved problem in autonomous driving. To this end, we present a deep line landmark detection system where we carefully design the modules to be lightweight. Specifically, we first empirically design four general line landmarks including three physical lines and one novel mental line. The fou… ▽ More

    Submitted 24 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 29 pages, 12 figures

  24. arXiv:2309.05256  [pdf, other]

    cs.LG

    Examining the Effect of Pre-training on Time Series Classification

    Authors: Jiashu Pu, Shiwei Zhao, Ling Cheng, Yongzhu Chang, Runze Wu, Tangjie Lv, Rongsheng Zhang

    Abstract: Although the pre-training followed by fine-tuning paradigm is used extensively in many fields, there is still some controversy surrounding the impact of pre-training on the fine-tuning process. Currently, experimental findings based on text and image data lack consensus. To delve deeper into the unsupervised pre-training followed by fine-tuning paradigm, we have extended previous research to a new… ▽ More

    Submitted 11 September, 2023; originally announced September 2023.

  25. Lifted Algorithms for Symmetric Weighted First-Order Model Sampling

    Authors: Yuanhong Wang, Juhua Pu, Yuyi Wang, Ondřej Kuželka

    Abstract: Weighted model counting (WMC) is the task of computing the weighted sum of all satisfying assignments (i.e., models) of a propositional formula. Similarly, weighted model sampling (WMS) aims to randomly generate models with probability proportional to their respective weights. Both WMC and WMS are hard to solve exactly, falling under the $\#\mathsf{P}$-hard complexity class. However, it is known t… ▽ More

    Submitted 14 June, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 47 pages, 6 figures. An expanded version of "On exact sampling in the two-variable fragment of first-order logic" in LICS23. arXiv admin note: substantial text overlap with arXiv:2302.02730

    MSC Class: 68T27 ACM Class: I.2.4

    Journal ref: Artificial Intelligence 331 (2024): 104114

  26. arXiv:2308.04665  [pdf, other]

    cs.CL

    Sudowoodo: a Chinese Lyric Imitation System with Source Lyrics

    Authors: Yongzhu Chang, Rongsheng Zhang, Lin Jiang, Qihang Chen, Le Zhang, Jiashu Pu

    Abstract: Lyrics generation is a well-known application in natural language generation research, with several previous studies focusing on generating accurate lyrics using precise control such as keywords, rhymes, etc. However, lyrics imitation, which involves writing new lyrics by imitating the style and content of the source lyrics, remains a challenging task due to the lack of a parallel corpus. In this… ▽ More

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 7 pages, 3 figures, submitted to EMNLP 2023 demo track

    Journal ref: EMNLP 2023

  27. I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection

    Authors: Yongzhu Chang, Rongsheng Zhang, Jiashu Pu

    Abstract: Simile detection is a valuable task for many natural language processing (NLP)-based applications, particularly in the field of literature. However, existing research on simile detection often relies on corpora that are limited in size and do not adequately represent the full range of simile forms. To address this issue, we propose a simile data augmentation method based on Word replaceme… ▽ More

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 15 pages, 1 figure

    Report number: Lecture Notes in Computer Science, vol 14189. Springer, Cham

    Journal ref: published ICDAR 2023 D-NLP

  28. arXiv:2306.10921  [pdf, other]

    cs.CV

    Understanding Depth Map Progressively: Adaptive Distance Interval Separation for Monocular 3d Object Detection

    Authors: Xianhui Cheng, Shoumeng Qiu, Zhikang Zou, Jian Pu, Xiangyang Xue

    Abstract: Monocular 3D object detection aims to locate objects in different scenes with just a single image. Due to the absence of depth information, several monocular 3D detection techniques have emerged that rely on auxiliary depth maps from the depth estimation task. There are multiple approaches to understanding the representation of depth maps, including treating them as pseudo-LiDAR point clouds, leve… ▽ More

    Submitted 19 June, 2023; originally announced June 2023.

  29. Neighborhood-based Hard Negative Mining for Sequential Recommendation

    Authors: Lu Fan, Jiashu Pu, Rongsheng Zhang, Xiao-Ming Wu

    Abstract: Negative sampling plays a crucial role in training successful sequential recommendation models. Instead of merely employing random negative sample selection, numerous strategies have been proposed to mine informative negative samples to enhance training and performance. However, few of these approaches utilize structural information. In this work, we observe that as training progresses, the distri… ▽ More

    Submitted 12 June, 2023; originally announced June 2023.

    Journal ref: SIGIR 2023

  30. arXiv:2306.09552  [pdf, other]

    cs.AR

    Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network

    Authors: Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally

    Abstract: EIE proposed to accelerate pruned and compressed neural networks, exploiting weight sparsity, activation sparsity, and 4-bit weight-sharing in neural network accelerators. Since its publication at ISCA'16, it has opened a new design space to accelerate pruned and sparse neural networks and spawned many algorithm-hardware co-designs for model compression and acceleration, both in academia and commercial AI c… ▽ More

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Invited retrospective paper at ISCA 2023

  31. arXiv:2305.07397  [pdf, other]

    cs.CV

    Learning Monocular Depth in Dynamic Environment via Context-aware Temporal Attention

    Authors: Zizhang Wu, Zhuozheng Li, Zhi-Gang Fan, Yunzhe Wu, Yuanzhu Gan, Jian Pu, Xianzhi Li

    Abstract: The monocular depth estimation task has recently revealed encouraging prospects, especially for the autonomous driving task. To tackle the ill-posed problem of 3D geometric reasoning from 2D monocular images, multi-frame monocular methods are developed to leverage the perspective correlation information from sequential temporal frames. However, moving objects such as cars and trains usually violat… ▽ More

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: accepted by IJCAI 2023; 9 pages, 5 figures

  32. arXiv:2305.06939  [pdf, other]

    cs.LG

    Deep Multi-View Subspace Clustering with Anchor Graph

    Authors: Chenhang Cui, Yazhou Ren, Jingyu Pu, Xiaorong Pu, Lifang He

    Abstract: Deep multi-view subspace clustering (DMVSC) has recently attracted increasing attention due to its promising performance. However, existing DMVSC methods still have two issues: (1) they mainly focus on using autoencoders to nonlinearly embed the data, while the embedding may be suboptimal for clustering because the clustering objective is rarely considered in autoencoders, and (2) existing methods… ▽ More

    Submitted 11 May, 2023; originally announced May 2023.

  33. arXiv:2305.06278  [pdf, other]

    cs.CV

    A Multi-modal Garden Dataset and Hybrid 3D Dense Reconstruction Framework Based on Panoramic Stereo Images for a Trimming Robot

    Authors: Can Pu, Chuanyu Yang, Jinnian Pu, Radim Tylecek, Robert B. Fisher

    Abstract: Recovering an outdoor environment's surface mesh is vital for an agricultural robot during task planning and remote visualization. Our proposed solution is based on a newly-designed panoramic stereo camera along with a hybrid novel software framework that consists of three fusion modules. The panoramic stereo camera with a pentagon shape consists of 5 stereo vision camera pairs to stream synchroni… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 32 pages

  34. arXiv:2304.14800  [pdf, other]

    cs.CV cs.AI

    Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation

    Authors: Shoumeng Qiu, Feng Jiang, Haiqiang Zhang, Xiangyang Xue, Jian Pu

    Abstract: 3D point cloud semantic segmentation is one of the fundamental tasks for environmental understanding. Although significant progress has been made in recent years, the performance of classes with few examples or few points is still far from satisfactory. In this paper, we propose a novel multi-to-single knowledge distillation framework for the 3D point cloud semantic segmentation task to boost the… ▽ More

    Submitted 28 April, 2023; originally announced April 2023.

    Journal ref: ICRA 2023

  35. arXiv:2304.11393  [pdf, other]

    cs.CV cs.AI

    Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation

    Authors: Feng Jiang, Heng Gao, Shoumeng Qiu, Haiqiang Zhang, Ru Wan, Jian Pu

    Abstract: LiDAR point cloud segmentation is one of the most fundamental tasks for autonomous driving scene understanding. However, it is difficult for existing models to achieve both high inference speed and accuracy simultaneously. For example, voxel-based methods perform well in accuracy, while Bird's-Eye-View (BEV)-based methods can achieve real-time inference. To overcome this issue, we develop an effec… ▽ More

    Submitted 22 April, 2023; originally announced April 2023.

    Comments: ICME 2023 Accepted

  36. arXiv:2304.09453  [pdf, other]

    cs.CV

    Network Pruning Spaces

    Authors: Xuanyu He, Yu-I Yang, Ran Song, Jiachen Pu, Conggang Hu, Feijun Jiang, Wei Zhang, Huanghao Ding

    Abstract: Network pruning techniques, including weight pruning and filter pruning, reveal that most state-of-the-art neural networks can be accelerated without a significant performance drop. This work focuses on filter pruning, which enables accelerated inference with any off-the-shelf deep learning library and hardware. We propose the concept of network pruning spaces that parametrize populations of… ▽ More

    Submitted 19 April, 2023; originally announced April 2023.

  37. arXiv:2302.12966  [pdf, other]

    cs.CV cs.RO

    SUPS: A Simulated Underground Parking Scenario Dataset for Autonomous Driving

    Authors: Jiawei Hou, Qi Chen, Yurong Cheng, Guang Chen, Xiangyang Xue, Taiping Zeng, Jian Pu

    Abstract: Automatic underground parking has attracted considerable attention as the scope of autonomous driving expands. The auto-vehicle is supposed to obtain the environmental information, track its location, and build a reliable map of the scenario. Mainstream solutions consist of well-trained neural networks and simultaneous localization and mapping (SLAM) methods, which need numerous carefully labeled… ▽ More

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: Accepted for publication at the 25th IEEE Intelligent Transportation Systems Conference (ITSC 2022)

  38. arXiv:2302.10549  [pdf, other]

    cs.CV

    MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts

    Authors: Zizhang Wu, Yuanzhu Gan, Lei Wang, Guilian Chen, Jian Pu

    Abstract: Monocular 3D object detection reveals an economical but challenging task in autonomous driving. Recently, center-based monocular methods have developed rapidly with a great trade-off between speed and accuracy, where they usually depend on the object center's depth estimation via 2D features. However, the visual semantic features without sufficient pixel geometry information may affect the perform… ▽ More

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by ICRA 2023

  39. arXiv:2302.10511  [pdf, other]

    cs.CV

    MVFusion: Multi-View 3D Object Detection with Semantic-aligned Radar and Camera Fusion

    Authors: Zizhang Wu, Guilian Chen, Yuanzhu Gan, Lei Wang, Jian Pu

    Abstract: Multi-view radar-camera fused 3D object detection provides a farther detection range and more helpful features for autonomous driving, especially under adverse weather. The current radar-camera fusion methods deliver kinds of designs to fuse radar information with camera data. However, these fusion approaches usually adopt the straightforward concatenation operation between multi-modal features, w… ▽ More

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by ICRA 2023

  40. arXiv:2302.04486  [pdf, other]

    cs.RO cs.AI cs.CV

    A General Mobile Manipulator Automation Framework for Flexible Manufacturing in Hostile Industrial Environments

    Authors: Can Pu, Chuanyu Yang, Jinnian Pu, Robert B. Fisher

    Abstract: To enable a mobile manipulator to perform human tasks from a single teaching demonstration is vital to flexible manufacturing. We call our proposed method MMPA (Mobile Manipulator Process Automation with One-shot Teaching). Currently, there is no effective and robust MMPA framework which is not influenced by harsh industrial environments and the mobile base's parking precision. The proposed MMPA f… ▽ More

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 25 pages

  41. arXiv:2302.02730  [pdf, other]

    cs.AI cs.LO

    On Exact Sampling in the Two-Variable Fragment of First-Order Logic

    Authors: Yuanhong Wang, Juhua Pu, Yuyi Wang, Ondřej Kuželka

    Abstract: In this paper, we study the sampling problem for first-order logic proposed recently by Wang et al. -- how to efficiently sample a model of a given first-order sentence on a finite domain? We extend their result for the universally-quantified subfragment of two-variable logic $\mathbf{FO}^2$ ($\mathbf{UFO}^2$) to the entire fragment of $\mathbf{FO}^2$. Specifically, we prove the domain-liftability… ▽ More

    Submitted 6 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: 37 pages, 4 figures, LICS 2023

    MSC Class: 68T27 ACM Class: I.2.4

  42. Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection

    Authors: Zizhang Wu, Yunzhe Wu, Jian Pu, Xianzhi Li, Xiaoquan Wang

    Abstract: Monocular 3D object detection is a low-cost but challenging task, as it requires generating accurate 3D localization solely from a single image input. Recently developed depth-assisted methods show promising results by using explicit depth maps as intermediate features, which are either precomputed by monocular depth estimation networks or jointly evaluated with 3D object detection. However, inevita… ▽ More

    Submitted 3 July, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted by AAAI2023

  43. arXiv:2211.11761  [pdf, other]

    cs.LG

    From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm

    Authors: Jie Chen, Zilong Li, Yin Zhu, Junping Zhang, Jian Pu

    Abstract: Existing Graph Neural Networks (GNNs) follow the message-passing mechanism that conducts information interaction among nodes iteratively. While considerable progress has been made, such node interaction paradigms still have the following limitations. First, the scalability limitation precludes the broad application of GNNs in large-scale industrial settings since the node interaction among rapidly…

    Submitted 13 April, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: accepted by CVPR 2023

  44. arXiv:2210.09609  [pdf, other]

    cs.LG cs.AI

    SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP

    Authors: Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu

    Abstract: The message-passing mechanism helps Graph Neural Networks (GNNs) achieve remarkable results on various node classification tasks. Nevertheless, the recursive node fetching and aggregation in message passing cause inference latency when deploying GNNs to large-scale graphs. One promising inference acceleration direction is to distill the GNNs into message-passing-free student multi-layer perceptro…

    Submitted 18 October, 2022; originally announced October 2022.

  45. arXiv:2210.09421  [pdf, other]

    cs.CR cs.CL cs.LG

    Deepfake Text Detection: Limitations and Opportunities

    Authors: Jiameng Pu, Zain Sarwar, Sifat Muhammad Abdullah, Abdullah Rehman, Yoonjin Kim, Parantapa Bhattacharya, Mobin Javed, Bimal Viswanath

    Abstract: Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed f…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE S&P 2023; First two authors contributed equally to this work; 18 pages, 7 figures

  46. arXiv:2210.04142  [pdf, other]

    cs.LG

    Deep Clustering: A Comprehensive Survey

    Authors: Yazhou Ren, Jingyu Pu, Zhimeng Yang, Jie Xu, Guofeng Li, Xiaorong Pu, Philip S. Yu, Lifang He

    Abstract: Cluster analysis plays an indispensable role in machine learning and data mining. Learning a good data representation is crucial for clustering algorithms. Recently, deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks. Existing surveys for deep clustering mainly focus on the single-view fields…

    Submitted 8 October, 2022; originally announced October 2022.

  47. arXiv:2207.03682  [pdf, other]

    cs.CV cs.MM

    Music-driven Dance Regeneration with Controllable Key Pose Constraints

    Authors: Junfu Pu, Ying Shan

    Abstract: In this paper, we propose a novel framework for music-driven dance motion synthesis with controllable key pose constraints. In contrast to methods that generate dance motion sequences based only on music without any other controllable conditions, this work targets synthesizing high-quality dance motion driven by music as well as customized poses performed by users. Our model involves two single-…

    Submitted 8 July, 2022; originally announced July 2022.

  48. arXiv:2207.03190  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Learning Music-Dance Representations through Explicit-Implicit Rhythm Synchronization

    Authors: Jiashuo Yu, Junfu Pu, Ying Cheng, Rui Feng, Ying Shan

    Abstract: Although audio-visual representation has been proven applicable to many downstream tasks, the representation of dancing videos, which is more specific and always accompanied by music with complex auditory content, remains challenging and uninvestigated. Considering the intrinsic alignment between the cadent movement of the dancer and the music rhythm, we introduce MuDaR, a novel Music-Dance Represe…

    Submitted 10 August, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

    Comments: Accepted for publication in IEEE Transactions on Multimedia

  49. arXiv:2205.15127  [pdf, other]

    cs.LG cs.AI

    Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path Decomposition Perspective for Preventing the Over-smoothing

    Authors: Jie Chen, Weiqi Liu, Zhizhong Huang, Junbin Gao, Junping Zhang, Jian Pu

    Abstract: The performance of GNNs degrades as they become deeper due to over-smoothing. Among all the attempts to prevent over-smoothing, the residual connection is one of the most promising methods due to its simplicity. However, recent studies have shown that GNNs with residual connections only slightly slow down the degeneration. The reason why residual connections fail in GNNs is still unknown. In this paper…

    Submitted 30 May, 2022; originally announced May 2022.

  50. arXiv:2204.12807  [pdf, other]

    cs.CL cs.AI

    Probing Simile Knowledge from Pre-trained Language Models

    Authors: Weijie Chen, Yongzhu Chang, Rongsheng Zhang, Jiashu Pu, Guandan Chen, Le Zhang, Yadong Xi, Yijiang Chen, Chang Su

    Abstract: Simile interpretation (SI) and simile generation (SG) are challenging tasks for NLP because models require adequate world knowledge to produce predictions. Previous works have employed many hand-crafted resources to bring related knowledge into models, which is time-consuming and labor-intensive. In recent years, approaches based on pre-trained language models (PLMs) have become the de facto standard…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Long paper accepted at ACL 2022