
Showing 1–50 of 173 results for author: Lu, F

Searching in archive cs.
  1. arXiv:2408.10000  [pdf, other]

    cs.HC

    Working in Extended Reality in the Wild: Worker and Bystander Experiences of XR Virtual Displays in Real-World Settings

    Authors: Leonardo Pavanatto, Verena Biener, Jennifer Chandran, Snehanjali Kalamkar, Feiyu Lu, John J. Dudley, Jinghui Hu, G. Nikki Ramirez-Saffy, Per Ola Kristensson, Alexander Giovannelli, Luke Schlueter, Jörg Müller, Jens Grubert, Doug A. Bowman

    Abstract: Although access to sufficient screen space is crucial to knowledge work, workers often find themselves with limited access to display infrastructure in remote or public settings. While virtual displays can be used to extend the available screen space through extended reality (XR) head-worn displays (HWD), we must better understand the implications of working with them in public settings from both…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.09786

  2. arXiv:2408.07307  [pdf, other]

    cs.LG math.AP

    Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery

    Authors: Yue Yu, Ning Liu, Fei Lu, Tian Gao, Siavash Jafarzadeh, Stewart Silling

    Abstract: Despite the recent popularity of attention-based neural architectures in core AI fields like natural language processing (NLP) and computer vision (CV), their potential in modeling complex physical systems remains under-explored. Learning problems in physical systems are often characterized as discovering operators that map between function spaces based on a few instances of function pairs. This t…

    Submitted 14 August, 2024; originally announced August 2024.

  3. arXiv:2408.02110  [pdf, other]

    cs.CV

    AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos

    Authors: Feichi Lu, Zijian Dong, Jie Song, Otmar Hilliges

    Abstract: Despite progress in human motion capture, existing multi-view methods often face challenges in estimating the 3D pose and shape of multiple closely interacting people. This difficulty arises from reliance on accurate 2D joint estimations, which are hard to obtain due to occlusions and body contact when people are in close interaction. To address this, we propose a novel method leveraging the perso…

    Submitted 20 August, 2024; v1 submitted 4 August, 2024; originally announced August 2024.

    Comments: Project Page: https://eth-ait.github.io/AvatarPose/

  4. arXiv:2407.06346  [pdf, other]

    cs.LG cs.DC stat.ML

    High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates

    Authors: Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

    Abstract: As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer increasingly from communication costs as the data size or the number of iterations grows. Recent work on linear models has shown that a surrogate likelihood can be…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: KDD 2024, Research Track

  5. arXiv:2407.05597  [pdf, other]

    cs.CV cs.GR

    GeoNLF: Geometry guided Pose-Free Neural LiDAR Fields

    Authors: Weiyi Xue, Zehan Zheng, Fan Lu, Haiyun Wei, Guang Chen, Changjun Jiang

    Abstract: Although recent efforts have extended Neural Radiance Fields (NeRF) into LiDAR point cloud synthesis, the majority of existing works exhibit a strong dependence on precomputed poses. However, point cloud registration methods struggle to achieve precise global pose estimation, whereas previous pose-free NeRFs overlook geometric consistency in global reconstruction. In light of this, we explore the…

    Submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:2406.18321  [pdf, other]

    cs.CL cs.AI

    MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data

    Authors: Meng Fang, Xiangpeng Wan, Fei Lu, Fei Xing, Kai Zou

    Abstract: Large language models (LLMs) have significantly advanced natural language understanding and demonstrated strong problem-solving abilities. Despite these successes, most LLMs still struggle with solving mathematical problems due to the intricate reasoning required. This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset. The data…

    Submitted 26 June, 2024; originally announced June 2024.

  7. arXiv:2406.07895  [pdf, other]

    cs.CV

    Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation

    Authors: Jiadong Liang, Feng Lu

    Abstract: Vivid talking face generation holds immense potential applications across diverse multimedia domains, such as film and game production. While existing methods accurately synchronize lip movements with input audio, they typically ignore crucial alignments between emotion and facial cues, which include expression, gaze, and head pose. These alignments are indispensable for synthesizing realistic vid…

    Submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2406.01753  [pdf, other]

    cs.LG cs.DC stat.ML

    Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification

    Authors: Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

    Abstract: While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as data dimensionality increases. Recent work on non-interactive algorithms shows that approximate solutions for linear models can be obtained efficiently with only a single round of communication among…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Under review

  9. arXiv:2405.19730  [pdf]

    cs.AI cs.CV cs.LG

    Research on the Spatial Data Intelligent Foundation Model

    Authors: Shaohua Wang, Xing Xie, Yong Li, Danhuai Guo, Zhi Cai, Yu Liu, Yang Yue, Xiao Pan, Feng Lu, Huayi Wu, Zhipeng Gui, Zhiming Ding, Bolong Zheng, Fuzheng Zhang, Jingyuan Wang, Zhengchao Chen, Hao Lu, Jiayi Li, Peng Yue, Wenhao Yu, Yao Yao, Leilei Sun, Yong Zhang, Longbiao Chen, Xiaoping Du, et al. (6 additional authors not shown)

    Abstract: This report focuses on spatial data intelligent large models, delving into the principles, methods, and cutting-edge applications of these models. It provides an in-depth discussion on the definition, development history, current status, and trends of spatial data intelligent large models, as well as the challenges they face. The report systematically elucidates the key technologies of spatial dat…

    Submitted 28 August, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: V1 and V2 are in Chinese language, other versions are in English

  10. arXiv:2405.16868  [pdf, other]

    cs.CV

    RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling

    Authors: Tianhang Wang, Fan Lu, Zehan Zheng, Guang Chen, Changjun Jiang

    Abstract: Collaborative perception is dedicated to tackling the constraints of single-agent perception, such as occlusions, based on the multiple agents' multi-view sensor inputs. However, most existing works assume an ideal condition that all agents' multi-view cameras are continuously available. In reality, cameras may be highly noisy, obscured or even failed during the collaboration. In this work, we int…

    Submitted 27 May, 2024; originally announced May 2024.

  11. arXiv:2405.12784  [pdf, other]

    cs.CV

    Generalize Polyp Segmentation via Inpainting across Diverse Backgrounds and Pseudo-Mask Refinement

    Authors: Jiajian Ma, Fangqi Lu, Silin Huang, Song Wu, Zhen Li

    Abstract: Inpainting lesions within different normal backgrounds is a potential method of addressing the generalization problem, which is crucial for polyp segmentation models. However, seamlessly introducing polyps into complex endoscopic environments while simultaneously generating accurate pseudo-masks remains a challenge for current inpainting methods. To address these issues, we first leverage the pre-…

    Submitted 21 May, 2024; originally announced May 2024.

  12. arXiv:2404.12721  [pdf, other]

    cs.CV cs.AI cs.LG

    Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework

    Authors: Zhuohong Li, Fangxiao Lu, Jiaqi Zou, Lei Hu, Hongyan Zhang

    Abstract: Land-cover mapping is one of the vital applications in Earth observation, aiming at classifying each pixel's land-cover type of remote-sensing images. As natural and human activities change the landscape, the land-cover map needs to be rapidly updated. However, discovering newly appeared land-cover types in existing classification systems is still a non-trivial task hindered by various scales of c…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 11 pages, 11 figures, accepted by CVPR 2024 L3D-IVU Workshop

  13. arXiv:2404.06780  [pdf, other]

    cs.CV

    Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior

    Authors: Fan Lu, Kwan-Yee Lin, Yan Xu, Hongsheng Li, Guang Chen, Changjun Jiang

    Abstract: Text-to-3D generation has achieved remarkable success via large-scale text-to-image diffusion models. Nevertheless, there is no paradigm for scaling up the methodology to urban scale. Urban scenes, characterized by numerous elements, intricate arrangement relationships, and vast scale, present a formidable barrier to the interpretability of ambiguous textual descriptions for effective model optimi…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Project page: https://urbanarchitect.github.io/

  14. arXiv:2404.02742  [pdf, other]

    cs.CV

    LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis

    Authors: Zehan Zheng, Fan Lu, Weiyi Xue, Guang Chen, Changjun Jiang

    Abstract: Although neural radiance fields (NeRFs) have achieved triumphs in image novel view synthesis (NVS), LiDAR NVS remains largely unexplored. Previous LiDAR NVS methods employ a simple shift from image NVS methods while ignoring the dynamic nature and the large-scale reconstruction problem of LiDAR point clouds. In light of this, we propose LiDAR4D, a differentiable LiDAR-only framework for novel spac…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project Page: https://dyfcalid.github.io/LiDAR4D

  15. arXiv:2403.17007  [pdf, other]

    cs.CV

    DreamLIP: Language-Image Pre-training with Long Captions

    Authors: Kecheng Zheng, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, Xin Jin, Wei Chen, Yujun Shen

    Abstract: Language-image pre-training largely relies on how precisely and thoroughly a text describes its paired image. In practice, however, the contents of an image can be so rich that well describing them requires lengthy captions (e.g., with 10 sentences), which are usually missing in existing datasets. Consequently, there is currently no clear evidence on whether and how language-image pre-training c…

    Submitted 25 March, 2024; originally announced March 2024.

  16. arXiv:2403.16792  [pdf, other]

    cs.CL cs.SE

    Iterative Refinement of Project-Level Code Context for Precise Code Generation with Compiler Feedback

    Authors: Zhangqian Bi, Yao Wan, Zheng Wang, Hongyu Zhang, Batu Guan, Fangxin Lu, Zili Zhang, Yulei Sui, Hai Jin, Xuanhua Shi

    Abstract: Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As much of this project-specific context cannot fit into the prompts of LLMs, we must find ways to allow the model to explore the project-level code context. We present CoCoGen, a new code…

    Submitted 10 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  17. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Zheng Liu, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the h…

    Submitted 5 August, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted to ECCV 2024

  18. arXiv:2403.12366  [pdf]

    cs.LG physics.ao-ph

    U-Net Kalman Filter (UNetKF): An Example of Machine Learning-assisted Ensemble Data Assimilation

    Authors: Feiyu Lu

    Abstract: Machine learning techniques have seen a tremendous rise in popularity in weather and climate sciences. Data assimilation (DA), which combines observations and numerical models, has great potential to incorporate machine learning and artificial intelligence (ML/AI) techniques. In this paper, we use U-Net, a type of convolutional neural network (CNN), to predict the localized ensemble covariances f…

    Submitted 18 March, 2024; originally announced March 2024.

  19. arXiv:2403.02746  [pdf, other]

    cs.CV cs.LG

    Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels

    Authors: Zhuohong Li, Wei He, Jiepan Li, Fangxiao Lu, Hongyan Zhang

    Abstract: Large-scale high-resolution (HR) land-cover mapping is a vital task to survey the Earth's surface and resolve many challenges facing humanity. However, it is still a non-trivial task hindered by complex ground details, various landforms, and the scarcity of accurate training labels over a wide-span geographic area. In this paper, we propose an efficient, weakly supervised framework (Paraformer) to…

    Submitted 23 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages, 9 figures, accepted by CVPR 2024

  20. arXiv:2402.19231  [pdf, other]

    cs.CV cs.RO

    CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

    Authors: Feng Lu, Xiangyuan Lan, Lijun Zhang, Dongmei Jiang, Yaowei Wang, Chun Yuan

    Abstract: Over the past decade, most methods in visual place recognition (VPR) have used neural networks to produce feature representations. These networks typically produce a global representation of a place image using only this image itself and neglect the cross-image variations (e.g. viewpoint and illumination), which limits their robustness in challenging scenes. In this paper, we propose a robust glob…

    Submitted 1 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR2024

  21. arXiv:2402.18925  [pdf, other]

    cs.CV

    PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds

    Authors: Haotian Liu, Sanqing Qu, Fan Lu, Zongtao Bu, Florian Roehrbein, Alois Knoll, Guang Chen

    Abstract: Event cameras can record scene dynamics with high temporal resolution, providing rich scene details for monocular depth estimation (MDE) even at low-level illumination. Therefore, existing complementary learning approaches for MDE fuse intensity information from images and scene details from event data for better scene understanding. However, most methods directly fuse two modalities at pixel leve…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Under Review

  22. arXiv:2402.18169  [pdf, ps, other]

    cs.CL

    MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery

    Authors: Feihong Lu, Weiqi Wang, Yangyifei Luo, Ziqin Zhu, Qingyun Sun, Baixuan Xu, Haochen Shi, Shiqi Gao, Qian Li, Yangqiu Song, Jianxin Li

    Abstract: Social media has become a ubiquitous tool for connecting with others, staying updated with news, expressing opinions, and finding entertainment. However, understanding the intention behind social media posts remains challenging due to the implicitness of intentions in social media posts, the need for cross-modality understanding of both text and images, and the presence of noisy information such a…

    Submitted 29 February, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 11 pages, 5 figures

  23. Deep Homography Estimation for Visual Place Recognition

    Authors: Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan

    Abstract: Visual place recognition (VPR) is a fundamental task for many applications such as robot localization and augmented reality. Recently, the hierarchical VPR methods have received considerable attention due to the trade-off between accuracy and efficiency. They usually first use global features to retrieve the candidate images, then verify the spatial consistency of matched local features for re-ran…

    Submitted 18 March, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

    Journal ref: AAAI 2024

  24. arXiv:2402.14505  [pdf, other]

    cs.CV cs.AI

    Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

    Authors: Feng Lu, Lijun Zhang, Xiangyuan Lan, Shuting Dong, Yaowei Wang, Chun Yuan

    Abstract: Recent studies show that vision models pre-trained in generic visual learning tasks with large-scale data can provide useful feature representations for a wide range of visual perception problems. However, few attempts have been made to exploit pre-trained foundation models in visual place recognition (VPR). Due to the inherent difference in training objectives and data between the tasks of model…

    Submitted 3 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: ICLR2024

  25. arXiv:2402.08412  [pdf, other]

    stat.ML cs.LG math.DS math.ST

    Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel

    Authors: Quanjun Lang, Xiong Wang, Fei Lu, Mauro Maggioni

    Abstract: Modeling multi-agent systems on networks is a fundamental challenge in a wide variety of disciplines. We jointly infer the weight matrix of the network and the interaction kernel, which determine respectively which agents interact with which others and the rules of such interactions from data consisting of multiple trajectories. The estimator we propose leads naturally to a non-convex optimization…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 53 pages, 17 figures

    MSC Class: 62F12; 82C22

  26. arXiv:2401.14978  [pdf, other]

    cs.HC cs.MM cs.SD eess.AS

    Robust Dual-Modal Speech Keyword Spotting for XR Headsets

    Authors: Zhuojiang Cai, Yuhan Ma, Feng Lu

    Abstract: While speech interaction finds widespread utility within the Extended Reality (XR) domain, conventional vocal speech keyword spotting systems continue to grapple with formidable challenges, including suboptimal performance in noisy environments, impracticality in situations requiring silence, and susceptibility to inadvertent activations when others speak nearby. These challenges, however, can pot…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Accepted to IEEE VR 2024

  27. arXiv:2401.13877  [pdf]

    cs.CV cs.RO

    AscDAMs: Advanced SLAM-based channel detection and mapping system

    Authors: Tengfei Wang, Fucheng Lu, Jintao Qin, Taosheng Huang, Hui Kong, Ping Shen

    Abstract: Obtaining high-resolution, accurate channel topography and deposit conditions is the prior challenge for the study of channelized debris flow. Currently, widely used mapping technologies including satellite imaging and drone photogrammetry struggle to precisely observe channel interior conditions of mountainous long-deep gullies, particularly those in the Wenchuan Earthquake region. SLAM is an emerg…

    Submitted 24 January, 2024; originally announced January 2024.

  28. arXiv:2401.07022  [pdf, other]

    cs.LG cs.AI cs.CL

    Edge-Enabled Anomaly Detection and Information Completion for Social Network Knowledge Graphs

    Authors: Fan Lu, Quan Qi, Huaibin Qin

    Abstract: In the rapidly advancing information era, various human behaviors are being precisely recorded in the form of data, including identity information, criminal records, and communication data. Law enforcement agencies can effectively maintain social security and precisely combat criminal activities by analyzing the aforementioned data. In comparison to traditional data analysis methods, deep learning…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: 20 pages, 6 figures, has been accepted by Wireless Network

  29. arXiv:2401.07009  [pdf, other]

    cs.CL cs.AI

    Joint Extraction of Uyghur Medicine Knowledge with Edge Computing

    Authors: Fan Lu, Quan Qi, Huaibin Qin

    Abstract: Medical knowledge extraction methods based on edge computing deploy deep learning models on edge devices to achieve localized entity and relation extraction. This approach avoids transferring substantial sensitive data to cloud data centers, effectively safeguarding the privacy of healthcare services. However, existing relation extraction methods mainly employ a sequential pipeline approach, which…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: 11 pages, 6 figures, has been accepted by Tsinghua Science and Technology

  30. arXiv:2312.15813  [pdf, other]

    cs.LG

    Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits!

    Authors: Tirth Patel, Fred Lu, Edward Raff, Charles Nicholas, Cynthia Matuszek, James Holt

    Abstract: Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines, meaning a 0.1% change can cause an overwhelming number of false positives. However, academic research is often restrained to public datasets on the order of ten thousand samples and is too small to detect improvements that may be relevant to ind…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: To appear in Conference on Applied Machine Learning for Information Security 2023

  31. arXiv:2312.15644  [pdf, other]

    cs.CV

    UVAGaze: Unsupervised 1-to-2 Views Adaptation for Gaze Estimation

    Authors: Ruicong Liu, Feng Lu

    Abstract: Gaze estimation has become a subject of growing interest in recent research. Most of the current methods rely on single-view facial images as input. Yet, it is hard for these approaches to handle large head angles, leading to potential inaccuracies in the estimation. To address this issue, adding a second-view camera can help better capture eye appearance. However, existing multi-view methods have…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: This paper is accepted by AAAI2024. Code has been released at https://github.com/MickeyLLG/UVAGaze

  32. arXiv:2312.09513  [pdf, other]

    cs.AI

    CGS-Mask: Making Time Series Predictions Intuitive for All

    Authors: Feng Lu, Wei Li, Yifei Sun, Cheng Song, Yufei Ren, Albert Y. Zomaya

    Abstract: Artificial intelligence (AI) has immense potential in time series prediction, but most explainable tools have limited capabilities in providing a systematic understanding of important features over time. These tools typically rely on evaluating a single time point, overlook the time ordering of inputs, and neglect the time-sensitive nature of time series applications. These factors make it difficu…

    Submitted 12 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI24

  33. arXiv:2312.01732  [pdf, other]

    cs.CV

    Likelihood-Aware Semantic Alignment for Full-Spectrum Out-of-Distribution Detection

    Authors: Fan Lu, Kai Zhu, Kecheng Zheng, Wei Zhai, Yang Cao

    Abstract: Full-spectrum out-of-distribution (F-OOD) detection aims to accurately recognize in-distribution (ID) samples while encountering semantic and covariate shifts simultaneously. However, existing out-of-distribution (OOD) detectors tend to overfit the covariance information and ignore intrinsic semantic correlation, inadequate for adapting to complex domain transformations. To address this issue, we…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 16 pages, 7 figures

  34. arXiv:2311.18216  [pdf, other]

    cs.CV cs.MM eess.IV

    FS-BAND: A Frequency-Sensitive Banding Detector

    Authors: Zijian Chen, Wei Sun, Zicheng Zhang, Ru Huang, Fangfang Lu, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

    Abstract: Banding artifact, also known as staircase-like contour, is a common quality annoyance that happens in compression, transmission, and similar scenarios, which largely affects the user's quality of experience (QoE). The banding distortion typically appears as relatively small pixel-wise variations in smooth backgrounds, which is difficult to analyze in the spatial domain but easily reflected in the frequency…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.17752

  35. arXiv:2311.17752  [pdf, other]

    cs.CV cs.DB cs.MM

    BAND-2k: Banding Artifact Noticeable Database for Banding Detection and Quality Assessment

    Authors: Zijian Chen, Wei Sun, Jun Jia, Fangfang Lu, Zicheng Zhang, Jing Liu, Ru Huang, Xiongkuo Min, Guangtao Zhai

    Abstract: Banding, also known as staircase-like contours, frequently occurs in flat areas of images/videos processed by the compression or quantization algorithms. As undesirable artifacts, banding destroys the original image structure, thus degrading users' quality of experience (QoE). In this paper, we systematically investigate the banding image quality assessment (IQA) problem, aiming to detect the imag…

    Submitted 29 November, 2023; originally announced November 2023.

  36. arXiv:2311.16618  [pdf, other]

    cs.CV

    Cross-level Attention with Overlapped Windows for Camouflaged Object Detection

    Authors: Jiepan Li, Fangxiao Lu, Nan Xue, Zhuohong Li, Hongyan Zhang, Wei He

    Abstract: Camouflaged objects adaptively fit their color and texture with the environment, which makes them indistinguishable from the surroundings. Current methods revealed that high-level semantic features can highlight the differences between camouflaged objects and the backgrounds. Consequently, they integrate high-level semantic features with low-level detailed features for accurate camouflaged object…

    Submitted 28 November, 2023; originally announced November 2023.

  37. arXiv:2311.15609  [pdf, other]

    cs.LG cs.CV

    A manometric feature descriptor with linear-SVM to distinguish esophageal contraction vigor

    Authors: Jialin Liu, Lu Yan, Xiaowei Liu, Yuzhuo Dai, Fanggen Lu, Yuanting Ma, Muzhou Hou, Zheng Wang

    Abstract: In clinical practice, if a patient presents with nonmechanical obstructive dysphagia, esophageal chest pain, and gastroesophageal reflux symptoms, the physician will usually assess the esophageal dynamic function. High-resolution manometry (HRM) is a commonly used clinical technique for comprehensive and objective detection of esophageal dynamic function. However, after the results of HRM are obtaine…

    Submitted 27 November, 2023; originally announced November 2023.

  38. arXiv:2310.19978  [pdf, other]

    cs.LG stat.CO stat.ML

    Scaling Up Differentially Private LASSO Regularized Logistic Regression via Faster Frank-Wolfe Iterations

    Authors: Edward Raff, Amol Khanna, Fred Lu

    Abstract: To the best of our knowledge, there are no methods today for training differentially private regression models on sparse input data. To remedy this, we adapt the Frank-Wolfe algorithm for $L_1$ penalized linear regression to be aware of sparse inputs and to use them effectively. In doing so, we reduce the training time of the algorithm from $\mathcal{O}( T D S + T N S)$ to…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: To appear in the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  39. arXiv:2310.18874  [pdf, other]

    cs.CV cs.AI

    HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration

    Authors: Weiyi Xue, Fan Lu, Guang Chen

    Abstract: Outdoor LiDAR point clouds are typically large-scale and complexly distributed. To achieve efficient and accurate registration, emphasizing the similarity among local regions and prioritizing global local-to-local matching is of utmost importance, subsequent to which accuracy can be enhanced through cost-effective fine registration. In this paper, a novel hierarchical neural network with double at…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted by WACV2024

  40. arXiv:2310.07729  [pdf, other]

    cs.RO eess.SY

    Energy-Aware Routing Algorithm for Mobile Ground-to-Air Charging

    Authors: Bill Cai, Fei Lu, Lifeng Zhou

    Abstract: We investigate the problem of energy-constrained planning for a cooperative system of an Unmanned Ground Vehicle (UGV) and an Unmanned Aerial Vehicle (UAV). In scenarios where the UGV serves as a mobile base to ferry the UAV and as a charging station to recharge the UAV, we formulate a novel energy-constrained routing problem. To tackle this problem, we design an energy-aware routing algorithm, a…

    Submitted 6 August, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

  41. arXiv:2310.05184  [pdf, other]

    cs.CV

    AANet: Aggregation and Alignment Network with Semi-hard Positive Sample Mining for Hierarchical Place Recognition

    Authors: Feng Lu, Lijun Zhang, Shuting Dong, Baifan Chen, Chun Yuan

    Abstract: Visual place recognition (VPR) is one of the research hotspots in robotics, which uses visual information to locate robots. Recently, the hierarchical two-stage VPR methods have become popular in this field due to the trade-off between accuracy and efficiency. These methods retrieve the top-k candidate images using the global features in the first stage, then re-rank the candidates by matching the…

    Submitted 8 October, 2023; originally announced October 2023.

    Comments: ICRA2023

  42. arXiv:2309.14907  [pdf, other]

    cs.LG cs.AI

    Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias

    Authors: Zhihao Shi, Jie Wang, Fanghua Lu, Hanzhu Chen, Defu Lian, Zheng Wang, Jieping Ye, Feng Wu

    Abstract: Node representation learning on attributed graphs -- whose nodes are associated with rich attributes (e.g., texts and protein sequences) -- plays a crucial role in many important downstream tasks. To encode the attributes and graph structures simultaneously, recent studies integrate pre-trained models with graph neural networks (GNNs), where pre-trained models serve as node encoders (NEs) to encod…

    Submitted 26 September, 2023; originally announced September 2023.

  43. arXiv:2309.05105  [pdf, other]

    math.OC cs.LG

    Convex Q Learning in a Stochastic Environment: Extended Version

    Authors: Fan Lu, Sean Meyn

    Abstract: The paper introduces the first formulation of convex Q-learning for Markov decision processes with function approximation. The algorithms and theory rest on a relaxation of a dual of Manne's celebrated linear programming characterization of optimal control. The main contributions firstly concern properties of the relaxation, described as a deterministic convex program: we identify conditions for a…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: Extended version of "Convex Q-learning in a stochastic environment", IEEE Conference on Decision and Control, 2023 (to appear)

    MSC Class: 68T05; 93E35; 62L20; 93E20

  44. arXiv:2309.02165  [pdf, other]

    cs.CV

    PCFGaze: Physics-Consistent Feature for Appearance-based Gaze Estimation

    Authors: Yiwei Bao, Feng Lu

    Abstract: Although recent deep learning based gaze estimation approaches have achieved much improvement, we still know little about how gaze features are connected to the physics of gaze. In this paper, we try to answer this question by analyzing the gaze feature manifold. Our analysis revealed the insight that the geodesic distance between gaze features is consistent with the gaze differences between sampl…

    Submitted 5 September, 2023; originally announced September 2023.

  45. arXiv:2308.10310  [pdf, other]

    cs.CV

    DVGaze: Dual-View Gaze Estimation

    Authors: Yihua Cheng, Feng Lu

    Abstract: Gaze estimation methods estimate gaze from facial appearance with a single camera. However, due to the limited view of a single camera, the captured facial appearance cannot provide complete facial information, which complicates the gaze estimation problem. Recently, camera devices have been rapidly updated. Dual cameras are affordable for users and have been integrated into many devices. This developmen…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  46. arXiv:2308.10207  [pdf]

    cs.HC

    Affective Digital Twins for Digital Human: Bridging the Gap in Human-Machine Affective Interaction

    Authors: Feng Lu, Bo Liu

    Abstract: In recent years, metaverse and digital humans have become important research and industry areas of focus. However, existing digital humans still lack realistic affective traits, making emotional interaction with humans difficult. Grounded in the developments of artificial intelligence, human-computer interaction, virtual reality, and affective computing, this paper proposes the concept and technic…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: 7 pages, in Chinese language, 6 figures

    Journal ref: Journal of Chinese Association for Artificial Intelligence, 13(3), 11-16 (2023)

  47. arXiv:2308.08884  [pdf, other]

    cs.CV

    SRMAE: Masked Image Modeling for Scale-Invariant Deep Representations

    Authors: Zhiming Wang, Lin Gu, Feng Lu

    Abstract: Due to the prevalence of scale variance in natural images, we propose to use image scale as a self-supervised signal for Masked Image Modeling (MIM). Our method involves selecting random patches from the input image and downsampling them to a low-resolution format. Our framework utilizes the latest advances in super-resolution (SR) to design the prediction head, which reconstructs the input from lo…

    Submitted 17 August, 2023; originally announced August 2023.

  48. arXiv:2308.00591  [pdf, other]

    cs.CV

    Visibility Enhancement for Low-light Hazy Scenarios

    Authors: Chaoqun Zhuang, Yunfei Liu, Sijia Wen, Feng Lu

    Abstract: Low-light hazy scenes commonly appear at dusk and early morning. The visual enhancement for low-light hazy images is an ill-posed problem. Even though numerous methods have been proposed for image dehazing and low-light enhancement respectively, simply integrating them cannot deliver pleasing results for this particular task. In this paper, we present a novel method to enhance visibility for low-l…

    Submitted 1 August, 2023; originally announced August 2023.

  49. arXiv:2308.00240  [pdf, other]

    cs.CL

    Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation

    Authors: Geyang Guo, Jiarong Yang, Fengyuan Lu, Jiaxin Qin, Tianyi Tang, Wayne Xin Zhao

    Abstract: Interpreting ancient Chinese has been the key to comprehending vast Chinese literature, tradition, and civilization. In this paper, we propose Erya for ancient Chinese translation. From a dataset perspective, we collect, clean, and classify ancient Chinese materials from various sources, forming the most extensive ancient Chinese resource to date. From a model perspective, we devise Erya training…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted by NLPCC 2023

  50. arXiv:2307.13855  [pdf, other]

    cs.CV cs.LG

    Exploring the Sharpened Cosine Similarity

    Authors: Skyler Wu, Fred Lu, Edward Raff, James Holt

    Abstract: Convolutional layers have long served as the primary workhorse for image classification. Recently, an alternative to convolution was proposed using the Sharpened Cosine Similarity (SCS), which in theory may serve as a better feature detector. While multiple sources report promising results, there has not been to date a full-scale empirical analysis of neural network performance using these new lay…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted to I Can't Believe It's Not Better Workshop (ICBINB) at NeurIPS 2022