
Showing 1–50 of 170 results for author: Yan, W

  1. arXiv:2407.12788  [pdf, other]

    cs.CV cs.AI

    SS-ADA: A Semi-Supervised Active Domain Adaptation Framework for Semantic Segmentation

    Authors: Weihao Yan, Yeqiang Qian, Yueyuan Li, Tao Li, Chunxiang Wang, Ming Yang

    Abstract: Semantic segmentation plays an important role in intelligent vehicles, providing pixel-level semantic information about the environment. However, labeling is expensive and time-consuming when a semantic segmentation model is applied to new driving scenarios. To reduce the costs, semi-supervised semantic segmentation methods have been proposed to leverage large quantities of unlabeled imag…

    Submitted 17 June, 2024; originally announced July 2024.

    Comments: 12 pages, 13 figures, 8 tables

  2. arXiv:2407.07840  [pdf, other]

    cs.CV cs.CL

    Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison

    Authors: Qian Yang, Weixiang Yan, Aishwarya Agrawal

    Abstract: Despite tremendous advancements, current state-of-the-art Vision-Language Models (VLMs) are still far from perfect. They tend to hallucinate and may generate biased responses. In such circumstances, having a way to assess the reliability of a given response generated by a VLM is quite useful. Existing methods, such as estimating uncertainty using answer likelihoods or prompt-based confidence gener…

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Preprint

  3. arXiv:2407.07732  [pdf]

    cs.HC cs.AI

    Text2VP: Generative AI for Visual Programming and Parametric Modeling

    Authors: Guangxi Feng, Wei Yan

    Abstract: The integration of generative artificial intelligence (AI) into architectural design has witnessed a significant evolution, marked by the recent advancements in AI to generate text, images, and 3D models. However, no models exist for text-to-parametric models that are used in architectural design for generating various design options, including free-form designs, and optimizing the design options.…

    Submitted 8 June, 2024; originally announced July 2024.

    Comments: Demonstration Video: https://www.youtube.com/playlist?list=PLUOmOLuLSaDWss2En2buixBxvTPy-lDvA

  4. arXiv:2407.06937  [pdf, other]

    cs.CV

    HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

    Authors: Guian Fang, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang

    Abstract: Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the firs…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  5. arXiv:2407.06362  [pdf, other]

    cs.RO physics.app-ph

    Self-deployable contracting-cord metamaterials with tunable mechanical properties

    Authors: Wenzhong Yan, Talmage Jones, Christopher L. Jawetz, Ryan H. Lee, Jonathan B. Hopkins, Ankur Mehta

    Abstract: Recent advances in active materials and fabrication techniques have enabled the production of cyclically self-deployable metamaterials with an expanded functionality space. However, designing metamaterials that possess continuously tunable mechanical properties after self-deployment remains a challenge, notwithstanding its importance. Inspired by push puppets, we introduce an efficient design stra…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 6 figures

    Journal ref: Materials Horizons (2024)

  6. arXiv:2406.13890  [pdf, other]

    cs.CL cs.AI

    ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

    Authors: Weixiang Yan, Haitian Liu, Tengxiao Wu, Qian Chen, Wen Wang, Haoyuan Chai, Jiayi Wang, Weishan Zhao, Yixin Zhang, Renjun Zhang, Li Zhu

    Abstract: LLMs have achieved significant performance progress in various NLP applications. However, LLMs still struggle to meet the strict requirements for accuracy and reliability in the medical field and face many challenges in clinical applications. Existing clinical diagnostic evaluation benchmarks for evaluating medical agents powered by LLMs have severe limitations. Firstly, most existing medical eval…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2405.19547  [pdf, other]

    cs.LG cs.CV

    CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning

    Authors: Yiping Wang, Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du

    Abstract: Data selection has emerged as a core issue for large-scale visual-language model pretraining (e.g., CLIP), particularly with noisy web-curated datasets. Three main data selection approaches are: (1) leveraging external non-CLIP models to aid data selection, (2) training new CLIP-style embedding models that are more effective at selecting high-quality data than the original OpenAI CLIP model, and (3…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: This paper supersedes our previous VAS paper (arXiv:2402.02055)

  8. arXiv:2405.10879  [pdf, other]

    cs.CV

    One registration is worth two segmentations

    Authors: Shiqi Huang, Tingfa Xu, Ziyi Shen, Shaheer Ullah Saeed, Wen Yan, Dean Barratt, Yipeng Hu

    Abstract: The goal of image registration is to establish spatial correspondence between two or more images, traditionally through dense displacement fields (DDFs) or parametric transformations (e.g., rigid, affine, and splines). Rethinking the existing paradigms of achieving alignment via spatial transformations, we uncover an alternative but more intuitive correspondence representation: a set of correspond…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Early Accepted by MICCAI2024

  9. arXiv:2405.06247  [pdf, other]

    cs.LG cs.AI cs.CR

    Disttack: Graph Adversarial Attacks Toward Distributed GNN Training

    Authors: Yuxiang Zhang, Xin Liu, Meng Wu, Wei Yan, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Graph Neural Networks (GNNs) have emerged as potent models for graph learning. Distributing the training process across multiple computing nodes is the most promising solution to address the challenges of ever-growing real-world graphs. However, current adversarial attack methods on GNNs neglect the characteristics and applications of the distributed scenario, leading to suboptimal performance and…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted by 30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024)

  10. arXiv:2405.00253  [pdf, other]

    cs.CL cs.SE

    CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

    Authors: Yuchen Tian, Weixiang Yan, Qian Yang, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma

    Abstract: Large Language Models (LLMs) have made significant progress in code generation, providing developers with unprecedented automated programming support. However, LLMs often generate code that is syntactically correct and even semantically plausible but may not execute as expected or meet specified requirements. This phenomenon of hallucinations in the code domain has not been systematically explored…

    Submitted 26 June, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

  11. arXiv:2404.07181  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development

    Authors: Sheng Gong, Yumin Zhang, Zhenliang Mu, Zhichen Pu, Hongyi Wang, Zhiao Yu, Mengyi Chen, Tianze Zheng, Zhi Wang, Lifei Chen, Xiaojie Wu, Shaochen Shi, Weihao Gao, Wen Yan, Liang Xiang

    Abstract: Despite the widespread applications of machine learning force field (MLFF) on solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular Simulation Booster), a novel framework for molecular dynamics (MD) simulations, with a demonstration of its capabilities in the context of liquid electrolytes for l…

    Submitted 22 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  12. arXiv:2404.04485  [pdf, other]

    cs.HC

    Majority Voting of Doctors Improves Appropriateness of AI Reliance in Pathology

    Authors: Hongyan Gu, Chunxu Yang, Shino Magaki, Neda Zarrin-Khameh, Nelli S. Lakis, Inma Cobos, Negar Khanlou, Xinhai R. Zhang, Jasmeet Assi, Joshua T. Byers, Ameer Hamza, Karam Han, Anders Meyer, Hilda Mirbaha, Carrie A. Mohila, Todd M. Stevens, Sara L. Stone, Wenzhong Yan, Mohammad Haeri, Xiang 'Anthony' Chen

    Abstract: As Artificial Intelligence (AI) makes advancements in medical decision-making, there is a growing need to ensure doctors develop appropriate reliance on AI to avoid adverse outcomes. However, existing methods for enabling appropriate AI reliance might encounter challenges when applied in the medical domain. In this regard, this work employs and provides the validation of an alternative ap…

    Submitted 16 June, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: 46 pages, 11 figures. Accepted by the International Journal of Human-Computer Studies

  13. arXiv:2404.00357  [pdf, other]

    cs.LG

    Revisiting Random Weight Perturbation for Efficiently Improving Generalization

    Authors: Tao Li, Qinghua Tao, Weihao Yan, Zehao Lei, Yingwen Wu, Kun Fang, Mingzhen He, Xiaolin Huang

    Abstract: Improving the generalization ability of modern deep neural networks (DNNs) is a fundamental challenge in machine learning. Two branches of methods have been proposed to seek flat minima and improve generalization: one led by sharpness-aware minimization (SAM) minimizes the worst-case neighborhood loss through adversarial weight perturbation (AWP), and the other minimizes the expected Bayes objecti…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to TMLR 2024

  14. arXiv:2403.16153  [pdf, other]

    cs.LG cs.AI

    One Masked Model is All You Need for Sensor Fault Detection, Isolation and Accommodation

    Authors: Yiwei Fu, Weizhong Yan

    Abstract: Accurate and reliable sensor measurements are critical for ensuring the safety and longevity of complex engineering systems such as wind turbines. In this paper, we propose a novel framework for sensor fault detection, isolation, and accommodation (FDIA) using masked models and self-supervised learning. Our proposed approach is a general time series modeling approach that can be applied to any neu…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

  15. Semantic Is Enough: Only Semantic Information For NeRF Reconstruction

    Authors: Ruibo Wang, Song Zhang, Ping Huang, Donghai Zhang, Wei Yan

    Abstract: Recent research that combines implicit 3D representation with semantic information, like Semantic-NeRF, has shown that the NeRF model can perform excellently in rendering 3D structures with semantic labels. This research aims to extend the Semantic Neural Radiance Fields (Semantic-NeRF) model by focusing solely on semantic output and removing the RGB output component. We reformulate the model and i…

    Submitted 24 March, 2024; originally announced March 2024.

  16. arXiv:2403.07943  [pdf, other]

    cs.LG cs.CR

    Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack

    Authors: Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, Dongrui Fan

    Abstract: Edge perturbation is a basic method to modify graph structures. It can be categorized into two veins based on their effects on the performance of graph neural networks (GNNs), i.e., graph data augmentation and attack. Surprisingly, both veins of edge perturbation methods employ the same operations, yet yield opposite effects on GNNs' accuracy. A distinct boundary between these methods in using edg…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 14P

  17. arXiv:2403.07408  [pdf, other]

    cs.CV

    NightHaze: Nighttime Image Dehazing via Self-Prior Learning

    Authors: Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Robby T. Tan

    Abstract: Masked autoencoder (MAE) shows that severe augmentation during training produces robust representations for high-level tasks. This paper brings the MAE-like framework to nighttime image enhancement, demonstrating that severe augmentation during training produces strong network priors that are resilient to real-world night haze degradations. We propose a novel nighttime image dehazing method with s…

    Submitted 12 March, 2024; originally announced March 2024.

  18. arXiv:2403.05146  [pdf, other]

    cs.CV

    Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy

    Authors: Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng

    Abstract: Gastric simulators with objective educational feedback have been proven useful for endoscopy training. Existing electronic simulators with feedback are however not commonly adopted due to their high cost. In this work, a motion-guided dual-camera tracker is proposed to provide reliable endoscope tip position feedback at a low cost inside a mechanical simulator for endoscopy skill evaluation, tackl…

    Submitted 20 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  19. arXiv:2403.04143  [pdf, other]

    cs.RO

    Incremental Bayesian Learning for Fail-Operational Control in Autonomous Driving

    Authors: Lei Zheng, Rui Yang, Zengqi Peng, Wei Yan, Michael Yu Wang, Jun Ma

    Abstract: Abrupt maneuvers by surrounding vehicles (SVs) can typically lead to safety concerns and affect the task efficiency of the ego vehicle (EV), especially with model uncertainties stemming from environmental disturbances. This paper presents a real-time fail-operational controller that ensures the asymptotic convergence of an uncertain EV to a safe state, while preserving task efficiency in dynamic e…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures, accepted for publication in the 22nd European Control Conference (ECC 2024)

  20. arXiv:2403.02611  [pdf, other]

    cs.CV cs.AI

    A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

    Authors: Yuelin Zhang, Pengyu Zheng, Wanquan Yan, Chengyu Fang, Shing Shin Cheng

    Abstract: Defocus blur is a persistent problem in microscope imaging that poses harm to pathology interpretation and medical intervention in cell microscopy and microscope surgery. To address this problem, a unified framework including the multi-pyramid transformer (MPT) and extended frequency contrastive regularization (EFCR) is proposed to tackle two outstanding challenges in microscopy deblur: longer att…

    Submitted 3 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  21. arXiv:2402.13778  [pdf, other]

    cs.CV

    Weakly supervised localisation of prostate cancer using reinforcement learning for bi-parametric MR images

    Authors: Martynas Pocius, Wen Yan, Dean C. Barratt, Mark Emberton, Matthew J. Clarkson, Yipeng Hu, Shaheer U. Saeed

    Abstract: In this paper we propose a reinforcement learning based weakly supervised system for localisation. We train a controller function to localise regions of interest within an image by introducing a novel reward definition that utilises non-binarised classification probability, generated by a pre-trained binary classifier which classifies object presence in images or image crops. The object-presence c…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted at ISBI 2024 (21st IEEE International Symposium on Biomedical Imaging)

  22. arXiv:2402.10728  [pdf, other]

    eess.IV cs.CV

    Semi-weakly-supervised neural network training for medical image registration

    Authors: Yiwen Li, Yunguan Fu, Iani J. M. B. Gayo, Qianye Yang, Zhe Min, Shaheer U. Saeed, Wen Yan, Yipei Wang, J. Alison Noble, Mark Emberton, Matthew J. Clarkson, Dean C. Barratt, Victor A. Prisacariu, Yipeng Hu

    Abstract: For training registration networks, weak supervision from segmented corresponding regions-of-interest (ROIs) has been proven effective for (a) supplementing unsupervised methods, and (b) being used independently in registration tasks in which unsupervised losses are unavailable or ineffective. This correspondence-informing supervision entails cost in annotation that requires significant specialis…

    Submitted 16 February, 2024; originally announced February 2024.

  23. arXiv:2402.08268  [pdf, other]

    cs.LG

    World Model on Million-Length Video And Language With Blockwise RingAttention

    Authors: Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel

    Abstract: Current language models fall short in understanding aspects of the world not easily described in words, and struggle with complex, long-form tasks. Video sequences offer valuable temporal information absent in language and static images, making them attractive for joint modeling with language. Such models could develop an understanding of both human textual knowledge and the physical world, enablin…

    Submitted 14 March, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  24. arXiv:2402.02414  [pdf, other]

    cs.HC cs.CV

    Navigate Biopsy with Ultrasound under Augmented Reality Device: Towards Higher System Performance

    Authors: Haowei Li, Wenqing Yan, Jiasheng Zhao, Yuqi Ji, Long Qian, Hui Ding, Zhe Zhao, Guangzhi Wang

    Abstract: Purpose: Biopsies play a crucial role in determining the classification and staging of tumors. Ultrasound is frequently used in this procedure to provide real-time anatomical information. Using augmented reality (AR), surgeons can visualize ultrasound data and spatial navigation information seamlessly integrated with real tissues. This innovation facilitates faster and more precise biopsy operatio…

    Submitted 4 February, 2024; originally announced February 2024.

  25. arXiv:2402.02055  [pdf, other]

    cs.LG cs.AI

    Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning

    Authors: Yiping Wang, Yifang Chen, Wendan Yan, Kevin Jamieson, Simon Shaolei Du

    Abstract: In recent years, data selection has emerged as a core issue for large-scale visual-language model pretraining, especially on noisy web-curated datasets. One widely adopted strategy assigns quality scores such as CLIP similarity for each sample and retains the data pairs with the highest scores. However, these approaches are agnostic of data distribution and always fail to select the most informati…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 17 pages, 4 figures

  26. arXiv:2401.17642  [pdf, other]

    cs.CV

    Exploring the Common Appearance-Boundary Adaptation for Nighttime Optical Flow

    Authors: Hanyu Zhou, Yi Chang, Haoyue Liu, Wending Yan, Yuxing Duan, Zhiwei Shi, Luxin Yan

    Abstract: We investigate a challenging task of nighttime optical flow, which suffers from weakened texture and amplified noise. These degradations weaken discriminative visual features, thus causing invalid motion feature matching. Typically, existing methods employ domain adaptation to transfer knowledge from auxiliary domain to nighttime domain in either input visual space or output motion space. However,…

    Submitted 31 January, 2024; originally announced January 2024.

    Journal ref: International Conference on Learning Representations (ICLR), 2024

  27. arXiv:2401.12645  [pdf, ps, other]

    cs.IT cs.LG eess.SP

    On the Robustness of Deep Learning-aided Symbol Detectors to Varying Conditions and Imperfect Channel Knowledge

    Authors: Chin-Hung Chen, Boris Karanov, Wim van Houtum, Wu Yan, Alex Young, Alex Alvarado

    Abstract: Recently, a data-driven Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm tailored to channels with intersymbol interference has been introduced. This so-called BCJRNet algorithm utilizes neural networks to calculate channel likelihoods. BCJRNet has demonstrated resilience against inaccurate channel tap estimations when applied to a time-invariant channel with ideal exponential decay profiles. However, it…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted paper at IEEE Wireless Communications and Networking Conference (WCNC) 2024

  28. arXiv:2401.08604  [pdf, other]

    cs.CV cs.AI

    SAM4UDASS: When SAM Meets Unsupervised Domain Adaptive Semantic Segmentation in Intelligent Vehicles

    Authors: Weihao Yan, Yeqiang Qian, Xingyuan Chen, Hanyang Zhuang, Chunxiang Wang, Ming Yang

    Abstract: Semantic segmentation plays a critical role in enabling intelligent vehicles to comprehend their surrounding environments. However, deep learning-based methods usually perform poorly in domain shift scenarios due to the lack of labeled data for training. Unsupervised domain adaptation (UDA) techniques have emerged to bridge the gap across different driving scenes and enhance model performance on u…

    Submitted 22 November, 2023; originally announced January 2024.

    Comments: 10 pages, 9 figures, 9 tables

  29. Semantic Segmentation in Multiple Adverse Weather Conditions with Domain Knowledge Retention

    Authors: Xin Yang, Wending Yan, Yuan Yuan, Michael Bi Mi, Robby T. Tan

    Abstract: Semantic segmentation's performance is often compromised when applied to unlabeled adverse weather conditions. Unsupervised domain adaptation is a potential approach to enhancing the model's adaptability and robustness to adverse weather. However, existing methods encounter difficulties when sequentially adapting the model to multiple unlabeled adverse weather conditions. They struggle to acquire…

    Submitted 14 January, 2024; originally announced January 2024.

  30. arXiv:2401.02011  [pdf, ps, other]

    cs.LG math.OC

    Decentralized Multi-Task Online Convex Optimization Under Random Link Failures

    Authors: Wenjing Yan, Xuanyu Cao

    Abstract: Decentralized optimization methods often entail information exchange between neighbors. Transmission failures can happen due to network congestion, hardware/software issues, communication outage, and other factors. In this paper, we investigate the random link failure problem in decentralized multi-task online convex optimization, where agents have individual decisions that are coupled with each o…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 18 pages, 2 figures

  31. arXiv:2401.00729  [pdf, other]

    cs.CV

    NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction

    Authors: Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Shunli Zhang, Robby Tan

    Abstract: Existing deep-learning-based methods for nighttime video deraining rely on synthetic data due to the absence of real-world paired data. However, the intricacies of the real world, particularly with the presence of light effects and low-light regions affected by noise, create significant domain gaps, hampering synthetic-trained models in removing rain streaks properly and leading to over-saturation…

    Submitted 10 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI24

  32. arXiv:2311.18827  [pdf, other]

    cs.GR cs.AI cs.CV cs.LG cs.MM

    Motion-Conditioned Image Animation for Video Editing

    Authors: Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi

    Abstract: We introduce MoCA, a Motion-Conditioned Image Animation approach for video editing. It leverages a simple decomposition of the video editing problem into image editing followed by motion-conditioned image animation. Furthermore, given the lack of robust evaluation datasets for video editing, we introduce a new benchmark that measures edit capability across a wide variety of tasks, such as object r…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: Project page: https://facebookresearch.github.io/MoCA

  33. arXiv:2311.16856  [pdf, other]

    cs.LG eess.SP stat.ML

    Attentional Graph Neural Networks for Robust Massive Network Localization

    Authors: Wenzhong Yan, Juntao Wang, Feng Yin, Yang Tian, Abdelhak M. Zoubir

    Abstract: In recent years, Graph neural networks (GNNs) have emerged as a prominent tool for classification tasks in machine learning. However, their application in regression tasks remains underexplored. To tap the potential of GNNs in regression, this paper integrates GNNs with attention mechanism, a technique that revolutionized sequential learning tasks with its adaptability and robustness, to tackle a…

    Submitted 14 February, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  34. arXiv:2311.16337  [pdf]

    cs.HC cs.CV

    Multi-3D-Models Registration-Based Augmented Reality (AR) Instructions for Assembly

    Authors: Seda Tuzun Canadinc, Wei Yan

    Abstract: This paper introduces a novel, markerless, step-by-step, in-situ 3D Augmented Reality (AR) instruction method and its application - BRICKxAR (Multi 3D Models/M3D) - for small parts assembly. BRICKxAR (M3D) realistically visualizes rendered 3D assembly parts at the assembly location of the physical assembly model (Figure 1). The user controls the assembly process through a user interface. BRICKxAR…

    Submitted 28 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

  35. arXiv:2311.11056  [pdf, other]

    cs.RO cs.LG cs.SE

    Choose Your Simulator Wisely: A Review on Open-source Simulators for Autonomous Driving

    Authors: Yueyuan Li, Wei Yuan, Songan Zhang, Weihao Yan, Qiyuan Shen, Chunxiang Wang, Ming Yang

    Abstract: Simulators play a crucial role in autonomous driving, offering significant time, cost, and labor savings. Over the past few years, the number of simulators for autonomous driving has grown substantially. However, there is a growing concern about the validity of algorithms developed and evaluated in simulators, indicating a need for a thorough analysis of the development status of the simulators.…

    Submitted 26 December, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: 18 pages, 5 figures, 8 tables

  36. arXiv:2311.08588  [pdf, other]

    cs.CL cs.AI cs.SE

    CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation

    Authors: Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance on assisting humans in programming and facilitating programming automation. However, existing benchmarks for evaluating the code understanding and generation capacities of LLMs suffer from severe limitations. First, most benchmarks are insufficient as they focus on a narrow range of popular programming languages and specific tas…

    Submitted 7 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL 2024 main conference

  37. arXiv:2311.03679  [pdf, other]

    cs.CV eess.IV

    Unsupervised convolutional neural network fusion approach for change detection in remote sensing images

    Authors: Weidong Yan, Pei Yan, Li Cao

    Abstract: With the rapid development of deep learning, a variety of change detection methods based on deep learning have emerged in recent years. However, these methods usually require a large number of training samples to train the network model, so it is very expensive. In this paper, we introduce a completely unsupervised shallow convolutional neural network (USCNN) fusion approach for change detection.…

    Submitted 6 November, 2023; originally announced November 2023.

  38. arXiv:2310.14784  [pdf, other]

    cs.LG cs.AI

    An Efficient Imbalance-Aware Federated Learning Approach for Wearable Healthcare with Autoregressive Ratio Observation

    Authors: Wenhao Yan, He Li, Kaoru Ota, Mianxiong Dong

    Abstract: Widely available healthcare services are now getting popular because of advancements in wearable sensing techniques and mobile edge computing. People's health information is collected by edge devices such as smartphones and wearable bands for further analysis on servers, which then send back suggestions and alerts for abnormal conditions. The recent emergence of federated learning allows users to train…

    Submitted 30 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: submitted to IEEE OJCS in Oct. 2023, under review

  39. arXiv:2310.13933  [pdf, other]

    cs.IT eess.SP

    Wideband Beamforming for STAR-RIS-assisted THz Communications with Three-Side Beam Split

    Authors: Wencai Yan, Wanming Hao, Gangcan Sun, Chongwen Huang, Qingqing Wu

    Abstract: In this paper, we consider the simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-assisted THz communications with three-side beam split. Except for the beam split at the base station (BS), we analyze the double-side beam split at the STAR-RIS for the first time. To relieve the double-side beam split effect, we propose a time delayer (TD)-based fully-connected…

    Submitted 21 October, 2023; originally announced October 2023.

  40. arXiv:2310.13917  [pdf, other]

    cs.IT eess.SP

    Beamforming Design for the Distributed RISs-aided THz Communications with Double-Layer True Time Delays

    Authors: Gangcan Sun, Wencai Yan, Wanming Hao, Chongwen Huang, Chau Yuen

    Abstract: In this paper, we investigate the reconfigurable intelligent surface (RIS)-aided terahertz (THz) communication system with the sparse radio frequency chains antenna structure at the base station (BS). To overcome the beam split of the BS, different from the conventional single-layer true-time-delay (TTD) scheme, we propose a double-layer TTD scheme that can effectively reduce the number of large-r…

    Submitted 21 October, 2023; originally announced October 2023.

  41. arXiv:2310.04951  [pdf, other]

    cs.AI cs.PL

    CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation

    Authors: Weixiang Yan, Yuchen Tian, Yunzhe Li, Qian Chen, Wen Wang

    Abstract: Recent code translation techniques exploit neural machine translation models to translate source code from one programming language to another to satisfy production compatibility or to improve efficiency of codebase maintenance. Most existing code translation datasets only focus on a single pair of popular programming languages. To advance research on code translation and meet diverse requirements…

    Submitted 24 October, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: Accepted by Findings of EMNLP 2023

  42. arXiv:2309.14404  [pdf]

    q-bio.QM cs.LG

    pLMFPPred: a novel approach for accurate prediction of functional peptides integrating embedding from pre-trained protein language model and imbalanced learning

    Authors: Zebin Ma, Yonglin Zou, Xiaobin Huang, Wenjin Yan, Hao Xu, Jiexin Yang, Ying Zhang, Jinqi Huang

    Abstract: Functional peptides have the potential to treat a variety of diseases. Their good therapeutic efficacy and low toxicity make them ideal therapeutic agents. Artificial intelligence-based computational strategies can help quickly identify new functional peptides from collections of protein sequences and discover their different functions. Using protein language model-based embeddings (ESM-2), we deve…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 20 pages, 5 figures, under review

  43. arXiv:2309.12618  [pdf, ps, other]

    cs.LG

    Zero-Regret Performative Prediction Under Inequality Constraints

    Authors: Wenjing Yan, Xuanyu Cao

    Abstract: Performative prediction is a recently proposed framework where predictions guide decision-making and hence influence future data distributions. Such performative phenomena are ubiquitous in various areas, such as transportation, finance, public policy, and recommendation systems. To date, work on performative prediction has only focused on unconstrained scenarios, neglecting the fact that many rea…

    Submitted 22 September, 2023; originally announced September 2023.

  44. arXiv:2309.06267  [pdf, ps, other]

    cs.IT

    A Complete Proof of an Important Theorem for Variable-to-Variable Length Codes

    Authors: Wei Yan, Yunghsiang S. Han

    Abstract: Variable-to-variable length (VV) codes are a class of lossless source coding. As their name implies, VV codes encode a variable-length sequence of source symbols into a variable-length codeword. This paper will give a complete proof of an important theorem for variable-to-variable length codes.

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2204.07398

  45. arXiv:2308.01738  [pdf, other]

    cs.CV

    Enhancing Visibility in Nighttime Haze Images Using Guided APSF and Gradient Adaptive Convolution

    Authors: Yeying Jin, Beibei Lin, Wending Yan, Yuan Yuan, Wei Ye, Robby T. Tan

    Abstract: Visibility in hazy nighttime scenes is frequently reduced by multiple factors, including low light, intense glow, light scattering, and the presence of multicolored light sources. Existing nighttime dehazing methods often struggle with handling glow or low-light conditions, resulting in either excessively dark visuals or unsuppressed glow outputs. In this paper, we enhance the visibility from a si…

    Submitted 21 January, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM'MM2023, https://github.com/jinyeying/nighttime_dehaze

    Journal ref: Published in ACM'MM2023

  46. arXiv:2308.00227  [pdf]

    cs.HC cs.AI

    Experiments on Generative AI-Powered Parametric Modeling and BIM for Architectural Design

    Authors: Jaechang Ko, John Ajibefun, Wei Yan

    Abstract: This paper introduces a new architectural design framework that utilizes generative AI tools including ChatGPT and Veras with parametric modeling and Building Information Modeling (BIM) to enhance the design process. The study experiments with the potential of ChatGPT and generative AI in 3D architectural design, extending beyond its use in text and 2D image generation. The proposed framework prom…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: 18 pages, 11 figures, 5 tables

  47. arXiv:2307.15864  [pdf, other]

    cs.IT

    An Entropy Coding Based on Binary Encoding for Mixed-Radix Digits

    Authors: Na Wang, Wei Yan, Sian-Jheng Lin, Yuliang Huang

    Abstract: Radix conversion of numeric data is an indispensable component in any complete analysis of digital computation. In this paper, we first propose a binary encoding for mixed-radix digits. Second, a variant of rANS coding based on this conversion is given, which supports parallel decoding. The simulations show that the proposed coding in serial mode has a higher throughput than the baselin…

    Submitted 12 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

  48. Combiner and HyperCombiner Networks: Rules to Combine Multimodality MR Images for Prostate Cancer Localisation

    Authors: Wen Yan, Bernard Chiu, Ziyi Shen, Qianye Yang, Tom Syer, Zhe Min, Shonit Punwani, Mark Emberton, David Atkinson, Dean C. Barratt, Yipeng Hu

    Abstract: One of the distinct characteristics in radiologists' reading of multiparametric prostate MR scans, using reporting systems such as PI-RADS v2.1, is to score individual types of MR modalities, T2-weighted, diffusion-weighted, and dynamic contrast-enhanced, and then combine these image-modality-specific scores using standardised decision rules to predict the likelihood of clinically significant canc…

    Submitted 20 January, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 30 pages, 6 figures

    MSC Class: 68T07

    Journal ref: Medical Image Analysis, vol. 91, 103030 (2024), Elsevier

  49. arXiv:2307.08221  [pdf, other]

    cs.RO

    NDT-Map-Code: A 3D global descriptor for real-time loop closure detection in lidar SLAM

    Authors: Lizhou Liao, Wenlei Yan, Li Sun, Xinhui Bai, Zhenxing You, Hongyuan Yuan, Chunyun Fu

    Abstract: Loop-closure detection, also known as place recognition, aims to identify previously visited locations and is an essential component of a SLAM system. Existing research on lidar-based loop closure heavily relies on dense point clouds and 360 FOV lidars. This paper proposes an out-of-the-box NDT (Normal Distribution Transform) based global descriptor, NDT-Map-Code, designed for both on-road driving a…

    Submitted 20 March, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: 8 pages, 6 figures, 4 tables

  50. arXiv:2306.15490  [pdf, other]

    eess.IV cs.CV

    EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction using Head-Mounted Augmented Reality Device

    Authors: Haowei Li, Wenqing Yan, Du Liu, Long Qian, Yuxing Yang, Yihao Liu, Zhe Zhao, Hui Ding, Guangzhi Wang

    Abstract: Augmented Reality (AR) has been used to facilitate surgical guidance during External Ventricular Drain (EVD) surgery, reducing the risks of misplacement in manual operations. During this procedure, the key challenge is accurately estimating the spatial relationship between pre-operative images and actual patient anatomy in an AR environment. This research proposes a novel framework utilizing Time of…

    Submitted 3 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.