Showing 1–50 of 119 results for author: Niu, L

  1. arXiv:2407.03008  [pdf, other]

    cs.CV

    Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering

    Authors: Zhaohe Liao, Jiangtong Li, Li Niu, Liqing Zhang

    Abstract: Despite the recent progress made in Video Question-Answering (VideoQA), these methods typically function as black-boxes, making it difficult to understand their reasoning processes and perform consistent compositional reasoning. To address these challenges, we propose a \textit{model-agnostic} Video Alignment and Answer Aggregation (VA$^{3}$) framework, which is capable of enhancing both compositi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 10 pages, CVPR

    Journal ref: CVPR (2024) 13395-13404

  2. arXiv:2407.02894  [pdf, other]

    cs.CL cs.AI

    Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation

    Authors: Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su

    Abstract: In-image machine translation (IIMT) aims to translate an image containing texts in source language into an image containing translations in target language. In this regard, conventional cascaded methods suffer from issues such as error propagation, massive parameters, and difficulties in deployment and retaining visual characteristics of the input image. Thus, constructing end-to-end models has be…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024 Findings

  3. arXiv:2406.12935  [pdf, other]

    cs.CR cs.AI cs.LG

    ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

    Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran

    Abstract: Large language models (LLMs) are expected to follow instructions from users and engage in conversations. Techniques to enhance LLMs' instruction-following capabilities typically fine-tune them using data structured according to a predefined chat template. Although chat templates are shown to be effective in optimizing LLM performance, their impact on safety alignment of LLMs has been less understo…

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.12257  [pdf, other]

    cs.AI cs.CR

    CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

    Authors: Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran

    Abstract: The remarkable performance of large language models (LLMs) in generation tasks has enabled practitioners to leverage publicly available models to power custom applications, such as chatbots and virtual assistants. However, the data used to train or fine-tune these LLMs is often undisclosed, allowing an attacker to compromise the data and inject backdoors into the models. In this paper, we develop…

    Submitted 18 June, 2024; originally announced June 2024.

  5. arXiv:2406.08464  [pdf, other]

    cs.CL cs.AI

    Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

    Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin

    Abstract: High-quality instruction data is critical for aligning large language models (LLMs). Although some models, such as Llama-3-Instruct, have open weights, their alignment data remain private, which hinders the democratization of AI. High human labor costs and a limited, predefined scope for prompting prevent existing open-source data creation methods from scaling effectively, potentially limiting the…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Link: https://magpie-align.github.io/

  6. arXiv:2405.20975  [pdf, other]

    cs.CR cs.AI cs.LG

    ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

    Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bo Li, Radha Poovendran

    Abstract: In Federated Learning (FL), a set of clients collaboratively train a machine learning model (called global model) without sharing their local training data. The local training data of clients is typically non-i.i.d. and heterogeneous, resulting in varying contributions from individual clients to the final performance of the global model. In response, many contribution evaluation methods were propo…

    Submitted 5 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: To appear in the 33rd USENIX Security Symposium, 2024

  7. arXiv:2405.11841  [pdf, other]

    cs.AI

    Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities

    Authors: Junqi Wang, Chunhui Zhang, Jiapeng Li, Yuxi Ma, Lixing Niu, Jiaheng Han, Yujia Peng, Yixin Zhu, Lifeng Fan

    Abstract: Facing the current debate on whether Large Language Models (LLMs) attain near-human intelligence levels (Mitchell & Krakauer, 2023; Bubeck et al., 2023; Kosinski, 2023; Shiffrin & Mitchell, 2023; Ullman, 2023), the current study introduces a benchmark for evaluating social intelligence, one of the most distinctive aspects of human cognition. We developed a comprehensive theoretical framework for s…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Also published in Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci), 2024

  8. arXiv:2404.10573  [pdf, other]

    cs.AI cs.CE q-bio.BM

    AAVDiff: Experimental Validation of Enhanced Viability and Diversity in Recombinant Adeno-Associated Virus (AAV) Capsids through Diffusion Generation

    Authors: Lijun Liu, Jiali Yang, Jianfei Song, Xinglin Yang, Lele Niu, Zeqi Cai, Hui Shi, Tingjun Hou, Chang-yu Hsieh, Weiran Shen, Yafeng Deng

    Abstract: Recombinant adeno-associated virus (rAAV) vectors have revolutionized gene therapy, but their broad tropism and suboptimal transduction efficiency limit their clinical applications. To overcome these limitations, researchers have focused on designing and screening capsid libraries to identify improved vectors. However, the large sequence space and limited resources present challenges in identifyin…

    Submitted 17 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  9. arXiv:2403.15234  [pdf, other]

    cs.CV

    Shadow Generation for Composite Image Using Diffusion model

    Authors: Qingyang Liu, Junqi You, Jianting Wang, Xinhao Tao, Bo Zhang, Li Niu

    Abstract: In the realm of image composition, generating realistic shadow for the inserted foreground remains a formidable challenge. Previous works have developed image-to-image translation models which are trained on paired training data. However, they are struggling to generate shadows with accurate shapes and intensities, hindered by data scarcity and inherent task complexity. In this paper, we resort to…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: accepted by CVPR2024

  10. arXiv:2403.13647  [pdf, other]

    cs.CV

    Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

    Authors: Junjie Chen, Jiebin Yan, Yuming Fang, Li Niu

    Abstract: Category-agnostic pose estimation (CAPE) aims to predict keypoints for arbitrary classes given a few support images annotated with keypoints. Existing methods only rely on the features extracted at support keypoints to predict or refine the keypoints on query image, but a few support feature vectors are local and inadequate for CAPE. Considering that human can quickly perceive potential keypoints…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Published in CVPR 2024

  11. arXiv:2403.02607  [pdf]

    cs.GT cs.AI

    MEBS: Multi-task End-to-end Bid Shading for Multi-slot Display Advertising

    Authors: Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Zhenzhe Zheng, Haoqi Zhang, Zhilin Zhang, Fan Wu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng

    Abstract: Online bidding and auction are crucial aspects of the online advertising industry. Conventionally, there is only one slot for ad display and most current studies focus on it. Nowadays, multi-slot display advertising is gradually becoming popular where many ads could be displayed in a list and shown as a whole to users. However, multi-slot display advertising leads to different cost-effectiveness.…

    Submitted 4 March, 2024; originally announced March 2024.

  12. arXiv:2402.19273  [pdf, other]

    cs.CL

    PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval

    Authors: He Zhu, Wenjia Zhang, Nuoxian Huang, Boyang Li, Luyao Niu, Zipei Fan, Tianle Lun, Yicheng Tao, Junyou Su, Zhaoya Gong, Chenyu Fang, Xing Liu

    Abstract: In the field of urban planning, general-purpose large language models often struggle to meet the specific needs of planners. Tasks like generating urban planning texts, retrieving related information, and evaluating planning documents pose unique challenges. To enhance the efficiency of urban professionals and overcome these obstacles, we introduce PlanGPT, the first specialized Large Language Mod…

    Submitted 29 February, 2024; originally announced February 2024.

  13. arXiv:2402.18677  [pdf, other]

    cs.RO cs.AI eess.SY

    Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks

    Authors: Hongchao Zhang, Luyao Niu, Andrew Clark, Radha Poovendran

    Abstract: Safety is a fundamental requirement of many robotic systems. Control barrier function (CBF)-based approaches have been proposed to guarantee the safety of robotic systems. However, the effectiveness of these approaches highly relies on the choice of CBFs. Inspired by the universal approximation power of neural networks, there is a growing trend toward representing CBFs using neural networks, leadi…

    Submitted 28 February, 2024; originally announced February 2024.

  14. arXiv:2402.11753  [pdf, other]

    cs.CL cs.AI

    ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

    Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

    Abstract: Safety is critical to the usage of large language models (LLMs). Multiple techniques such as data filtering and supervised fine-tuning have been developed to strengthen LLM safety. However, currently known techniques presume that corpora used for safety alignment of LLMs are solely interpreted by semantics. This assumption, however, does not hold in real-world applications, which leads to severe v…

    Submitted 7 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: To appear in ACL 2024

  15. arXiv:2402.08983  [pdf, other]

    cs.CR cs.AI cs.CL

    SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

    Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran

    Abstract: As large language models (LLMs) become increasingly integrated into real-world applications such as code generation and chatbot assistance, extensive efforts have been made to align LLM behavior with human values, including safety. Jailbreak attacks, aiming to provoke unintended and unsafe behaviors from LLMs, remain a significant/leading LLM safety threat. In this paper, we aim to defend LLMs aga…

    Submitted 7 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: To appear in ACL 2024

  16. arXiv:2402.08695  [pdf, other]

    cs.CR cs.LG

    Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors

    Authors: Dinuka Sahabandu, Xiaojun Xu, Arezoo Rajabi, Luyao Niu, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran

    Abstract: We propose and analyze an adaptive adversary that can retrain a Trojaned DNN and is also aware of SOTA output-based Trojaned model detectors. We show that such an adversary can ensure (1) high accuracy on both trigger-embedded and clean samples and (2) bypass detection. Our approach is based on an observation that the high dimensionality of the DNN parameters provides sufficient degrees of freedom…

    Submitted 12 February, 2024; originally announced February 2024.

  17. arXiv:2402.02224  [pdf, other]

    cs.CV

    MSPM: A Multi-Site Physiological Monitoring Dataset for Remote Pulse, Respiration, and Blood Pressure Estimation

    Authors: Jeremy Speth, Nathan Vance, Benjamin Sporrer, Lu Niu, Patrick Flynn, Adam Czajka

    Abstract: Visible-light cameras can capture subtle physiological biomarkers without physical contact with the subject. We present the Multi-Site Physiological Monitoring (MSPM) dataset, which is the first dataset collected to support the study of simultaneous camera-based vital signs estimation from multiple locations on the body. MSPM enables research on remote photoplethysmography (rPPG), respiration rate…

    Submitted 3 February, 2024; originally announced February 2024.

  18. arXiv:2402.01123  [pdf, other]

    cs.CV

    A Single Simple Patch is All You Need for AI-generated Image Detection

    Authors: Jiaxuan Chen, Jieteng Yao, Li Niu

    Abstract: The recent development of generative models unleashes the potential of generating hyper-realistic fake images. To prevent the malicious usage of fake images, AI-generated image detection aims to distinguish fake images from real images. However, existing method suffer from severe performance drop when detecting images generated by unseen generators. We find that generative models tend to focus on…

    Submitted 20 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  19. arXiv:2401.05562  [pdf, ps, other]

    cs.LG cs.CR cs.DC

    Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning

    Authors: Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Radha Poovendran

    Abstract: Federated learning (FL) enables multiple participants to train a global machine learning model without sharing their private training data. Peer-to-peer (P2P) FL advances existing centralized FL paradigms by eliminating the server that aggregates local models from participants and then updates the global model. However, P2P FL is vulnerable to (i) honest-but-curious participants whose objective is…

    Submitted 10 January, 2024; originally announced January 2024.

  20. arXiv:2312.11547  [pdf, other]

    cs.AI cs.LG

    A Unified Pre-training and Adaptation Framework for Combinatorial Optimization on Graphs

    Authors: Ruibin Zeng, Minglong Lei, Lingfeng Niu, Lan Cheng

    Abstract: Combinatorial optimization (CO) on graphs is a classic topic that has been extensively studied across many scientific and industrial fields. Recently, solving CO problems on graphs through learning methods has attracted great attention. Advanced deep learning methods, e.g., graph neural networks (GNNs), have been used to effectively assist the process of solving COs. However, current frameworks ba…

    Submitted 16 December, 2023; originally announced December 2023.

  21. arXiv:2312.10264  [pdf, other]

    cs.CV

    Progressive Painterly Image Harmonization from Low-level Styles to High-level Styles

    Authors: Li Niu, Yan Hong, Junyan Cao, Liqing Zhang

    Abstract: Painterly image harmonization aims to harmonize a photographic foreground object on the painterly background. Different from previous auto-encoder based harmonization networks, we develop a progressive multi-stage harmonization network, which harmonizes the composite foreground from low-level styles (e.g., color, simple texture) to high-level styles (e.g., complex texture). Our network has better…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  22. arXiv:2312.10263  [pdf, other]

    cs.CV

    Painterly Image Harmonization by Learning from Painterly Objects

    Authors: Li Niu, Junyan Cao, Yan Hong, Liqing Zhang

    Abstract: Given a composite image with photographic object and painterly background, painterly image harmonization targets at stylizing the composite object to be compatible with the background. Despite the competitive performance of existing painterly harmonization works, they did not fully leverage the painterly objects in artistic paintings. In this work, we explore learning from painterly objects for pa…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  23. arXiv:2311.16153  [pdf, other]

    cs.CR cs.AI

    Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

    Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Boxin Wang, Jinyuan Jia, Bo Li, Radha Poovendran

    Abstract: Large language models (LLMs) are increasingly deployed as the service backend for LLM-integrated applications such as code completion and AI-powered search. LLM-integrated applications serve as middleware to refine users' queries with domain-specific knowledge to better inform LLMs and enhance the responses. Despite numerous opportunities and benefits, LLM-integrated applications also introduce ne…

    Submitted 28 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

  24. arXiv:2311.08646  [pdf, other]

    cs.CV

    Painterly Image Harmonization via Adversarial Residual Learning

    Authors: Xudong Wang, Li Niu, Junyan Cao, Yan Hong, Liqing Zhang

    Abstract: Image compositing plays a vital role in photo editing. After inserting a foreground object into another background image, the composite image may look unnatural and inharmonious. When the foreground is photorealistic and the background is an artistic painting, painterly image harmonization aims to transfer the style of background painting to the foreground object, which is a challenging task due t…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted by WACV2024

  25. arXiv:2310.17131  [pdf, other]

    cs.CV

    Virtual Accessory Try-On via Keypoint Hallucination

    Authors: Junhong Gou, Bo Zhang, Li Niu, Jianfu Zhang, Jianlou Si, Chen Qian, Liqing Zhang

    Abstract: The virtual try-on task refers to fitting the clothes from one image onto another portrait image. In this paper, we focus on virtual accessory try-on, which fits accessory (e.g., glasses, ties) onto a face or portrait image. Unlike clothing try-on, which relies on human silhouette as guidance, accessory try-on warps the accessory into an appropriate location and shape to generate a plausible compo…

    Submitted 26 October, 2023; originally announced October 2023.

  26. arXiv:2310.15929  [pdf, other]

    cs.LG cs.AI cs.CL

    E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

    Authors: Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

    Abstract: Traditional pruning methods are known to be challenging to work in Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands. For the first time, we introduce the information entropy of hidden state features into a pruning metric design, namely E-Sparse, to improve the accuracy of N:M sparsity on LLM. E-Sparse employs the informat…

    Submitted 22 March, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

  27. arXiv:2310.07995  [pdf, other]

    cs.CV cs.AI

    HeightFormer: A Multilevel Interaction and Image-adaptive Classification-regression Network for Monocular Height Estimation with Aerial Images

    Authors: Zhan Chen, Yidan Zhang, Xiyu Qi, Yongqiang Mao, Xin Zhou, Lulu Niu, Hui Wu, Lei Wang, Yunping Ge

    Abstract: Height estimation has long been a pivotal topic within measurement and remote sensing disciplines, proving critical for endeavours such as 3D urban modelling, MR and autonomous driving. Traditional methods utilise stereo matching or multisensor fusion, both well-established techniques that typically necessitate multiple images from varying perspectives and adjunct sensors like SAR, leading to subs…

    Submitted 11 October, 2023; originally announced October 2023.

  28. arXiv:2309.15508  [pdf, other]

    cs.CV

    DreamCom: Finetuning Text-guided Inpainting Model for Image Composition

    Authors: Lingxiao Lu, Jiangtong Li, Bo Zhang, Li Niu

    Abstract: The goal of image composition is merging a foreground object into a background image to obtain a realistic composite image. Recently, generative composition methods are built on large pretrained diffusion models, due to their unprecedented image generation ability. However, they are weak in preserving the foreground object details. Inspired by recent text-to-image generation customized for certain…

    Submitted 24 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

  29. arXiv:2309.05457  [pdf, other]

    cs.CR cs.LG

    Unveiling the Sentinels: Assessing AI Performance in Cybersecurity Peer Review

    Authors: Liang Niu, Nian Xue, Christina Pöpper

    Abstract: Peer review is the method employed by the scientific community for evaluating research advancements. In the field of cybersecurity, the practice of double-blind peer review is the de-facto standard. This paper touches on the holy grail of peer reviewing and aims to shed light on the performance of AI in reviewing for academic security conferences. Specifically, we investigate the predictability of…

    Submitted 11 September, 2023; originally announced September 2023.

  30. arXiv:2308.15673  [pdf, other]

    cs.CR cs.LG

    MDTD: A Multi Domain Trojan Detector for Deep Neural Networks

    Authors: Arezoo Rajabi, Surudhi Asokraj, Fengqing Jiang, Luyao Niu, Bhaskar Ramasubramanian, Jim Ritcey, Radha Poovendran

    Abstract: Machine learning models that use deep neural networks (DNNs) are vulnerable to backdoor attacks. An adversary carrying out a backdoor attack embeds a predefined perturbation called a trigger into a small subset of input samples and trains the DNN such that the presence of the trigger in the input results in an adversary-desired output class. Such adversarial retraining however needs to ensure that…

    Submitted 2 September, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM Conference on Computer and Communications Security (ACM CCS) 2023

  31. arXiv:2308.10040  [pdf, other]

    cs.CV

    ControlCom: Controllable Image Composition using Diffusion Model

    Authors: Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu

    Abstract: Image composition targets at synthesizing a realistic composite image from a pair of foreground and background images. Recently, generative composition methods are built on large pretrained diffusion models to generate composite images, considering their great potential in image generation. However, they suffer from lack of controllability on foreground attributes and poor preservation of foregrou…

    Submitted 19 August, 2023; originally announced August 2023.

  32. arXiv:2308.09972  [pdf, other]

    cs.CV

    DESOBAv2: Towards Large-scale Real-world Dataset for Shadow Generation

    Authors: Qingyang Liu, Jianting Wang, Li Niu

    Abstract: Image composition refers to inserting a foreground object into a background image to obtain a composite image. In this work, we focus on generating plausible shadow for the inserted foreground object to make the composite image more realistic. To supplement the existing small-scale dataset DESOBA, we create a large-scale dataset called DESOBAv2 by using object-shadow detection and inpainting techn…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: arXiv admin note: text overlap with arXiv:2306.17358

  33. arXiv:2308.04990  [pdf, other]

    cs.CV

    Foreground Object Search by Distilling Composite Image Feature

    Authors: Bo Zhang, Jiacheng Sui, Li Niu

    Abstract: Foreground object search (FOS) aims to find compatible foreground objects for a given background image, producing realistic composite image. We observe that competitive retrieval performance could be achieved by using a discriminator to predict the compatibility of composite image, but this approach has unaffordable time cost. To this end, we propose a novel FOS method via distilling composite fea…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  34. Deep Image Harmonization in Dual Color Spaces

    Authors: Linfeng Tan, Jiangtong Li, Li Niu, Liqing Zhang

    Abstract: Image harmonization is an essential step in image composition that adjusts the appearance of composite foreground to address the inconsistency between foreground and background. Existing methods primarily operate in correlated $RGB$ color space, leading to entangled features and limited representation ability. In contrast, decorrelated color space (e.g., $Lab$) has decorrelated channels that provi…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: Accepted by ACMMM 2023

  35. Painterly Image Harmonization using Diffusion Model

    Authors: Lingxiao Lu, Jiangtong Li, Junyan Cao, Li Niu, Liqing Zhang

    Abstract: Painterly image harmonization aims to insert photographic objects into paintings and obtain artistically coherent composite images. Previous methods for this task mainly rely on inference optimization or generative adversarial network, but they are either very time-consuming or struggling at fine control of the foreground objects (e.g., texture and content details). To address these issues, we pro…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: Accepted by ACMMM 2023

  36. arXiv:2308.02177  [pdf, other]

    cs.CV

    Scene-aware Human Pose Generation using Transformer

    Authors: Jieteng Yao, Junjie Chen, Li Niu, Bin Sheng

    Abstract: Affordance learning considers the interaction opportunities for an actor in the scene and thus has wide application in scene understanding and intelligent robotics. In this paper, we focus on contextual affordance learning, i.e., using affordance as context to generate a reasonable human pose in a scene. Existing scene-aware human pose generation methods could be divided into two categories depend…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: Accepted by ACMMM 2023

  37. arXiv:2308.00376  [pdf, other]

    cs.CV

    Deep Image Harmonization with Learnable Augmentation

    Authors: Li Niu, Junyan Cao, Wenyan Cong, Liqing Zhang

    Abstract: The goal of image harmonization is adjusting the foreground appearance in a composite image to make the whole image harmonious. To construct paired training images, existing datasets adopt different ways to adjust the illumination statistics of foregrounds of real images to produce synthetic composite images. However, different datasets have considerable domain gap and the performances on small-sc…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  38. arXiv:2308.00356  [pdf, other]

    cs.CV

    Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation

    Authors: Li Niu, Linfeng Tan, Xinhao Tao, Junyan Cao, Fengjun Guo, Teng Long, Liqing Zhang

    Abstract: Given a composite image, image harmonization aims to adjust the foreground illumination to be consistent with background. Previous methods have explored transforming foreground features to achieve competitive performance. In this work, we show that using global information to guide foreground feature transformation could achieve significant improvement. Besides, we propose to transfer the foregrou…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  39. arXiv:2306.17358  [pdf, other]

    cs.CV

    Shadow Generation with Decomposed Mask Prediction and Attentive Shadow Filling

    Authors: Xinhao Tao, Junyan Cao, Yan Hong, Li Niu

    Abstract: Image composition refers to inserting a foreground object into a background image to obtain a composite image. In this work, we focus on generating plausible shadows for the inserted foreground object to make the composite image more realistic. To supplement the existing small-scale dataset, we create a large-scale dataset called RdSOBA with rendering techniques. Moreover, we design a two-stage ne…

    Submitted 4 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

  40. arXiv:2305.06671  [pdf, other]

    cs.CV

    WeditGAN: Few-Shot Image Generation via Latent Space Relocation

    Authors: Yuxuan Duan, Li Niu, Yan Hong, Liqing Zhang

    Abstract: In few-shot image generation, directly training GAN models on just a handful of images faces the risk of overfitting. A popular solution is to transfer the models pretrained on large source domains to small target ones. In this work, we introduce WeditGAN, which realizes model transfer by editing the intermediate latent codes $w$ in StyleGANs with learned constant offsets ($Δw$), discovering and c…

    Submitted 14 January, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: AAAI 2024, see Appendix for update notes of this version

  41. arXiv:2305.06553  [pdf, other]

    cs.CV

    WeLayout: WeChat Layout Analysis System for the ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

    Authors: Mingliang Zhang, Zhen Cao, Juntao Liu, Liqiang Niu, Fandong Meng, Jie Zhou

    Abstract: In this paper, we introduce WeLayout, a novel system for segmenting the layout of corporate documents, which stands for WeChat Layout Analysis System. Our approach utilizes a sophisticated ensemble of DINO and YOLO models, specifically developed for the ICDAR 2023 Competition on Robust Layout Segmentation. Our method significantly surpasses the baseline, securing a top position on the leaderboard…

    Submitted 11 May, 2023; originally announced May 2023.

  42. arXiv:2304.09785  [pdf, other]

    cs.CV

    Improving Post-Training Quantization on Object Detection with Task Loss-Guided Lp Metric

    Authors: Lin Niu, Jiawei Liu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu

    Abstract: Efficient inference for object detection networks is a major challenge on edge devices. Post-Training Quantization (PTQ), which transforms a full-precision model into low bit-width directly, is an effective and convenient approach to reduce model inference complexity. But it suffers severe accuracy drop when applied to complex tasks such as object detection. PTQ optimizes the quantization paramete…

    Submitted 7 May, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

  43. arXiv:2304.02005  [pdf, other]

    cs.AI cs.MA eess.SY

    Risk-Aware Distributed Multi-Agent Reinforcement Learning

    Authors: Abdullah Al Maruf, Luyao Niu, Bhaskar Ramasubramanian, Andrew Clark, Radha Poovendran

    Abstract: Autonomous cyber and cyber-physical systems need to perform decision-making, learning, and control in unknown environments. Such decision-making can be sensitive to multiple factors, including modeling errors, changes in costs, and impacts of events in the tails of probability distributions. Although multi-agent reinforcement learning (MARL) provides a framework for learning behaviors through repe…

    Submitted 4 April, 2023; originally announced April 2023.

  44. arXiv:2304.01089  [pdf, other]

    cs.CL

    RPTQ: Reorder-based Post-training Quantization for Large Language Models

    Authors: Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

    Abstract: Large-scale language models (LLMs) have demonstrated impressive performance, but their deployment presents challenges due to their significant memory usage. This issue can be alleviated through quantization. In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers. To address this challen…

    Submitted 17 May, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 18 pages

  45. arXiv:2303.10089  [pdf]

    cs.RO

    LP-SLAM: Language-Perceptive RGB-D SLAM system based on Large Language Model

    Authors: Weiyi Zhang, Yushi Guo, Liting Niu, Peijun Li, Chun Zhang, Zeyu Wan, Jiaxiang Yan, Fasih Ud Din Farrukh, Debing Zhang

    Abstract: Simultaneous localization and mapping (SLAM) is a critical technology that enables autonomous robots to be aware of their surrounding environment. With the development of deep learning, SLAM systems can achieve a higher level of perception of the environment, including the semantic and text levels. However, current works are limited in their ability to achieve a natural-language level of perceptio…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 12 pages, 16 figures

  46. arXiv:2303.09638  [pdf, other]

    cs.CV

    Full-Body Cardiovascular Sensing with Remote Photoplethysmography

    Authors: Lu Niu, Jeremy Speth, Nathan Vance, Benjamin Sporrer, Adam Czajka, Patrick Flynn

    Abstract: Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light. Prior applications of rPPG focused on face videos. In this paper we explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands. We collected a new dataset titled Multi-Site Physiological Monitoring (MSPM), w…

    Submitted 16 March, 2023; originally announced March 2023.

  47. Hallucinated Heartbeats: Anomaly-Aware Remote Pulse Estimation

    Authors: Jeremy Speth, Nathan Vance, Benjamin Sporrer, Lu Niu, Patrick Flynn, Adam Czajka

    Abstract: Camera-based physiological monitoring, especially remote photoplethysmography (rPPG), is a promising tool for health diagnostics, and state-of-the-art pulse estimators have shown impressive performance on benchmark datasets. We argue that evaluations of modern solutions may be incomplete, as we uncover failure cases for videos without a live person, or in the presence of severe noise. We demonstra…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted at BIOSIGNALS 2023

  48. arXiv:2303.02389  [pdf, other]

    cs.CV

    Few-Shot Defect Image Generation via Defect-Aware Feature Manipulation

    Authors: Yuxuan Duan, Yan Hong, Li Niu, Liqing Zhang

    Abstract: The performances of defect inspection have been severely hindered by insufficient defect images in industries, which can be alleviated by generating more samples as data augmentation. We propose the first defect image generation method in the challenging few-shot cases. Given just a handful of defect images and relatively more defect-free ones, our goal is to augment the dataset with new defect im…

    Submitted 4 March, 2023; originally announced March 2023.

    Comments: Accepted by AAAI 2023

  49. PullupStructs: Digital Fabrication for Folding Structures via Pull-up Nets

    Authors: Lauren Niu, Xinyi Yang, Martin Nisser, Stefanie Mueller

    Abstract: In this paper, we introduce a method to rapidly create 3D geometries by folding 2D sheets via pull-up nets. Given a 3D structure, we unfold its mesh into a planar 2D sheet using heuristic algorithms and populate these with cutlines and throughholes. We develop a web-based simulation tool that translates users' 3D meshes into manufacturable 2D sheets. After laser-cutting the sheet and feeding threa…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: In ACM TEI '23: Proceedings of the Seventeenth International Conference on Tangible, Embedded, and Embodied Interaction (ACM TEI '23), February 26-March 1, 2023, Warsaw, Poland. ACM, New York, NY, USA, 10 pages

  50. arXiv:2212.08846  [pdf, other]

    cs.CV

    Painterly Image Harmonization in Dual Domains

    Authors: Junyan Cao, Yan Hong, Li Niu

    Abstract: Image harmonization aims to produce visually harmonious composite images by adjusting the foreground appearance to be compatible with the background. When the composite image has photographic foreground and painterly background, the task is called painterly image harmonization. There are only few works on this task, which are either time-consuming or weak in generating well-harmonized results. In…

    Submitted 4 July, 2023; v1 submitted 17 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023