Showing 101–150 of 623 results for author: Qi, X

  1. arXiv:2311.01714  [pdf, other]

    cs.CV

    EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation

    Authors: Zhengzhe Liu, Jingyu Hu, Ka-Hei Hui, Xiaojuan Qi, Daniel Cohen-Or, Chi-Wing Fu

    Abstract: This paper presents a new text-guided technique for generating 3D shapes. The technique leverages a hybrid 3D shape representation, namely EXIM, combining the strengths of explicit and implicit representations. Specifically, the explicit stage controls the topology of the generated 3D shapes and enables local modifications, whereas the implicit stage refines the shape and paints it with plausible…

    Submitted 30 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: SIGGRAPH Asia 2023 & TOG Project page: https://liuzhengzhe.github.io/EXIM.github.io/

  2. arXiv:2310.19415  [pdf, other]

    cs.CV cs.AI cs.GR

    Text-to-3D with Classifier Score Distillation

    Authors: Xin Yu, Yuan-Chen Guo, Yangguang Li, Ding Liang, Song-Hai Zhang, Xiaojuan Qi

    Abstract: Text-to-3D generation has made remarkable progress recently, particularly with methods based on Score Distillation Sampling (SDS) that leverages pre-trained 2D diffusion models. While the usage of classifier-free guidance is well acknowledged to be crucial for successful optimization, it is considered an auxiliary trick rather than the most essential component. In this paper, we re-evaluate the ro…

    Submitted 31 October, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Our project page is https://xinyu-andy.github.io/Classifier-Score-Distillation

  3. arXiv:2310.18725  [pdf, other]

    cs.LG cs.AI

    The Evolution of the Interplay Between Input Distributions and Linear Regions in Networks

    Authors: Xuan Qi, Yi Wei

    Abstract: It is commonly recognized that the expressiveness of deep neural networks is contingent upon a range of factors, encompassing their depth, width, and other relevant considerations. Currently, the practical performance of the majority of deep neural networks remains uncertain. For ReLU (Rectified Linear Unit) networks with piecewise linear activations, the number of linear convex regions serves as…

    Submitted 6 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Under review

  4. arXiv:2310.16667  [pdf, other]

    cs.CV

    CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection

    Authors: Chuofan Ma, Yi Jiang, Xin Wen, Zehuan Yuan, Xiaojuan Qi

    Abstract: Deriving reliable region-word alignment from image-text pairs is critical to learn object-level vision-language representations for open-vocabulary object detection. Existing methods typically rely on pre-trained or self-trained vision-language models for alignment, which are prone to limitations in localization accuracy or generalization capabilities. In this paper, we propose CoDet, a novel appr…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  5. arXiv:2310.14664  [pdf, other]

    cs.LG cs.AI cs.CV

    Data Pruning via Moving-one-Sample-out

    Authors: Haoru Tan, Sitong Wu, Fei Du, Yukang Chen, Zhibin Wang, Fan Wang, Xiaojuan Qi

    Abstract: In this paper, we propose a novel data-pruning approach called moving-one-sample-out (MoSo), which aims to identify and remove the least informative samples from the training set. The core insight behind MoSo is to determine the importance of each sample by assessing its impact on the optimal empirical risk. This is achieved by measuring the extent to which the empirical risk changes when a partic…

    Submitted 25 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  6. arXiv:2310.12281  [pdf, other]

    cs.LG cs.AI

    Enhancing the Performance of Automated Grade Prediction in MOOC using Graph Representation Learning

    Authors: Soheila Farokhi, Aswani Yaramala, Jiangtao Huang, Muhammad F. A. Khan, Xiaojun Qi, Hamid Karimi

    Abstract: In recent years, Massive Open Online Courses (MOOCs) have gained significant traction as a rapidly growing phenomenon in online learning. Unlike traditional classrooms, MOOCs offer a unique opportunity to cater to a diverse audience from different backgrounds and geographical locations. Renowned universities and MOOC-specific providers, such as Coursera, offer MOOC courses on various subjects. Aut…

    Submitted 18 October, 2023; originally announced October 2023.

  7. arXiv:2310.10644  [pdf, other]

    cs.CV

    TOSS: High-quality Text-guided Novel View Synthesis from a Single Image

    Authors: Yukai Shi, Jianan Wang, He Cao, Boshi Tang, Xianbiao Qi, Tianyu Yang, Yukun Huang, Shilong Liu, Lei Zhang, Heung-Yeung Shum

    Abstract: In this paper, we present TOSS, which introduces text to the task of novel view synthesis (NVS) from just a single RGB image. While Zero-1-to-3 has demonstrated impressive zero-shot open-set NVS capability, it treats NVS as a pure image-to-image translation problem. This approach suffers from the challengingly under-constrained nature of single-view NVS: the process lacks means of explicit user co…

    Submitted 16 October, 2023; originally announced October 2023.

  8. arXiv:2310.07995  [pdf, other]

    cs.CV cs.AI

    HeightFormer: A Multilevel Interaction and Image-adaptive Classification-regression Network for Monocular Height Estimation with Aerial Images

    Authors: Zhan Chen, Yidan Zhang, Xiyu Qi, Yongqiang Mao, Xin Zhou, Lulu Niu, Hui Wu, Lei Wang, Yunping Ge

    Abstract: Height estimation has long been a pivotal topic within measurement and remote sensing disciplines, proving critical for endeavours such as 3D urban modelling, MR and autonomous driving. Traditional methods utilise stereo matching or multisensor fusion, both well-established techniques that typically necessitate multiple images from varying perspectives and adjunct sensors like SAR, leading to subs…

    Submitted 11 October, 2023; originally announced October 2023.

  9. arXiv:2310.07178  [pdf, other]

    physics.optics

    Engineering of energy band and its impact on light transmission in non-reciprocal Hermitian hourglass lattice

    Authors: Junhao Yang, Yuandan Wang, Yu Lin, Wenjing Zhang, Guoguo Xin, Xinyuan Qi

    Abstract: We study a quasi-one-dimensional non-reciprocal Hermitian hourglass photonic lattice that can accomplish multiple functions. Under the effect of non-reciprocal coupling, this lattice can produce an energy isolation effect, two kinds of flat bands, and energy band inversion. The excitation and propagation of a single energy band and multiple energy bands can be realized; in the flat band condition,…

    Submitted 11 October, 2023; originally announced October 2023.

  10. arXiv:2310.04030  [pdf]

    stat.ME

    Robust inference with GhostKnockoffs in genome-wide association studies

    Authors: Xinran Qi, Michael E. Belloy, Jiaqi Gu, Xiaoxia Liu, Hua Tang, Zihuai He

    Abstract: Genome-wide association studies (GWASs) have been extensively adopted to depict the underlying genetic architecture of complex diseases. Motivated by GWASs' limitations in identifying small effect loci to understand complex traits' polygenicity and fine-mapping putative causal variants from proxy ones, we propose a knockoff-based method which only requires summary statistics from GWASs and demonst…

    Submitted 6 October, 2023; originally announced October 2023.

  11. arXiv:2310.03693  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

    Authors: Xiangyu Qi, Yi Zeng, Tinghao Xie, Pin-Yu Chen, Ruoxi Jia, Prateek Mittal, Peter Henderson

    Abstract: Optimizing large language models (LLMs) for downstream use cases often involves the customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama models and OpenAI's APIs for fine-tuning GPT-3.5 Turbo on custom datasets also encourage this practice. But, what are the safety costs associated with such custom fine-tuning? We note that while existing safety alignment inf…

    Submitted 5 October, 2023; originally announced October 2023.

  12. arXiv:2309.16987  [pdf, other]

    cs.CV

    SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features

    Authors: Song Wang, Zhu Wang, Can Li, Xiaojuan Qi, Hayden Kwok-Hay So

    Abstract: In comparison to conventional RGB cameras, the superior temporal resolution of event cameras allows them to capture rich information between frames, making them prime candidates for object tracking. Yet in practice, despite their theoretical advantages, the body of work on event-based multi-object tracking (MOT) remains in its infancy, especially in real-world settings where events from complex ba…

    Submitted 29 September, 2023; originally announced September 2023.

  13. arXiv:2309.07109  [pdf, ps, other]

    hep-ex astro-ph.HE hep-ph

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

    Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neu…

    Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 24 pages, 9 figures, accepted for publication in JCAP

  14. arXiv:2309.06801  [pdf, ps, other]

    cs.CC cs.AI cs.SI

    Defensive Alliances in Signed Networks

    Authors: Emmanuel Arrighi, Zhidan Feng, Henning Fernau, Kevin Mann, Xingqin Qi, Petra Wolf

    Abstract: The analysis of (social) networks and multi-agent systems is a central theme in Artificial Intelligence. One line of research deals with finding groups of agents that could work together to achieve a certain goal. To this end, different notions of so-called clusters or communities have been introduced in the literature of graphs and networks. Among these, defensive alliance is a kind of quantitat…

    Submitted 23 January, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

  15. arXiv:2309.04814  [pdf, other]

    cs.CV

    Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

    Authors: Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi

    Abstract: Synthesizing realistic videos according to a given speech is still an open challenge. Previous works have been plagued by issues such as inaccurate lip shape generation and poor image quality. The key reason is that only motions and appearances on limited facial areas (e.g., lip area) are mainly driven by the input speech. Therefore, directly learning a mapping function from speech to the entire h…

    Submitted 9 September, 2023; originally announced September 2023.

  16. arXiv:2309.02411  [pdf, other]

    cs.LG

    Delta-LoRA: Fine-Tuning High-Rank Parameters with the Delta of Low-Rank Matrices

    Authors: Bojia Zi, Xianbiao Qi, Lingzhi Wang, Jianan Wang, Kam-Fai Wong, Lei Zhang

    Abstract: In this paper, we present Delta-LoRA, which is a novel parameter-efficient approach to fine-tune large language models (LLMs). In contrast to LoRA and other low-rank adaptation methods such as AdaLoRA, Delta-LoRA not only updates the low-rank matrices $\mathbf{A}$ and $\mathbf{B}$, but also propagates the learning to the pre-trained weights $\mathbf{W}$ via updates utilizing the delta of the product of two low-rank mat…

    Submitted 5 September, 2023; originally announced September 2023.

  17. arXiv:2309.00223  [pdf, other]

    eess.AS cs.CL cs.SD

    The FruitShell French synthesis system at the Blizzard 2023 Challenge

    Authors: Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi

    Abstract: This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phoneme…

    Submitted 20 August, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

  18. arXiv:2308.12439  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection

    Authors: Tinghao Xie, Xiangyu Qi, Ping He, Yiming Li, Jiachen T. Wang, Prateek Mittal

    Abstract: We present a novel defense against backdoor attacks on Deep Neural Networks (DNNs), wherein adversaries covertly implant malicious behaviors (backdoors) into DNNs. Our defense falls within the category of post-development defenses that operate independently of how the model was generated. The proposed defense is built upon a novel reverse engineering approach that can directly extract backdoor fu…

    Submitted 5 October, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

  19. arXiv:2308.10490  [pdf, other]

    cs.CV cs.AI cs.GR

    Texture Generation on 3D Meshes with Point-UV Diffusion

    Authors: Xin Yu, Peng Dai, Wenbo Li, Lan Ma, Zhengzhe Liu, Xiaojuan Qi

    Abstract: In this work, we focus on synthesizing high-quality textures on 3D meshes. We present Point-UV diffusion, a coarse-to-fine pipeline that marries the denoising diffusion model with UV mapping to generate 3D consistent and high-quality texture images in UV space. We start with introducing a point diffusion model to synthesize low-frequency texture components with our tailored style guidance to tackl…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023, Oral

  20. arXiv:2308.02933  [pdf, other]

    cs.HC cs.SI

    InnovationInsights: A Visual Analytics Approach for Understanding the Dual Frontiers of Science and Technology

    Authors: Yifang Wang, Yifan Qian, Xiaoyu Qi, Nan Cao, Dashun Wang

    Abstract: Science has long been viewed as a key driver of economic growth and rising standards of living. Knowledge about how scientific advances support marketplace inventions is therefore essential for understanding the role of science in propelling real-world applications and technological progress. The increasing availability of large-scale datasets tracing scientific publications and patented invention…

    Submitted 8 August, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

  21. arXiv:2308.00353  [pdf, other]

    cs.CV

    Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding

    Authors: Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi

    Abstract: Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset. This task is challenging because the model needs to both localize novel 3D objects and infer their semantic categories. A key factor for the recent progress in 2D open-world perception is the availability of large-scale image-text pairs from the Interne…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Submitted to TPAMI

  22. arXiv:2307.16620  [pdf, other]

    cs.SD cs.CV eess.AS

    Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics

    Authors: Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu

    Abstract: The audio-visual segmentation (AVS) task aims to segment sounding objects from a given video. Existing works mainly focus on fusing audio and visual features of a given video to achieve sounding object masks. However, we observed that prior art is prone to segmenting a certain salient object in a video regardless of the audio information. This is because sounding objects are often the most salient…

    Submitted 31 July, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: This paper has been received by ACM MM 23

  23. arXiv:2307.16118  [pdf, other]

    cs.RO

    MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections

    Authors: Jiaqi Liu, Peng Hang, Xiao Qi, Jianqiang Wang, Jian Sun

    Abstract: Autonomous driving technology is poised to transform transportation systems. However, achieving safe and accurate multi-task decision-making in complex scenarios, such as unsignalized intersections, remains a challenge for autonomous vehicles. This paper presents a novel approach to this issue with the development of a Multi-Task Decision-Making Generative Pre-trained Transformer (MTD-GPT) model.…

    Submitted 29 July, 2023; originally announced July 2023.

    Comments: Accepted by ITSC 2023

  24. arXiv:2307.15950  [pdf, other]

    cs.RO

    Teaching Autonomous Vehicles to Express Interaction Intent during Unprotected Left Turns: A Human-Driving-Prior-Based Trajectory Planning Approach

    Authors: Jiaqi Liu, Xiao Qi, Ying Ni, Jian Sun, Peng Hang

    Abstract: Incorporating Autonomous Vehicles (AVs) into existing transportation systems necessitates examining their coexistence with Human-driven Vehicles (HVs) in mixed traffic environments. Central to this coexistence is the AVs' ability to emulate human-like interaction intentions within traffic scenarios. We introduce a novel framework for planning unprotected left-turn trajectories for AVs, designed to…

    Submitted 25 November, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

  25. arXiv:2307.15689  [pdf, other]

    quant-ph cond-mat.stat-mech hep-th

    Engineering entanglement geometry via spacetime-modulated measurements

    Authors: Aditya Cowsik, Matteo Ippoliti, Xiao-Liang Qi

    Abstract: We introduce a general approach to realize quantum states with holographic entanglement structure via monitored dynamics. Starting from random unitary circuits in $1+1$ dimensions, we introduce measurements with a spatiotemporally-modulated density. Exploiting the known critical properties of the measurement-induced entanglement transition, this allows us to engineer arbitrary geometries for the b…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures

  26. arXiv:2307.15061  [pdf, other]

    cs.CV cs.RO

    The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation

    Authors: Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Ding Zhao, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, et al. (18 additional authors not shown)

    Abstract: Accurate depth estimation under out-of-distribution (OoD) scenarios, such as adverse weather conditions, sensor failure, and noise contamination, is desirable for safety-critical applications. Existing depth estimation systems, however, suffer inevitably from real-world corruptions and perturbations and struggle to provide reliable depth predictions under such cases. In this paper, we summari…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Technical Report; 65 pages, 34 figures, 24 tables; Code at https://github.com/ldkong1205/RoboDepth

  27. arXiv:2307.09316  [pdf, other]

    cs.CV

    MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds

    Authors: Jiahui Liu, Chirui Chang, Jianhui Liu, Xiaoyang Wu, Lan Ma, Xiaojuan Qi

    Abstract: 3D semantic segmentation on multi-scan large-scale point clouds plays an important role in autonomous systems. Unlike the single-scan-based semantic segmentation task, this task requires distinguishing the motion states of points in addition to their semantic categories. However, methods designed for single-scan-based segmentation tasks perform poorly on the multi-scan task due to the lack of a…

    Submitted 18 July, 2023; originally announced July 2023.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023

  28. arXiv:2307.08388  [pdf, other]

    cs.CV eess.IV

    Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation

    Authors: Yaolei Qi, Yuting He, Xiaoming Qi, Yuan Zhang, Guanyu Yang

    Abstract: Accurate segmentation of topological tubular structures, such as blood vessels and roads, is crucial in various fields, ensuring accuracy and efficiency in downstream tasks. However, many factors complicate the task, including thin local structures and variable global morphologies. In this work, we note the specificity of tubular structures and use this knowledge to guide our DSCNet to simultaneou…

    Submitted 18 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  29. arXiv:2307.00771  [pdf, other]

    cs.ET

    Resistive memory-based zero-shot liquid state machine for multimodal event data learning

    Authors: Ning Lin, Shaocong Wang, Yi Li, Bo Wang, Shuhui Shi, Yangu He, Woyu Zhang, Yifei Yu, Yue Zhang, Xiaojuan Qi, Xiaoming Chen, Hao Jiang, Xumeng Zhang, Peng Lin, Xiaoxin Xu, Qi Liu, Zhongrui Wang, Dashan Shang, Ming Liu

    Abstract: The human brain is a complex spiking neural network (SNN) that learns multimodal signals in a zero-shot manner by generalizing existing knowledge. Remarkably, the brain achieves this with minimal power consumption, using event-based signals that propagate within its structure. However, mimicking the human brain in neuromorphic hardware presents both hardware and software challenges. Hardware limit…

    Submitted 3 July, 2023; originally announced July 2023.

  30. arXiv:2306.16687  [pdf, other]

    hep-th

    Bulk Reconstruction from Generalized Free Fields

    Authors: Tamra M. Nebabu, Xiaoliang Qi

    Abstract: We propose a generalized protocol for constructing a dual free bulk theory from any boundary model of generalized free fields (GFFs). To construct the bulk operators, we employ a linear ansatz similar to the Hamilton-Kabat-Lifschytz and Lowe (HKLL) construction. However, unlike the HKLL construction, our protocol relies only on boundary data with no presupposed form for the bulk equations of moti…

    Submitted 29 June, 2023; originally announced June 2023.

  31. arXiv:2306.16064  [pdf, other]

    cs.LG cs.AI

    Federated Generative Learning with Foundation Models

    Authors: Jie Zhang, Xiaohua Qi, Bo Zhao

    Abstract: Existing approaches in Federated Learning (FL) mainly focus on sending model parameters or gradients from clients to a server. However, these methods are plagued by significant inefficiency, privacy, and security concerns. Thanks to the emerging foundation generative models, we propose a novel federated learning framework, namely Federated Generative Learning. In this framework, each client can cr…

    Submitted 31 May, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  32. arXiv:2306.13213  [pdf, other]

    cs.CR cs.CL cs.LG

    Visual Adversarial Examples Jailbreak Aligned Large Language Models

    Authors: Xiangyu Qi, Kaixuan Huang, Ashwinee Panda, Peter Henderson, Mengdi Wang, Prateek Mittal

    Abstract: Recently, there has been a surge of interest in integrating vision into Large Language Models (LLMs), exemplified by Visual Language Models (VLMs) such as Flamingo and GPT-4. This paper sheds light on the security and safety implications of this trend. First, we underscore that the continuous and high-dimensional nature of the visual input makes it a weak link against adversarial attacks, represen…

    Submitted 16 August, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  33. arXiv:2306.12422  [pdf, other]

    cs.CV cs.GR cs.LG

    DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

    Authors: Yukun Huang, Jianan Wang, Yukai Shi, Boshi Tang, Xianbiao Qi, Lei Zhang

    Abstract: Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled 3D content creation by optimizing a randomly initialized differentiable 3D representation with score distillation. However, the optimization process suffers slow convergence and the resultant 3D models often exhibit two limitations: (a) quality concerns such as missing attributes and distorted shape an…

    Submitted 6 May, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: ICLR 2024

  34. arXiv:2306.09338  [pdf, other]

    cs.LG cs.CV math.OC stat.ML

    Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant

    Authors: Xianbiao Qi, Jianan Wang, Lei Zhang

    Abstract: This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively. We analyze these two challenges through several strategic measures, including the improvement of gradient flow and the impos…

    Submitted 12 November, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: International Digital Economy Academy (IDEA)

  35. arXiv:2306.09076  [pdf, other]

    cs.HC

    Awayvirus: A Playful and Tangible Approach to Improve Children's Hygiene Habits in Family Education

    Authors: Xiang Qi, Yaxiong Lei, Shijing He, Shuxin Cheng

    Abstract: Although various playful and educational tools have been developed to support children's learning abilities, limited work focuses on tangible toys designed to improve and maintain children's hygiene perception, habits and awareness, as well as fostering their collaboration and social abilities in home education contexts. We developed Awayvirus to address this research and design gap, aimin…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 10 pages, 3 figures, Paper accepted by INTERACT23 short paper track

    ACM Class: H.5

  36. arXiv:2306.07760  [pdf, other]

    cs.HC

    Urania: Visualizing Data Analysis Pipelines for Natural Language-Based Data Exploration

    Authors: Yi Guo, Nan Cao, Xiaoyu Qi, Haoyang Li, Danqing Shi, Jing Zhang, Qing Chen, Daniel Weiskopf

    Abstract: Exploratory Data Analysis (EDA) is an essential yet tedious process for examining a new dataset. To facilitate it, natural language interfaces (NLIs) can help people intuitively explore the dataset via data-oriented questions. However, existing NLIs primarily focus on providing accurate answers to questions, with few offering explanations or presentations of the data analysis pipeline used to unco…

    Submitted 13 June, 2023; originally announced June 2023.

  37. arXiv:2306.07505  [pdf]

    q-bio.TO eess.IV

    Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease

    Authors: Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, et al. (22 additional authors not shown)

    Abstract: Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with…

    Submitted 12 June, 2023; originally announced June 2023.

  38. arXiv:2306.07265  [pdf, other]

    cs.CV

    detrex: Benchmarking Detection Transformers

    Authors: Tianhe Ren, Shilong Liu, Feng Li, Hao Zhang, Ailing Zeng, Jie Yang, Xingyu Liao, Ding Jia, Hongyang Li, He Cao, Jianan Wang, Zhaoyang Zeng, Xianbiao Qi, Yuhui Yuan, Jianwei Yang, Lei Zhang

    Abstract: The DEtection TRansformer (DETR) algorithm has received considerable attention in the research community and is gradually emerging as a mainstream approach for object detection and other perception tasks. However, the current field lacks a unified and comprehensive benchmark specifically tailored for DETR-based models. To address this issue, we develop a unified, highly modular, and lightweight co…

    Submitted 13 June, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: project link: https://github.com/IDEA-Research/detrex

  39. A generalized approach to photon avalanche upconversion in luminescent nanocrystals

    Authors: Artiom Skripka, Minji Lee, Xiao Qi, Jia-Ahn Pan, Haoran Yang, Changhwan Lee, P. James Schuck, Bruce E. Cohen, Daniel Jaque, Emory M. Chan

    Abstract: Photon avalanching nanoparticles (ANPs) exhibit extremely nonlinear upconverted emission valuable for sub-diffraction imaging, nanoscale sensing, and optical computing. Avalanching has been demonstrated with Tm3+, Nd3+ or Pr3+-doped nanocrystals, but their emission is limited to 600 and 800 nm, restricting applications. Here, we utilize Gd3+-assisted energy migration to tune the emission wavelengt…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 13 pages, 5 figures

  40. arXiv:2305.18891  [pdf, other]

    cs.CV cs.AI cs.MM

    EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation

    Authors: Xingqun Qi, Chen Liu, Lincheng Li, Jie Hou, Haoran Xin, Xin Yu

    Abstract: Generating vivid and diverse 3D co-speech gestures is crucial for various applications in animating virtual avatars. While most existing methods can generate gestures from audio directly, they usually overlook that emotion is one of the key factors of authentic co-speech gesture generation. In this work, we propose EmotionGesture, a novel framework for synthesizing vivid and diverse emotional co-s…

    Submitted 3 January, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Under review

  41. arXiv:2305.14691  [pdf, other]

    cs.CV

    Label-Efficient Learning in Agriculture: A Comprehensive Review

    Authors: Jiajia Li, Dong Chen, Xinda Qi, Zhaojian Li, Yanbo Huang, Daniel Morris, Xiaobo Tan

    Abstract: The past decade has witnessed many great successes of machine learning (ML) and deep learning (DL) applications in agricultural systems, including weed control, plant disease diagnosis, agricultural robotics, and precision livestock management. Despite tremendous progress, one downside of such ML/DL models is that they generally rely on large-scale labeled datasets for training, and the performa…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 34 pages, 23 figures

  42. arXiv:2305.13948  [pdf, other]

    cs.CV cs.LG

    Decoupled Kullback-Leibler Divergence Loss

    Authors: Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang

    Abstract: In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and observe that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss that consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels. From our analysis of the DKL loss, we have identified two areas for improvement. Firstly, we address the limitation of…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: under review

  43. arXiv:2305.13869  [pdf, other]

    physics.acc-ph cs.AI cs.LG eess.SY

    Trend-Based SAC Beam Control Method with Zero-Shot in Superconducting Linear Accelerator

    Authors: Xiaolong Chen, Xin Qi, Chunguang Su, Yuan He, Zhijun Wang, Kunxiang Sun, Chao Jin, Weilong Chen, Shuhui Liu, Xiaoying Zhao, Duanyang Jia, Man Yi

    Abstract: The superconducting linear accelerator is a highly flexible facility for modern scientific discoveries, necessitating weekly reconfiguration and tuning. Accordingly, minimizing setup time proves essential in affording users ample experimental time. We propose a trend-based soft actor-critic (TBSAC) beam control method with strong robustness, allowing the agents to be trained in a simulated en…

    Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  44. arXiv:2305.12529  [pdf, other]

    cs.CV cs.LG

    DreamWaltz: Make a Scene with Complex 3D Animatable Avatars

    Authors: Yukun Huang, Jianan Wang, Ailing Zeng, He Cao, Xianbiao Qi, Yukai Shi, Zheng-Jun Zha, Lei Zhang

    Abstract: We present DreamWaltz, a novel framework for generating and animating complex 3D avatars given text guidance and a parametric human body prior. While recent methods have shown encouraging results for text-to-3D generation of common objects, creating high-quality and animatable 3D avatars remains challenging. To create high-quality 3D avatars, DreamWaltz proposes 3D-consistent occlusion-aware Score D…

    Submitted 5 November, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: To appear in NeurIPS 2023; Project page: https://dreamwaltz3d.github.io/

  45. arXiv:2305.04454  [pdf, other]

    physics.optics

    Frequency combs induced by optical feedback and harmonic order tunability in quantum cascade lasers

    Authors: Carlo Silvestri, Xiaoqiong Qi, Thomas Taimre, Aleksandar D. Rakić

    Abstract: This study investigates the interaction between frequency combs and optical feedback effects in Quantum Cascade Lasers (QCLs). The theoretical analysis reveals new phenomena arising from the interplay between comb generation and feedback. By considering the bias current corresponding to free-running single mode emission, the introduction of optical feedback can trigger the generation of frequency…

    Submitted 16 October, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

  46. arXiv:2304.14503  [pdf]

    cs.CV eess.IV physics.ins-det physics.optics

    UHRNet: A Deep Learning-Based Method for Accurate 3D Reconstruction from a Single Fringe-Pattern

    Authors: Yixiao Wang, Canlin Zhou, Xingyang Qi, Hui Li

    Abstract: The quick and accurate retrieval of an object's height from a single fringe pattern in Fringe Projection Profilometry has been a topic of ongoing research. While a single-shot, fringe-to-depth, CNN-based method can restore a height map directly from a single pattern, its accuracy is currently inferior to the traditional phase-shifting technique. To improve this method's accuracy, we propose using a U sh…

    Submitted 23 April, 2023; originally announced April 2023.

  47. arXiv:2304.12988  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-Scale Feature Fusion using Parallel-Attention Block for COVID-19 Chest X-ray Diagnosis

    Authors: Xiao Qi, David J. Foran, John L. Nosher, Ilker Hacihaliloglu

    Abstract: Under the global COVID-19 crisis, accurate diagnosis of COVID-19 from Chest X-ray (CXR) images is critical. To reduce intra- and inter-observer variability during the radiological assessment, computer-aided diagnostic tools have been utilized to supplement medical decision-making and subsequent disease management. Computational methods with high accuracy and robustness are required for rapid tria…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:008

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2023)

  48. arXiv:2304.12652  [pdf, other]

    cs.CV

    Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur

    Authors: Peng Dai, Yinda Zhang, Xin Yu, Xiaoyang Lyu, Xiaojuan Qi

    Abstract: Rendering novel view images is highly desirable for many applications. Despite recent progress, it remains challenging to render high-fidelity and view-consistent novel views of large-scale scenes from in-the-wild images with inevitable artifacts (e.g., motion blur). To this end, we develop a hybrid neural rendering model that makes image-based representation and neural 3D representation join forc…

    Submitted 9 July, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

  49. arXiv:2304.11537  [pdf, ps, other]

    math.CO

    Bounds for eccentricity-based parameters of graphs

    Authors: Yunfang Tang, Xuli Qi, Douglas B. West

    Abstract: The \emph{eccentricity} of a vertex $u$ in a graph $G$, denoted by $e_G(u)$, is the maximum distance from $u$ to other vertices in $G$. We study extremal problems for the average eccentricity and the first and second Zagreb eccentricity indices, denoted by $σ_0(G)$, $σ_1(G)$, and $σ_2(G)$, respectively. These are defined by $σ_0(G)=\frac{1}{|V(G)|}\sum_{u\in V(G)}e_G(u)$,…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: 27 pages

  50. arXiv:2304.09856  [pdf, other]

    cs.CV cs.AI cs.LG

    LipsFormer: Introducing Lipschitz Continuity to Vision Transformers

    Authors: Xianbiao Qi, Jianan Wang, Yihao Chen, Yukai Shi, Lei Zhang

    Abstract: We present a Lipschitz continuous Transformer, called LipsFormer, to pursue training stability both theoretically and empirically for Transformer-based models. In contrast to previous practical tricks that address training instability by learning rate warmup, layer normalization, attention formulation, and weight initialization, we show that Lipschitz continuity is a more essential property to ens…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: To appear in ICLR 2023; our code will be public at https://github.com/IDEA-Research/LipsFormer