
Showing 1–50 of 582 results for author: Xiong, H

  1. arXiv:2408.17424  [pdf, other]

    cs.CV cs.HC

    CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion

    Authors: Yiran Chen, Anyi Rao, Xuekun Jiang, Shishi Xiao, Ruiqing Ma, Zeyu Wang, Hui Xiong, Bo Dai

    Abstract: With advancements in video generative AI models (e.g., SORA), creators are increasingly using these techniques to enhance video previsualization. However, they face challenges with incomplete and mismatched AI workflows. Existing methods mainly rely on text descriptions and struggle with camera placement, a key component of previsualization. To address these issues, we introduce CinePreGen, a visu…

    Submitted 30 August, 2024; originally announced August 2024.

  2. arXiv:2408.15861  [pdf, other]

    cs.CR cs.LG

    Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation

    Authors: Weilin Lin, Li Liu, Jianze Li, Hui Xiong

    Abstract: Backdoor attacks present a serious security threat to deep neural networks (DNNs). Although numerous effective defense techniques have been proposed in recent years, they inevitably rely on the availability of either clean or poisoned data. In contrast, data-free defense techniques have evolved slowly and still lag significantly in performance. To address this issue, different from the traditional…

    Submitted 28 August, 2024; originally announced August 2024.

  3. arXiv:2408.10723  [pdf, ps, other]

    physics.optics

    Optical refractive index and spacetime geometry

    Authors: Hui Xiong, Mengcheng Zhu

    Abstract: Classical mechanics and geometrical optics are deeply connected with each other. In this work, we generalize the analogy between these two disciplines to relativistic conditions. Using this analogy, we are able to make light follow the orbits of massive or massless particles in gravitational field, by designing a particular optical medium with a prescribed refractive index profile according to spa…

    Submitted 21 August, 2024; v1 submitted 20 August, 2024; originally announced August 2024.

    Comments: 30 pages, 8 figures

  4. arXiv:2408.09698  [pdf, other]

    cs.IR cs.AI

    Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

    Authors: Yuyang Ye, Zhi Zheng, Yishan Shen, Tianshu Wang, Hengruo Zhang, Peijun Zhu, Runlong Yu, Kai Zhang, Hui Xiong

    Abstract: Recent advances in Large Language Models (LLMs) have demonstrated significant potential in the field of Recommendation Systems (RSs). Most existing studies have focused on converting user behavior logs into textual prompts and leveraging techniques such as prompt tuning to enable LLMs for recommendation tasks. Meanwhile, research interest has recently grown in multimodal recommendation systems tha…

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  5. arXiv:2408.09098  [pdf, ps, other]

    math.SP math.AP math.CV

    Boundary spectral estimates for semiclassical Gevrey operators

    Authors: Haoren Xiong

    Abstract: We obtain the spectral and resolvent estimates for semiclassical pseudodifferential operators with symbol of Gevrey-$s$ regularity, near the boundary of the range of the principal symbol. We prove that the boundary spectrum free region is of size ${\mathcal O}(h^{1-\frac{1}{s}})$ where the resolvent is at most fractional exponentially large in $h$, as the semiclassical parameter $h\to 0^+$. This i…

    Submitted 17 August, 2024; originally announced August 2024.

  6. arXiv:2408.08328  [pdf, other]

    cs.AI cs.LG stat.AP

    Unleash The Power of Pre-Trained Language Models for Irregularly Sampled Time Series

    Authors: Weijia Zhang, Chenlong Yin, Hao Liu, Hui Xiong

    Abstract: Pre-trained Language Models (PLMs), such as ChatGPT, have significantly advanced the field of natural language processing. This progress has inspired a series of innovative studies that explore the adaptation of PLMs to time series analysis, intending to create a unified foundation model that addresses various time series analytical tasks. However, these efforts predominantly focus on Regularly Sa…

    Submitted 12 August, 2024; originally announced August 2024.

  7. arXiv:2408.07009  [pdf, other]

    cs.CV

    Imagen 3

    Authors: Imagen-Team-Google: Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Kelvin Chan, Yichang Chen, Sander Dieleman, Yuqing Du, Zach Eaton-Rosen, Hongliang Fei, Nando de Freitas, Yilin Gao, Evgeny Gladchenko, Sergio Gómez Colmenarejo, Mandy Guo, Alex Haig, Will Hawkins, Hexiang Hu, Huilian Huang, Tobenna Peter Igwe, Christos Kaplanis, Siavash Khodadadeh, et al. (227 additional authors not shown)

    Abstract: We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.

    Submitted 13 August, 2024; originally announced August 2024.

  8. arXiv:2408.04974  [pdf, other]

    cs.CR cs.CV

    XNN: Paradigm Shift in Mitigating Identity Leakage within Cloud-Enabled Deep Learning

    Authors: Kaixin Liu, Huixin Xiong, Bingyu Duan, Zexuan Cheng, Xinyu Zhou, Wanqian Zhang, Xiangyu Zhang

    Abstract: In the domain of cloud-based deep learning, the imperative for external computational resources coexists with acute privacy concerns, particularly identity leakage. To address this challenge, we introduce XNN and XNN-d, pioneering methodologies that infuse neural network features with randomized perturbations, striking a harmonious balance between utility and privacy. XNN, designed for the trainin…

    Submitted 9 August, 2024; originally announced August 2024.

  9. arXiv:2408.03042  [pdf]

    physics.flu-dyn

    A unified transition mechanism from shock to detonation waves

    Authors: Hao Yan, Haochen Xiong, Xin Han, Chongguang Shi, Yancheng You

    Abstract: The shock-to-detonation transition is of great significance for the investigation of supernova formation, disaster prevention and supersonic propulsion technology. In this paper, the influence equation of shock-to-detonation transition is summarized for the oblique detonation problem from aerodynamic analysis. The equation integrates the effects of parameters such as chemical reaction, shock in…

    Submitted 6 August, 2024; originally announced August 2024.

  10. arXiv:2408.00083  [pdf, other]

    cs.CV

    Localized Gaussian Splatting Editing with Contextual Awareness

    Authors: Hanyuan Xiao, Yingshu Chen, Huajian Huang, Haolin Xiong, Jing Yang, Pratusha Prasad, Yajie Zhao

    Abstract: Recent text-guided generation of individual 3D objects has achieved great success using diffusion priors. However, these methods are not suitable for object insertion and replacement tasks as they do not consider the background, leading to illumination mismatches within the environment. To bridge the gap, we introduce an illumination-aware 3D scene editing pipeline for 3D Gaussian Splatting (3DGS)…

    Submitted 31 July, 2024; originally announced August 2024.

  11. arXiv:2407.21040  [pdf, other]

    cs.AI cs.CL cs.DB cs.SE

    Towards Automated Data Sciences with Natural Language and SageCopilot: Practices and Lessons Learned

    Authors: Yuan Liao, Jiang Bian, Yuhui Yun, Shuo Wang, Yubo Zhang, Jiaming Chu, Tao Wang, Kewei Li, Yuchen Li, Xuhong Li, Shilei Ji, Haoyi Xiong

    Abstract: While the field of NL2SQL has made significant advancements in translating natural language instructions into executable SQL scripts for data querying and processing, achieving full automation within the broader data science pipeline - encompassing data querying, analysis, visualization, and reporting - remains a complex challenge. This study introduces SageCopilot, an advanced, industry-grade sys…

    Submitted 21 July, 2024; originally announced July 2024.

  12. arXiv:2407.14041  [pdf, other]

    cs.CV

    Not All Noises Are Created Equally: Diffusion Noise Selection and Optimization

    Authors: Zipeng Qi, Lichen Bai, Haoyi Xiong, Zeke Xie

    Abstract: Diffusion models that can generate high-quality data from randomly sampled Gaussian noises have become the mainstream generative method in both academia and industry. Are randomly sampled Gaussian noises equally good for diffusion models? While a large body of works tried to understand and improve diffusion models, previous works overlooked the possibility to select or optimize the sampled noise t…

    Submitted 27 July, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

  13. arXiv:2407.13541  [pdf, other]

    cs.CV

    On the Discriminability of Self-Supervised Representation Learning

    Authors: Zeen Song, Wenwen Qiang, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: Self-supervised learning (SSL) has recently achieved significant success in downstream visual tasks. However, a notable gap still exists between SSL and supervised learning (SL), especially in complex downstream tasks. In this paper, we show that the features learned by SSL methods suffer from the crowding problem, where features of different classes are not distinctly separated, and features with…

    Submitted 18 July, 2024; originally announced July 2024.

  14. arXiv:2407.13145  [pdf, ps, other]

    nlin.CD physics.optics

    Ultra-low threshold chaos in cavity magnomechanics

    Authors: Jiao Peng, Zeng-Xing Liu, Ya-Fei Yu, Hao Xiong

    Abstract: Cavity magnomechanics using mechanical degrees of freedom in ferromagnetic crystals provides a powerful platform for observing many interesting classical and quantum nonlinear phenomena in the emerging field of magnon spintronics. However, to date, the generation and control of chaotic motion in a cavity magnomechanical system remain an outstanding challenge due to the inherently weak nonlinear in…

    Submitted 18 July, 2024; originally announced July 2024.

  15. arXiv:2407.12519  [pdf, other]

    cs.CV

    Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition

    Authors: Haijun Xiong, Bin Feng, Xinggang Wang, Wenyu Liu

    Abstract: Gait recognition is a biometric technology that distinguishes individuals by their walking patterns. However, previous methods face challenges when accurately extracting identity features because they often become entangled with non-identity clues. To address this challenge, we propose CLTD, a causality-inspired discriminative feature learning module designed to effectively eliminate the influence…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  16. A cryogenic on-chip microwave pulse generator for large-scale superconducting quantum computing

    Authors: Zenghui Bao, Yan Li, Zhiling Wang, Jiahui Wang, Jize Yang, Haonan Xiong, Yipu Song, Yukai Wu, Hongyi Zhang, Luming Duan

    Abstract: For superconducting quantum processors, microwave signals are delivered to each qubit from room-temperature electronics to the cryogenic environment through coaxial cables. Limited by the heat load of cabling and the massive cost of electronics, such an architecture is not viable for millions of qubits required for fault-tolerant quantum computing. Monolithic integration of the control electronics…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 12 pages, 4 figures

    Journal ref: Nat Commun 15, 5958 (2024)

  17. arXiv:2407.09853  [pdf, other]

    cs.CV

    Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation

    Authors: Han Li, Shaohui Li, Shuangrui Ding, Wenrui Dai, Maida Cao, Chenglin Li, Junni Zou, Hongkai Xiong

    Abstract: Image compression for machine and human vision (ICMH) has gained increasing attention in recent years. Existing ICMH methods are limited by high training and storage overheads due to heavy design of task-specific networks. To address this issue, in this paper, we develop a novel lightweight adapter-based tuning framework for ICMH, named Adapt-ICMH, that better balances task performance and bitrate…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024, project: https://github.com/qingshi9974/ECCV2024-AdpatICMH

  18. arXiv:2407.08997  [pdf, ps, other]

    math.AP gr-qc

    Asymptotic expansions for semilinear waves on asymptotically flat spacetimes

    Authors: Shi-Zhuo Looi, Haoren Xiong

    Abstract: We establish precise asymptotic expansions for solutions to semilinear wave equations with power-type nonlinearities on asymptotically flat spacetimes. For cubic nonlinearities $a(t,x)φ^3$, we prove $φ(t, x) = 2c t^{-2} + O(t^{-3+})$ in compact spatial regions, with $c$ computable. For $a(t,x)φ^p$ with $p \geq 4$, we show $φ(t, x) = d t^{-3} + O(t^{-4+})$, extending Price's law to the nonlinear se…

    Submitted 29 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

    Comments: 48 pages

  19. PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization

    Authors: Yuyang Ye, Lu-An Tang, Haoyu Wang, Runlong Yu, Wenchao Yu, Erhu He, Haifeng Chen, Hui Xiong

    Abstract: Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in Industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods offer promising enhancements for sequential optimization processes and can be used for reducing carbon emission…

    Submitted 11 July, 2024; originally announced July 2024.

  20. arXiv:2407.08516  [pdf, other]

    cs.AI

    Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents

    Authors: Haoyi Xiong, Zhiyuan Wang, Xuhong Li, Jiang Bian, Zeke Xie, Shahid Mumtaz, Laura E. Barnes

    Abstract: This article explores the convergence of connectionist and symbolic artificial intelligence (AI), from historical debates to contemporary advancements. Traditionally considered distinct paradigms, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs), exemplified by ChatGPT and GPT-4, highlig…

    Submitted 6 August, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  21. arXiv:2407.01085  [pdf, other]

    cs.LG cs.CL

    Rethinking LLM-based Preference Evaluation

    Authors: Zhengyu Hu, Linxin Song, Jieyu Zhang, Zheyuan Xiao, Jingang Wang, Zhenyu Chen, Hui Xiong

    Abstract: The use of large language model (LLM)-based preference evaluations has become widespread for comparing model responses, but it has revealed a notable bias towards longer responses, questioning the reliability of such evaluations. This paper explores the length bias in LLM evaluations from a data-centric perspective, analyzing 14 commonly used preference datasets and 10 reward models. Our findings…

    Submitted 8 August, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  22. arXiv:2407.00709  [pdf, other]

    stat.AP

    Comparative Effectiveness Research with Average Hazard for Censored Time-to-Event Outcomes: A Numerical Study

    Authors: Hong Xiong, Jean Connors, Deb Schrag, Hajime Uno

    Abstract: The average hazard (AH), recently introduced by Uno and Horiguchi, represents a novel summary metric of event time distributions, conceptualized as the general censoring-free average person-time incidence rate on a given time window, $[0,τ].$ This metric is calculated as the ratio of the cumulative incidence probability at $τ$ to the restricted mean survival time at $τ$ and can be estimated throug…

    Submitted 30 June, 2024; originally announced July 2024.

  23. arXiv:2407.00128  [pdf, other]

    cs.IR cs.AI cs.LG

    When Search Engine Services meet Large Language Models: Visions and Challenges

    Authors: Haoyi Xiong, Jiang Bian, Yuchen Li, Xuhong Li, Mengnan Du, Shuaiqiang Wang, Dawei Yin, Sumi Helal

    Abstract: Combining Large Language Models (LLMs) with search engine services marks a significant shift in the field of services computing, opening up new possibilities to enhance how we search for and retrieve information, understand content, and interact with internet services. This paper conducts an in-depth examination of how integrating LLMs with search engines can mutually benefit both technologies. We…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Under Review

  24. arXiv:2406.18115  [pdf, other]

    cs.RO cs.AI cs.CV

    Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with 3D Semantic Maps

    Authors: Dicong Qiu, Wenzong Ma, Zhenfu Pan, Hui Xiong, Junwei Liang

    Abstract: Open-Vocabulary Mobile Manipulation (OVMM) is a crucial capability for autonomous robots, especially when faced with the challenges posed by unknown and dynamic environments. This task requires robots to explore and build a semantic understanding of their surroundings, generate feasible plans to achieve manipulation goals, adapt to environmental changes, and comprehend natural language instruction…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Open-vocabulary, Mobile Manipulation, Dynamic Environments, 3D Semantic Maps, Zero-shot, LLMs, VLMs, 18 pages, 2 figures

  25. Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning

    Authors: Tianfu Wang, Li Shen, Qilin Fan, Tong Xu, Tongliang Liu, Hui Xiong

    Abstract: As an essential resource management problem in network virtualization, virtual network embedding (VNE) aims to allocate the finite resources of physical network to sequentially arriving virtual network requests (VNRs) with different resource demands. Since this is an NP-hard combinatorial optimization problem, many efforts have been made to provide viable solutions. However, most existing approach…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Transactions on Services Computing (TSC)

    Journal ref: IEEE Transactions on Services Computing (Volume: 17, Issue: 3, May-June 2024)

  26. arXiv:2406.12923  [pdf, other]

    cs.LG cs.MA

    Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction

    Authors: Wenzhao Jiang, Jindong Han, Hao Liu, Tao Tao, Naiqiang Tan, Hui Xiong

    Abstract: Rapid urbanization has significantly escalated traffic congestion, underscoring the need for advanced congestion prediction services to bolster intelligent transportation systems. As one of the world's largest ride-hailing platforms, DiDi places great emphasis on the accuracy of congestion prediction to enhance the effectiveness and reliability of their real-time services, such as travel time esti…

    Submitted 14 June, 2024; originally announced June 2024.

  27. arXiv:2406.12355  [pdf, other]

    cs.CV

    LiCAF: LiDAR-Camera Asymmetric Fusion for Gait Recognition

    Authors: Yunze Deng, Haijun Xiong, Bin Feng

    Abstract: Gait recognition is a biometric technology that identifies individuals by using walking patterns. Due to the significant achievements of multimodal fusion in gait recognition, we consider employing LiDAR-camera fusion to obtain robust gait representations. However, existing methods often overlook intrinsic characteristics of modalities, and lack fine-grained fusion and temporal modeling. In this p…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by ICIP2024

  28. arXiv:2406.11920  [pdf, other]

    cs.LG cs.AI

    Job-SDF: A Multi-Granularity Dataset for Job Skill Demand Forecasting and Benchmarking

    Authors: Xi Chen, Chuan Qin, Chuyu Fang, Chao Wang, Chen Zhu, Fuzhen Zhuang, Hengshu Zhu, Hui Xiong

    Abstract: In a rapidly evolving job market, skill demand forecasting is crucial as it enables policymakers and businesses to anticipate and adapt to changes, ensuring that workforce skills align with market needs, thereby enhancing productivity and competitiveness. Additionally, by identifying emerging skill requirements, it directs individuals towards relevant training and education opportunities, promotin…

    Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  29. arXiv:2406.11357  [pdf, other]

    cs.CL cs.AI

    Refiner: Restructure Retrieval Content Efficiently to Advance Question-Answering Capabilities

    Authors: Zhonghao Li, Xuming Hu, Aiwei Liu, Kening Zheng, Sirui Huang, Hui Xiong

    Abstract: Large Language Models (LLMs) are limited by their parametric knowledge, leading to hallucinations in knowledge-extensive tasks. To address this, Retrieval-Augmented Generation (RAG) incorporates external document chunks to expand LLM knowledge. Furthermore, compressing information from document chunks through extraction or summarization can improve LLM performance. Nonetheless, LLMs still struggle…

    Submitted 17 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages

  30. arXiv:2406.10855  [pdf, other]

    cs.CV cs.AI

    ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model

    Authors: Song Zhang, Qingzhong Wang, Junyi Liu, Haoyi Xiong

    Abstract: In the fast-growing field of Remote Sensing (RS) image analysis, the gap between massive unlabeled datasets and the ability to fully utilize these datasets for advanced RS analytics presents a significant challenge. To fill the gap, our work introduces an innovative auto-labeling framework named ALPS (Automatic Labeling for Pre-training in Segmentation), leveraging the Segment Anything Model (SAM)…

    Submitted 16 June, 2024; originally announced June 2024.

  31. arXiv:2406.07413  [pdf, other]

    cs.LG

    Holistic Memory Diversification for Incremental Learning in Growing Graphs

    Authors: Ziyue Qiao, Junren Xiao, Qingqiang Sun, Meng Xiao, Hui Xiong

    Abstract: This paper addresses the challenge of incremental learning in growing graphs with increasingly complex tasks. The goal is to continually train a graph model to handle new tasks while retaining its inference ability on previous tasks. Existing methods usually neglect the importance of memory diversity, limiting in effectively selecting high-quality memory from previous tasks and remembering broad p…

    Submitted 11 June, 2024; originally announced June 2024.

  32. arXiv:2406.05504  [pdf, other]

    cs.LG

    G-Transformer: Counterfactual Outcome Prediction under Dynamic and Time-varying Treatment Regimes

    Authors: Hong Xiong, Feng Wu, Leon Deng, Megan Su, Li-wei H Lehman

    Abstract: In the context of medical decision making, counterfactual prediction enables clinicians to predict treatment outcomes of interest under alternative courses of therapeutic actions given observed patient history. Prior machine learning approaches for counterfactual predictions under time-varying treatments focus on static time-varying treatment regimes where treatments do not depend on previous cova…

    Submitted 27 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  33. arXiv:2406.01512  [pdf, other]

    cs.CL

    MAD: Multi-Alignment MEG-to-Text Decoding

    Authors: Yiqian Yang, Hyejeong Jo, Yiqun Duan, Qiang Zhang, Jinni Zhou, Won Hee Lee, Renjing Xu, Hui Xiong

    Abstract: Deciphering language from brain activity is a crucial task in brain-computer interface (BCI) research. Non-invasive cerebral signaling techniques including electroencephalography (EEG) and magnetoencephalography (MEG) are becoming increasingly popular due to their safety and practicality, avoiding invasive electrode implantation. However, current works under-investigated three points: 1) a predomi…

    Submitted 3 June, 2024; originally announced June 2024.

  34. arXiv:2405.20291  [pdf, other]

    cs.CR cs.CV cs.LG

    Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness

    Authors: Weilin Lin, Li Liu, Shaokui Wei, Jianze Li, Hui Xiong

    Abstract: The security threat of backdoor attacks is a central concern for deep neural networks (DNNs). Recently, without poisoned data, unlearning models with clean data and then learning a pruning mask have contributed to backdoor defense. Additionally, vanilla fine-tuning with those clean data can help recover the lost clean accuracy. However, the behavior of clean unlearning is still under-explored, and…

    Submitted 30 May, 2024; originally announced May 2024.

  35. arXiv:2405.15154  [pdf, other]

    cs.AI cs.LG

    Online Prompt Pricing based on Combinatorial Multi-Armed Bandit and Hierarchical Stackelberg Game

    Authors: Meiling Li, Hongrun Ren, Haixu Xiong, Zhenxing Qian, Xinpeng Zhang

    Abstract: Generation models have shown promising performance in various tasks, making trading around machine learning models possible. In this paper, we aim at a novel prompt trading scenario, prompt bundle trading (PBT) system, and propose an online pricing mechanism. Based on the combinatorial multi-armed bandit (CMAB) and three-stage hierarchical Stackelberg (HS) game, our pricing mechanism considers the…

    Submitted 31 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  36. arXiv:2405.14398  [pdf, other]

    cs.HC cs.AI eess.SP

    SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network

    Authors: Weiyu Guo, Ying Sun, Yijie Xu, Ziyue Qiao, Yongkui Yang, Hui Xiong

    Abstract: Surface electromyography (sEMG) based gesture recognition offers a natural and intuitive interaction modality for wearable devices. Despite significant advancements in sEMG-based gesture-recognition models, existing methods often suffer from high computational latency and increased energy consumption. Additionally, the inherent instability of sEMG signals, combined with their sensitivity to distri…

    Submitted 23 May, 2024; originally announced May 2024.

  37. arXiv:2405.14312  [pdf, other]

    cs.CV cs.CL cs.MM

    Improving Gloss-free Sign Language Translation by Reducing Representation Density

    Authors: Jinhui Ye, Xing Wang, Wenxiang Jiao, Junwei Liang, Hui Xiong

    Abstract: Gloss-free sign language translation (SLT) aims to develop well-performing SLT systems with no requirement for the costly gloss annotations, but currently still lags behind gloss-based approaches significantly. In this paper, we identify a representation density problem that could be a bottleneck in restricting the performance of gloss-free SLT. Specifically, the representation density problem des…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Representation Density and Performance Drop

  38. arXiv:2405.14279  [pdf, other]

    cs.GT econ.TH

    Optimized Cost Per Click in Online Advertising: A Theoretical Analysis

    Authors: Kaichen Zhang, Zixuan Yuan, Hui Xiong

    Abstract: In recent years, Optimized Cost Per Click (OCPC) and Optimized Cost Per Mille (OCPM) have emerged as the most widely adopted pricing models in the online advertising industry. However, the existing literature has yet to identify the specific conditions under which these models outperform traditional pricing models like Cost Per Click (CPC) and Cost Per Action (CPA). To fill the gap, this paper bui…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGKDD2024 Research Track

  39. arXiv:2405.13389  [pdf, other]

    cs.CV cs.MM cs.RO

    HR-INR: Continuous Space-Time Video Super-Resolution via Event Camera

    Authors: Yunfan Lu, Zipeng Wang, Yusheng Wang, Hui Xiong

    Abstract: Continuous space-time video super-resolution (C-STVSR) aims to simultaneously enhance video resolution and frame rate at an arbitrary scale. Recently, implicit neural representation (INR) has been applied to video restoration, representing videos as implicit fields that can be decoded at an arbitrary scale. However, the highly ill-posed nature of C-STVSR limits the effectiveness of current INR-bas…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 30 pages, 20 figures, 8 tables. This work was submitted for review in the second half of 2023. Project page: https://github.com/yunfanLu/HR-INR

  40. arXiv:2405.12821  [pdf, other]

    cs.RO cs.CV

    Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension

    Authors: Runwei Guan, Ruixiao Zhang, Ningwei Ouyang, Jianan Liu, Ka Lok Man, Xiaohao Cai, Ming Xu, Jeremy Smith, Eng Gee Lim, Yutao Yue, Hui Xiong

    Abstract: Embodied perception is essential for intelligent vehicles and robots in interactive environmental understanding. However, these advancements primarily focus on vision, with limited attention given to using 3D modeling sensors, restricting a comprehensive understanding of objects in response to prompts containing qualitative and quantitative queries. Recently, as a promising automotive sensor with…

    Submitted 18 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures

  41. arXiv:2405.10640  [pdf, other]

    cs.SI

    COMET: NFT Price Prediction with Wallet Profiling

    Authors: Tianfu Wang, Liwei Deng, Chao Wang, Jianxun Lian, Yue Yan, Nicholas Jing Yuan, Qi Zhang, Hui Xiong

    Abstract: As the non-fungible token (NFT) market flourishes, price prediction emerges as a pivotal direction for investors gaining valuable insight to maximize returns. However, existing works suffer from a lack of practical definitions and standardized evaluations, limiting their practical application. Moreover, the influence of users' multi-behaviour transactions that are publicly accessible on NFT price…

    Submitted 2 July, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 (ADS Track)

  42. arXiv:2405.07991  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    SPIN: Simultaneous Perception, Interaction and Navigation

    Authors: Shagun Uppal, Ananye Agarwal, Haoyu Xiong, Kenneth Shaw, Deepak Pathak

    Abstract: While there has been remarkable progress recently in the fields of manipulation and locomotion, mobile manipulation remains a long-standing challenge. Compared to locomotion or static manipulation, a mobile system must make a diverse range of long-horizon tasks feasible in unstructured and dynamic environments. While the applications are broad and interesting, there are a plethora of challenges in…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: In CVPR 2024. Website at https://spin-robot.github.io/

  43. arXiv:2405.06459  [pdf, other]

    cs.CL cs.AI

    Are EEG-to-Text Models Working?

    Authors: Hyejeong Jo, Yiqian Yang, Juhyeok Han, Yiqun Duan, Hui Xiong, Won Hee Lee

    Abstract: This work critically analyzes existing models for open-vocabulary EEG-to-Text translation. We identify a crucial limitation: previous studies often employed implicit teacher-forcing during evaluation, artificially inflating performance metrics. Additionally, they lacked a critical benchmark - comparing model performance on pure noise inputs. We propose a methodology to differentiate between models…

    Submitted 13 June, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  44. arXiv:2405.04867  [pdf, other]

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  45. arXiv:2405.04103  [pdf, other]

    cs.CV

    COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval

    Authors: Hao Wu, Ruochong LI, Hao Wang, Hui Xiong

    Abstract: In this paper, we investigate an open research task of cross-modal retrieval between 3D shapes and textual descriptions. Previous approaches mainly rely on point cloud encoders for feature extraction, which may ignore key inherent features of 3D shapes, including depth, spatial hierarchy, geometric continuity, etc. To address this issue, we propose COM3D, making the first attempt to exploit the cr…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME 2024 oral

  46. arXiv:2404.14642  [pdf, other]

    cs.LG

    Uncertainty Quantification on Graph Learning: A Survey

    Authors: Chao Chen, Chenghua Guo, Rui Xu, Xiangwen Liao, Xi Zhang, Sihong Xie, Hui Xiong, Philip Yu

    Abstract: Graphical models, including Graph Neural Networks (GNNs) and Probabilistic Graphical Models (PGMs), have demonstrated their exceptional capabilities across numerous fields. These models necessitate effective uncertainty quantification to ensure reliable decision-making amid the challenges posed by model training discrepancies and unpredictable testing scenarios. This survey examines recent works t…

    Submitted 22 April, 2024; originally announced April 2024.

  47. arXiv:2404.13067  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach

    Authors: Feihu Jiang, Chuan Qin, Jingshuai Zhang, Kaichun Yao, Xi Chen, Dazhong Shen, Chen Zhu, Hengshu Zhu, Hui Xiong

    Abstract: In the contemporary era of widespread online recruitment, resume understanding has been widely acknowledged as a fundamental and crucial task, which aims to extract structured information from resume documents automatically. Compared to the traditional rule-based approaches, the utilization of recently proposed pre-trained document understanding models can greatly enhance the effectiveness of resu…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: ICME 2024 Accepted

  48. arXiv:2404.12633  [pdf, other]

    cs.AI cs.NI

    FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation

    Authors: Tianfu Wang, Qilin Fan, Chao Wang, Long Yang, Leilei Ding, Nicholas Jing Yuan, Hui Xiong

    Abstract: Virtual network embedding (VNE) is an essential resource allocation task in network virtualization, aiming to map virtual network requests (VNRs) onto physical infrastructure. Reinforcement learning (RL) has recently emerged as a promising solution to this problem. However, existing RL-based VNE methods are limited by the unidirectional action design and one-size-fits-all training strategy, result…

    Submitted 1 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  49. arXiv:2404.11213  [pdf, other]

    eess.SP cs.AI

    Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis

    Authors: Weiyu Guo, Ziyue Qiao, Ying Sun, Hui Xiong

    Abstract: Gesture recognition based on surface electromyography (sEMG) has been gaining importance in many 3D Interactive Scenes. However, sEMG is easily influenced by various forms of noise in real-world environments, leading to challenges in providing long-term stable interactions through sEMG. Existing methods often struggle to enhance model noise resilience through various predefined data augmentation t…

    Submitted 17 April, 2024; originally announced April 2024.

  50. arXiv:2404.10337  [pdf, other]

    cs.AI

    Intriguing Properties of Positional Encoding in Time Series Forecasting

    Authors: Jianqi Zhang, Jingyao Wang, Wenwen Qiang, Fanjiang Xu, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: Transformer-based methods have made significant progress in time series forecasting (TSF). They primarily handle two types of tokens, i.e., temporal tokens that contain all variables of the same timestamp, and variable tokens that contain all input time points for a specific variable. Transformer-based methods rely on positional encoding (PE) to mark tokens' positions, facilitating the model to pe…

    Submitted 16 April, 2024; originally announced April 2024.