
Showing 1–50 of 336 results for author: Yu, G

Searching in archive cs.
  1. arXiv:2408.16340  [pdf, other]

    eess.IV cs.CV

    Learned Image Transmission with Hierarchical Variational Autoencoder

    Authors: Guangyi Zhang, Hanlei Li, Yunlong Cai, Qiyu Hu, Guanding Yu, Runmin Zhang

    Abstract: In this paper, we introduce an innovative hierarchical joint source-channel coding (HJSCC) framework for image transmission, utilizing a hierarchical variational autoencoder (VAE). Our approach leverages a combination of bottom-up and top-down paths at the transmitter to autoregressively generate multiple hierarchical representations of the original image. These representations are then directly m…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.13285  [pdf, other]

    cs.CV cs.AI

    SIn-NeRF2NeRF: Editing 3D Scenes with Instructions through Segmentation and Inpainting

    Authors: Jiseung Hong, Changmin Lee, Gyusang Yu

    Abstract: TL;DR Perform 3D object editing selectively by disentangling it from the background scene. Instruct-NeRF2NeRF (in2n) is a promising method that enables editing of 3D scenes composed of Neural Radiance Field (NeRF) using text prompts. However, it is challenging to perform geometrical modifications such as shrinking, scaling, or moving on both the background and object simultaneously. In this projec…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Code is available at: https://github.com/KAISTChangmin/SIn-NeRF2NeRF

  3. arXiv:2408.09265  [pdf, other]

    cs.CR cs.LG cs.NI eess.SY

    ByCAN: Reverse Engineering Controller Area Network (CAN) Messages from Bit to Byte Level

    Authors: Xiaojie Lin, Baihe Ma, Xu Wang, Guangsheng Yu, Ying He, Ren Ping Liu, Wei Ni

    Abstract: As the primary standard protocol for modern cars, the Controller Area Network (CAN) is a critical research target for automotive cybersecurity threats and autonomous applications. As the decoding specification of CAN is a proprietary black-box maintained by Original Equipment Manufacturers (OEMs), conducting related research and industry developments can be challenging without a comprehensive unde…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: Accepted by IEEE Internet of Things Journal, 15 pages, 5 figures, 6 tables

  4. arXiv:2408.08493  [pdf, other]

    cs.LG stat.ML

    Fishers Harvest Parallel Unlearning in Inherited Model Networks

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Unlearning in various learning frameworks remains challenging, with the continuous growth and updates of models exhibiting complex inheritance relationships. This paper presents a novel unlearning framework, which enables fully parallel unlearning among models exhibiting inheritance. A key enabler is the new Unified Model Inheritance Graph (UMIG), which captures the inheritance using a Directed Ac…

    Submitted 20 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  5. arXiv:2408.05477  [pdf, other]

    cs.CV

    Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE

    Authors: Yiying Yang, Fukun Yin, Jiayuan Fan, Xin Chen, Wanzhang Li, Gang Yu

    Abstract: As Artificial Intelligence Generated Content (AIGC) advances, a variety of methods have been developed to generate text, images, videos, and 3D objects from single or multimodal inputs, contributing efforts to emulate human-like cognitive content creation. However, generating realistic large-scale scenes from a single input presents a challenge due to the complexities involved in ensuring consiste…

    Submitted 20 August, 2024; v1 submitted 10 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.11588 by other authors

  6. arXiv:2408.05006  [pdf, other]

    cs.SE cs.AI

    Enhancing the Code Debugging Ability of LLMs via Communicative Agent Based Data Refinement

    Authors: Weiqing Yang, Hanbin Wang, Zhenghao Liu, Xinze Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Zhiyuan Liu, Ge Yu

    Abstract: Debugging is a vital aspect of software development, yet the debugging capabilities of Large Language Models (LLMs) remain largely unexplored. This paper first introduces DEBUGEVAL, a comprehensive benchmark designed to evaluate the debugging capabilities of LLMs. DEBUGEVAL collects data from existing high-quality datasets and designs four different tasks to evaluate the debugging effectiveness, i…

    Submitted 9 August, 2024; originally announced August 2024.

  7. arXiv:2408.04638  [pdf, other]

    cs.CL cs.CY

    Affective Computing in the Era of Large Language Models: A Survey from the NLP Perspective

    Authors: Yiqun Zhang, Xiaocui Yang, Xingle Xu, Zeran Gao, Yijie Huang, Shiyi Mu, Shi Feng, Daling Wang, Yifei Zhang, Kaisong Song, Ge Yu

    Abstract: Affective Computing (AC), integrating computer science, psychology, and cognitive science knowledge, aims to enable machines to recognize, interpret, and simulate human emotions. To create more value, AC can be applied to diverse scenarios, including social media, finance, healthcare, education, etc. Affective Computing (AC) includes two mainstream tasks, i.e., Affective Understanding (AU) and Affe…

    Submitted 30 July, 2024; originally announced August 2024.

  8. arXiv:2407.17190  [pdf, other]

    cs.CE

    Fusing LLMs and KGs for Formal Causal Reasoning behind Financial Risk Contagion

    Authors: Guanyuan Yu, Xv Wang, Qing Li, Yu Zhao

    Abstract: Financial risks tend to spread from one entity to another, ultimately leading to systemic risks. The key to preventing such risks lies in understanding the causal chains behind risk contagion. Despite this, prevailing approaches primarily emphasize identifying risks, overlooking the underlying causal analysis of risk. To address such an issue, we propose a Risk Contagion Causal Reasoning model ca…

    Submitted 24 July, 2024; originally announced July 2024.

  9. Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD -- Extended Version

    Authors: Geoffrey X. Yu, Ziniu Wu, Ferdi Kossmann, Tianyu Li, Markos Markakis, Amadou Ngom, Samuel Madden, Tim Kraska

    Abstract: Modern organizations manage their data with a wide variety of specialized cloud database engines (e.g., Aurora, BigQuery, etc.). However, designing and managing such infrastructures is hard. Developers must consider many possible designs with non-obvious performance consequences; moreover, current software abstractions tightly couple applications to specific systems (e.g., with engine-specific cli…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 17 pages, 15 figures

  10. arXiv:2407.14544  [pdf, other]

    cs.DC

    Fast Iterative Graph Computing with Updated Neighbor States

    Authors: Yijie Zhou, Shufeng Gong, Feng Yao, Hanzhang Chen, Song Yu, Pengxi Liu, Yanfeng Zhang, Ge Yu, Jeffrey Xu Yu

    Abstract: Enhancing the efficiency of iterative computation on graphs has garnered considerable attention in both industry and academia. Nonetheless, the majority of efforts focus on expediting iterative computation by minimizing the running time per iteration step, ignoring the optimization of the number of iteration rounds, which is a crucial aspect of iterative computation. We experimentally verified the…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 14 pages, 13 figures, 2 tables; accepted for publication in ICDE 2024

  11. arXiv:2407.11292  [pdf]

    cs.CV

    LoRA-PT: Low-Rank Adapting UNETR for Hippocampus Segmentation Using Principal Tensor Singular Values and Vectors

    Authors: Guanghua He, Wangang Cheng, Hancan Zhu, Gaohang Yu

    Abstract: The hippocampus is a crucial brain structure associated with various psychiatric disorders, and its automatic and precise segmentation is essential for studying these diseases. In recent years, deep learning-based methods have made significant progress in hippocampus segmentation. However, training deep neural network models requires substantial computational resources and time, as well as a large…

    Submitted 18 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  12. arXiv:2407.07339  [pdf, other]

    cs.CR

    TDML -- A Trustworthy Distributed Machine Learning Framework

    Authors: Zhen Wang, Qin Wang, Guangsheng Yu, Shiping Chen

    Abstract: Recent years have witnessed a surge in deep learning research, marked by the introduction of expansive generative models like OpenAI's SORA and GPT, Meta AI's LLAMA series, and Google's FLAN, BART, and Gemini models. However, the rapid advancement of large models (LM) has intensified the demand for computing resources, particularly GPUs, which are crucial for their parallel processing capabilities…

    Submitted 9 July, 2024; originally announced July 2024.

  13. arXiv:2407.00125  [pdf, other]

    cs.SE cs.AI cs.DC

    A Survey on Failure Analysis and Fault Injection in AI Systems

    Authors: Guangba Yu, Gou Tan, Haojia Huang, Zhenyu Zhang, Pengfei Chen, Roberto Natella, Zibin Zheng

    Abstract: The rapid advancement of Artificial Intelligence (AI) has led to its integration into various areas, especially with Large Language Models (LLMs) significantly enhancing capabilities in Artificial Intelligence Generated Content (AIGC). However, the complexity of AI systems has also exposed their vulnerabilities, necessitating robust methods for failure analysis (FA) and fault injection (FI) to ens…

    Submitted 27 June, 2024; originally announced July 2024.

  14. arXiv:2406.19280  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale

    Authors: Junying Chen, Ruyi Ouyang, Anningzhe Gao, Shunian Chen, Guiming Hardy Chen, Xidong Wang, Ruifei Zhang, Zhenyang Cai, Ke Ji, Guangjun Yu, Xiang Wan, Benyou Wang

    Abstract: The rapid development of multimodal large language models (MLLMs), such as GPT-4V, has led to significant advancements. However, these models still face challenges in medical multimodal capabilities due to limitations in the quantity and quality of medical vision-text data, stemming from data privacy concerns and high annotation costs. While pioneering approaches utilize PubMed's large-scale, de-i…

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.15819  [pdf, other]

    cs.LG cs.IT cs.NI eess.SP

    Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning

    Authors: Qiushuo Hou, Matteo Zecchin, Sangwoo Park, Yunlong Cai, Guanding Yu, Kaushik Chowdhury, Osvaldo Simeone

    Abstract: In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameter…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: submitted for a journal publication

  16. arXiv:2406.11689  [pdf, other]

    cs.CV

    Lightweight Model Pre-training via Language Guided Knowledge Distillation

    Authors: Mingsheng Li, Lin Zhang, Mingzhen Zhu, Zilong Huang, Gang Yu, Jiayuan Fan, Tao Chen

    Abstract: This paper studies the problem of pre-training for small models, which is essential for many mobile devices. Current state-of-the-art methods on this problem transfer the representational knowledge of a large network (as a Teacher) into a smaller model (as a Student) using self-supervised distillation, improving the performance of the small model on downstream tasks. However, existing approaches a…

    Submitted 17 June, 2024; originally announced June 2024.

  17. arXiv:2406.10163  [pdf, other]

    cs.CV cs.AI

    MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

    Authors: Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang

    Abstract: Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. However, this potential is largely unrealized because these assets always need to be converted to meshes for 3D industry applications, and the meshes produced by current mesh extraction methods are significantly inferior to Artist-Created…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project Page: https://buaacyw.github.io/mesh-anything/ Code: https://github.com/buaacyw/MeshAnything

  18. arXiv:2406.09931  [pdf, other]

    eess.IV cs.CV cs.LG

    SCKansformer: Fine-Grained Classification of Bone Marrow Cells via Kansformer Backbone and Hierarchical Attention Mechanisms

    Authors: Yifei Chen, Zhu Zhu, Shenghao Zhu, Linwei Qiu, Binfeng Zou, Fan Jia, Yunpeng Zhu, Chenyan Zhang, Zhaojie Fang, Feiwei Qin, Jin Fan, Changmiao Wang, Yu Gao, Gang Yu

    Abstract: The incidence and mortality rates of malignant tumors, such as acute leukemia, have risen significantly. Clinically, hospitals rely on cytological examination of peripheral blood and bone marrow smears to diagnose malignant tumors, with accurate blood cell counting being crucial. Existing automated methods face challenges such as low feature expression capability, poor interpretability, and redund…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures

  19. arXiv:2406.06051  [pdf, other]

    cs.AI cs.HC cs.LG

    On the Utility of Accounting for Human Beliefs about AI Behavior in Human-AI Collaboration

    Authors: Guanghui Yu, Robert Kasumba, Chien-Ju Ho, William Yeoh

    Abstract: To enable effective human-AI collaboration, merely optimizing AI performance while ignoring humans is not sufficient. Recent research has demonstrated that designing AI agents to account for human behavior leads to improved performance in human-AI collaboration. However, a limitation of most existing approaches is their assumption that human behavior is static, irrespective of AI behavior. In real…

    Submitted 10 June, 2024; originally announced June 2024.

  20. arXiv:2406.05216  [pdf, other]

    cs.LG

    TabPFGen -- Tabular Data Generation with TabPFN

    Authors: Junwei Ma, Apoorv Dankar, George Stein, Guangwei Yu, Anthony Caterini

    Abstract: Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and discriminative models of tabular data. We thus devise a technique to turn TabPFN -- a highly performant transformer initially designed for in-context discriminative tabular tasks -- into an energy-based generative model, which…

    Submitted 7 June, 2024; originally announced June 2024.

  21. arXiv:2406.05207  [pdf, other]

    cs.LG

    Retrieval & Fine-Tuning for In-Context Tabular Models

    Authors: Valentin Thomas, Junwei Ma, Rasa Hosseinzadeh, Keyvan Golestan, Guangwei Yu, Maksims Volkovs, Anthony Caterini

    Abstract: Tabular data is a pervasive modality spanning a wide range of domains, and the inherent diversity poses a considerable challenge for deep learning. Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones. To address this limitation, we propose a combination of retrieval and…

    Submitted 7 June, 2024; originally announced June 2024.

  22. arXiv:2406.00947  [pdf, other]

    cs.CV

    Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation

    Authors: Fei Gao, Siwen Wang, Fandong Zhang, Hong-Yu Zhou, Yizhou Wang, Churan Wang, Gang Yu, Yizhou Yu

    Abstract: Medical image analysis suffers from a shortage of data, whether annotated or not. This becomes even more pronounced when it comes to 3D medical images. Self-Supervised Learning (SSL) can partially ease this situation by using unlabeled data. However, most existing SSL methods can only make use of data in a single dimensionality (e.g. 2D or 3D), and are incapable of enlarging the training dataset b…

    Submitted 4 July, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted at MICCAI 2024

  23. arXiv:2405.20853  [pdf, other]

    cs.CV

    MeshXL: Neural Coordinate Field for Generative 3D Foundation Models

    Authors: Sijin Chen, Xin Chen, Anqi Pang, Xianfang Zeng, Wei Cheng, Yijun Fu, Fukun Yin, Yanru Wang, Zhibin Wang, Chi Zhang, Jingyi Yu, Gang Yu, Bin Fu, Tao Chen

    Abstract: The polygon mesh representation of 3D data exhibits great flexibility, fast rendering speed, and storage efficiency, which is widely preferred in various applications. However, given its unstructured graph representation, the direct generation of high-fidelity 3D meshes is challenging. Fortunately, with a pre-defined ordering strategy, 3D meshes can be represented as sequences, and the generation…

    Submitted 18 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  24. arXiv:2405.10481  [pdf, other]

    cs.LG cs.AI

    Multi-Evidence based Fact Verification via A Confidential Graph Neural Network

    Authors: Yuqing Lan, Zhenghao Liu, Yu Gu, Xiaoyuan Yi, Xiaohua Li, Liner Yang, Ge Yu

    Abstract: Fact verification tasks aim to identify the integrity of textual contents according to the truthful corpus. Existing fact verification models usually build a fully connected reasoning graph, which regards claim-evidence pairs as nodes and connects them with edges. They employ the graph to propagate the semantics of the nodes. Nevertheless, the noisy nodes usually propagate their semantics via the…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 12 pages

  25. arXiv:2404.15506  [pdf, other]

    cs.CV

    Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

    Authors: Mu Hu, Wei Yin, Chi Zhang, Zhipeng Cai, Xiaoxiao Long, Hao Chen, Kaixuan Wang, Gang Yu, Chunhua Shen, Shaojie Shen

    Abstract: We introduce Metric3D v2, a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image, which is crucial for metric 3D recovery. While depth and normal are geometrically related and highly complementary, they present distinct challenges. SoTA monocular depth methods achieve zero-shot generalization by learning affine-invariant depths, which cannot recov…

    Submitted 16 August, 2024; v1 submitted 21 March, 2024; originally announced April 2024.

    Comments: Our project page is at https://JUGGHM.github.io/Metric3Dv2. Accepted to TPAMI. arXiv admin note: text overlap with arXiv:2307.10984

  26. GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting

    Authors: Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu

    Abstract: Recent works on audio-driven talking head synthesis using Neural Radiance Fields (NeRF) have achieved impressive results. However, due to inadequate pose and expression control caused by NeRF implicit representation, these methods still have some limitations, such as unsynchronized or unnatural lip movements, and visual jitter and artifacts. In this paper, we propose GaussianTalker, a novel method…

    Submitted 9 August, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by ACM MM 2024. Project page: https://yuhongyun777.github.io/GaussianTalker/

  27. arXiv:2404.08886  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM

    Authors: Henry Peng Zou, Gavin Heqing Yu, Ziwei Fan, Dan Bu, Han Liu, Peng Dai, Dongmei Jia, Cornelia Caragea

    Abstract: In e-commerce, accurately extracting product attribute values from multimodal data is crucial for improving user experience and operational efficiency of retailers. However, previous approaches to multimodal attribute value extraction often struggle with implicit attribute values embedded in images or text, rely heavily on extensive labeled data, and can easily confuse similar attribute values. To…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024 Industry Track

  28. arXiv:2404.08826  [pdf, other]

    cs.PF math.PR

    Strongly Tail-Optimal Scheduling in the Light-Tailed M/G/1

    Authors: George Yu, Ziv Scully

    Abstract: We study the problem of scheduling jobs in a queueing system, specifically an M/G/1 with light-tailed job sizes, to asymptotically optimize the response time tail. This means scheduling to make $\mathbf{P}[T > t]$, the chance a job's response time exceeds $t$, decay as quickly as possible in the $t \to \infty$ limit. For some time, the best known policy was First-Come First-Served (FCFS), which ha…

    Submitted 5 July, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 33 pages, 8 figures. SIGMETRICS 2024

  29. arXiv:2404.08681  [pdf, other]

    cs.CL

    EFSA: Towards Event-Level Financial Sentiment Analysis

    Authors: Tianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He, Xiang Ao

    Abstract: In this paper, we extend financial sentiment analysis (FSA) to the event level, since events usually serve as the subject of the sentiment in financial text. Though extracting events from the financial text may be conducive to accurate sentiment predictions, it has specialized challenges due to the length and discontinuity of events in a financial text. To this end, we reconceptualize the event extrac…

    Submitted 8 April, 2024; originally announced April 2024.

  30. arXiv:2404.06077  [pdf, other]

    cs.CR cs.AI cs.CY

    Is Your AI Truly Yours? Leveraging Blockchain for Copyrights, Provenance, and Lineage

    Authors: Yilin Sai, Qin Wang, Guangsheng Yu, H. M. N. Dilum Bandara, Shiping Chen

    Abstract: As Artificial Intelligence (AI) integrates into diverse areas, particularly in content generation, ensuring rightful ownership and ethical use becomes paramount. AI service providers are expected to prioritize responsibly sourcing training data and obtaining licenses from data owners. However, existing studies primarily center on safeguarding static copyrights, which simply treats metadata/dataset…

    Submitted 9 April, 2024; originally announced April 2024.

  31. arXiv:2404.03054  [pdf, other]

    cs.AI cs.LG

    Data-Driven Goal Recognition Design for General Behavioral Agents

    Authors: Robert Kasumba, Guanghui Yu, Chien-Ju Ho, Sarah Keren, William Yeoh

    Abstract: Goal recognition design aims to make limited modifications to decision-making environments with the goal of making it easier to infer the goals of agents acting within those environments. Although various research efforts have been made in goal recognition design, existing approaches are computationally demanding and often assume that agents are (near-)optimal in their decision-making. To address…

    Submitted 11 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  32. arXiv:2404.01700  [pdf, other]

    cs.CV

    MotionChain: Conversational Motion Controllers via Multimodal Prompts

    Authors: Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan

    Abstract: Recent advancements in language models have demonstrated their adeptness in conducting multi-turn dialogues and retaining conversational context. However, this proficiency remains largely unexplored in other multimodal generative models, particularly in human motion models. By integrating multi-turn conversations in controlling continuous virtual human movements, generative human motion models can…

    Submitted 3 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures

  33. arXiv:2404.00964  [pdf, other]

    cs.CV

    S2RC-GCN: A Spatial-Spectral Reliable Contrastive Graph Convolutional Network for Complex Land Cover Classification Using Hyperspectral Images

    Authors: Renxiang Guan, Zihao Li, Chujia Song, Guo Yu, Xianju Li, Ruyi Feng

    Abstract: Spatial correlations between different ground objects are an important feature of mining land cover research. Graph Convolutional Networks (GCNs) can effectively capture such spatial feature representations and have demonstrated promising results in performing hyperspectral imagery (HSI) classification tasks of complex land. However, the existing GCN-based HSI classification methods are prone to i…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

  34. arXiv:2403.19895  [pdf, ps, other]

    cs.IT cs.LG

    An Information-Theoretic Framework for Out-of-Distribution Generalization

    Authors: Wenliang Liu, Guanding Yu, Lele Wang, Renjie Liao

    Abstract: We study the Out-of-Distribution (OOD) generalization in machine learning and propose a general framework that provides information-theoretic generalization bounds. Our framework interpolates freely between Integral Probability Metric (IPM) and $f$-divergence, which naturally recovers some known results (including Wasserstein- and KL-bounds), as well as yields new generalization bounds. Moreover,…

    Submitted 28 March, 2024; originally announced March 2024.

  35. arXiv:2403.15010  [pdf, other]

    cs.CV cs.CR

    Clean-image Backdoor Attacks

    Authors: Dazhong Rong, Guoyao Yu, Shuheng Shen, Xinyi Fu, Peng Qian, Jianhai Chen, Qinming He, Xing Fu, Weiqiang Wang

    Abstract: To gather a significant quantity of annotated training data for high-performance image classification models, numerous companies opt to enlist third-party providers to label their unlabeled data. This practice is widely regarded as secure, even in cases where some annotated errors occur, as the impact of these minor inaccuracies on the final performance of the models is negligible and existing bac…

    Submitted 26 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  36. arXiv:2403.11469  [pdf, other]

    cs.CV cs.GR

    Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space

    Authors: Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu

    Abstract: Stylized motion breathes life into characters. However, the fixed skeleton structure and style representation hinder existing data-driven motion synthesis methods from generating stylized motion for various characters. In this work, we propose a generative motion stylization pipeline, named MotionS, for synthesizing diverse and stylized motion on cross-structure characters using cross-modality sty…

    Submitted 23 July, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: ACM MM 2024

  37. arXiv:2403.05135  [pdf, other]

    cs.CV

    ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

    Authors: Xiwei Hu, Rui Wang, Yixiao Fang, Bin Fu, Pei Cheng, Gang Yu

    Abstract: Diffusion models have demonstrated remarkable performance in the domain of text-to-image generation. However, most widely used models still employ CLIP as their text encoder, which constrains their ability to comprehend dense prompts, encompassing multiple objects, detailed attributes, complex relationships, long-text alignment, etc. In this paper, we introduce an Efficient Large Language Model Ad…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Project Page: https://ella-diffusion.github.io/

  38. arXiv:2403.01422  [pdf, other]

    cs.CV

    MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

    Authors: Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen

    Abstract: Development of multimodal models has marked a significant step forward in how machines understand videos. These models have shown promise in analyzing short video clips. However, when it comes to longer formats like movies, they often fall short. The main hurdles are the lack of high-quality, diverse video data and the intensive work required to collect or annotate such data. In the face of these chal…

    Submitted 24 June, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  39. arXiv:2402.19160  [pdf, other]

    cs.CV

    Effective Message Hiding with Order-Preserving Mechanisms

    Authors: Gao Yu, Qiu Xuchong, Ye Zihan

    Abstract: Message hiding, a technique that conceals secret message bits within a cover image, aims to achieve an optimal balance among message capacity, recovery accuracy, and imperceptibility. While convolutional neural networks have notably improved message capacity and imperceptibility, achieving high recovery accuracy remains challenging. This challenge arises because convolutional operations struggle t…

    Submitted 17 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 7 pages

  40. arXiv:2402.16294  [pdf, other]

    cs.CR cs.AI

    BlockFUL: Enabling Unlearning in Blockchained Federated Learning

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Unlearning in Federated Learning (FL) presents significant challenges, as models grow and evolve with complex inheritance relationships. This complexity is amplified when blockchain is employed to ensure the integrity and traceability of FL, where the need to edit multiple interlinked blockchain records and update all inherited models complicates the process. In this paper, we introduce Blockchaine…

    Submitted 14 August, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

  41. arXiv:2402.16058  [pdf, other]

    cs.CL

    Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression

    Authors: Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu

    Abstract: Large language models (LLMs) require lengthy prompts as the input context to produce output aligned with user intentions, a process that incurs extra costs during inference. In this paper, we propose the Gist COnditioned deCOding (Gist-COCO) model, introducing a novel method for compressing prompts which also can assist the prompt interpretation and engineering. Gist-COCO employs an encoder-decode…

    Submitted 25 February, 2024; originally announced February 2024.

  42. arXiv:2402.14652  [pdf, other]

    cs.CL

    Cleaner Pretraining Corpus Curation with Neural Web Scraping

    Authors: Zhipeng Xu, Zhenghao Liu, Yukun Yan, Zhiyuan Liu, Ge Yu, Chenyan Xiong

    Abstract: The web contains large-scale, diverse, and abundant information to satisfy the information-seeking needs of humans. Through meticulous data collection, preprocessing, and curation, webpages can be used as a fundamental data resource for language model pretraining. However, when confronted with the progressively revolutionized and intricate nature of webpages, rule-based/feature-based web scrapers…

    Submitted 14 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  43. arXiv:2402.13547  [pdf, other]

    cs.CL

    ActiveRAG: Revealing the Treasures of Knowledge via Active Learning

    Authors: Zhipeng Xu, Zhenghao Liu, Yibin Liu, Chenyan Xiong, Yukun Yan, Shuo Wang, Shi Yu, Zhiyuan Liu, Ge Yu

    Abstract: Retrieval Augmented Generation (RAG) has introduced a new paradigm for Large Language Models (LLMs), aiding in the resolution of knowledge-intensive tasks. However, current RAG models position LLMs as passive knowledge receptors, thereby restricting their capacity for learning and comprehending external knowledge. In this paper, we present ActiveRAG, an innovative RAG framework that shifts from pa…

    Submitted 21 February, 2024; originally announced February 2024.

  44. arXiv:2402.12694  [pdf, other]

    cs.LG

    Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling

    Authors: Guoqi Yu, Jing Zou, Xiaowei Hu, Angelica I. Aviles-Rivero, Jing Qin, Shujun Wang

    Abstract: Predicting multivariate time series is crucial, demanding precise modeling of intricate patterns, including inter-series dependencies and intra-series variations. Distinctive trend characteristics in each time series pose challenges, and existing methods, relying on basic moving average kernels, may struggle with the non-linear structure and complex trends in real-world data. Given that, we introd…

    Submitted 5 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  45. arXiv:2402.06971  [pdf, other]

    cs.LG

    In-Context Data Distillation with TabPFN

    Authors: Junwei Ma, Valentin Thomas, Guangwei Yu, Anthony Caterini

    Abstract: Foundation models have revolutionized tasks in computer vision and natural language processing. However, in the realm of tabular data, tree-based models like XGBoost continue to dominate. TabPFN, a transformer model tailored for tabular data, mirrors recent foundation models in its exceptional in-context learning capability, being competitive with XGBoost's performance without the need for task-sp…

    Submitted 10 February, 2024; originally announced February 2024.

  46. arXiv:2402.06459  [pdf, other]

    cs.GT cs.CE cs.CR cs.CY econ.GN

    Maximizing NFT Incentives: References Make You Rich

    Authors: Guangsheng Yu, Qin Wang, Caijun Sun, Lam Duc Nguyen, H. M. N. Dilum Bandara, Shiping Chen

    Abstract: In this paper, we study how to optimize existing Non-Fungible Token (NFT) incentives. Upon exploring a large number of NFT-related standards and real-world projects, we come across an unexpected finding. That is, the current NFT incentive mechanisms, often organized in an isolated and one-time-use fashion, tend to overlook their potential for scalable organizational structures. We propose, analy…

    Submitted 9 February, 2024; originally announced February 2024.

  47. arXiv:2402.03804  [pdf, other]

    cs.LG cs.AI

    ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs

    Authors: Zhengyan Zhang, Yixin Song, Guanghui Yu, Xu Han, Yankai Lin, Chaojun Xiao, Chenyang Song, Zhiyuan Liu, Zeyu Mi, Maosong Sun

    Abstract: Sparse computation offers a compelling solution for the inference of Large Language Models (LLMs) in low-resource scenarios by dynamically skipping the computation of inactive neurons. While traditional approaches focus on ReLU-based LLMs, leveraging zeros in activation values, we broaden the scope of sparse LLMs beyond zero activation values. We introduce a general method that defines neuron acti…

    Submitted 6 February, 2024; originally announced February 2024.

  48. arXiv:2402.01808  [pdf, other]

    cs.SD eess.AS

    KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge

    Authors: Guochen Yu, Runqiang Han, Chenglin Xu, Haoran Zhao, Nan Li, Chen Zhang, Xiguang Zheng, Chao Zhou, Qi Huang, Bing Yu

    Abstract: This paper presents the speech restoration and enhancement system created by the 1024K team for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. Our system consists of a generative adversarial network (GAN) in complex-domain for speech restoration and a fine-grained multi-band fusion module for speech enhancement. In the blind test set of SSI, the proposed system achieves an overall mean…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024; Rank 1st in ICASSP 2024 Speech Signal Improvement (SSI) Challenge

  49. arXiv:2401.17577  [pdf, other]

    cs.IT eess.SP

    Robustness in Wireless Distributed Learning: An Information-Theoretic Analysis

    Authors: Yangshuo He, Guanding Yu

    Abstract: In this paper, we take an information-theoretic approach to understand the robustness in wireless distributed learning. Upon measuring the difference in loss functions, we provide an upper bound of the performance deterioration due to imperfect wireless channels. Moreover, we characterize the transmission rate under task performance guarantees and propose the channel capacity gain resulting from t…

    Submitted 30 January, 2024; originally announced January 2024.

  50. arXiv:2401.16760  [pdf, other]

    cs.LG

    One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training

    Authors: Lianbo Ma, Yuee Zhou, Jianlun Ma, Guo Yu, Qing Li

    Abstract: Weight quantization is an effective technique to compress deep neural networks for their deployment on edge devices with limited resources. Traditional loss-aware quantization methods commonly use the quantized gradient to replace the full-precision gradient. However, we discover that the gradient error will lead to an unexpected zig-zagging-like issue in the gradient descent learning procedures,…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 9 pages, 13 figures, accepted by AAAI-24