Showing 1–50 of 243 results for author: Dong, C

  1. arXiv:2407.12273  [pdf, other]

    cs.CV

    GRIDS: Grouped Multiple-Degradation Restoration with Image Degradation Similarity

    Authors: Shuo Cao, Yihao Liu, Wenlong Zhang, Yu Qiao, Chao Dong

    Abstract: Traditional single-task image restoration methods excel in handling specific degradation types but struggle with multiple degradations. To address this limitation, we propose Grouped Restoration with Image Degradation Similarity (GRIDS), a novel approach that harmonizes the competing objectives inherent in multiple-degradation restoration. We first introduce a quantitative method for assessing rel… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  2. arXiv:2407.05778  [pdf, other]

    cs.CL cs.AI

    When is the consistent prediction likely to be a correct prediction?

    Authors: Alex Nguyen, Dheeraj Mekala, Chengyu Dong, Jingbo Shang

    Abstract: Self-consistency (Wang et al., 2023) suggests that the most consistent answer obtained through large language models (LLMs) is more likely to be correct. In this paper, we challenge this argument and propose a nuanced correction. Our observations indicate that consistent answers derived through more computation i.e. longer reasoning texts, rather than simply the most consistent answer across all o… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  3. arXiv:2407.01104  [pdf, other]

    cs.CV

    Semantic-guided Adversarial Diffusion Model for Self-supervised Shadow Removal

    Authors: Ziqi Zeng, Chen Zhao, Weiling Cai, Chenyu Dong

    Abstract: Existing unsupervised methods have addressed the challenges of inconsistent paired data and tedious acquisition of ground-truth labels in shadow removal tasks. However, GAN-based training often faces issues such as mode collapse and unstable optimization. Furthermore, due to the complex mapping between shadow and shadow-free domains, merely relying on adversarial learning is not enough to capture… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.18145  [pdf, other]

    cs.CR cs.LG

    Beyond Statistical Estimation: Differentially Private Individual Computation via Shuffling

    Authors: Shaowei Wang, Changyu Dong, Xiangfu Song, Jin Li, Zhili Zhou, Di Wang, Han Wu

    Abstract: In data-driven applications, preserving user privacy while enabling valuable computations remains a critical challenge. Technologies like Differential Privacy (DP) have been pivotal in addressing these concerns. The shuffle model of DP requires no trusted curators and can achieve high utility by leveraging the privacy amplification effect yielded from shuffling. These benefits have led to signific… ▽ More

    Submitted 11 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  5. arXiv:2406.11115  [pdf, other]

    cs.CL

    Text Grafting: Near-Distribution Weak Supervision for Minority Classes in Text Classification

    Authors: Letian Peng, Yi Gu, Chengyu Dong, Zihan Wang, Jingbo Shang

    Abstract: For extremely weak-supervised text classification, pioneer research generates pseudo labels by mining texts similar to the class names from the raw corpus, which may end up with very limited or even no samples for the minority classes. Recent works have started to generate the relevant texts by prompting LLMs using the class names or definitions; however, there is a high risk that LLMs cannot gene… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  6. arXiv:2406.04460  [pdf, other]

    cs.CL

    Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs

    Authors: Shang Zhou, Feng Yao, Chengyu Dong, Zihan Wang, Jingbo Shang

    Abstract: Controlling the attribute intensity of text generation is crucial across scenarios (e.g., writing conciseness, chatting emotion, and explanation clarity). The remarkable capabilities of large language models (LLMs) have revolutionized text generation, prompting us to explore such \emph{smooth control} of LLM generation. Specifically, we propose metrics to assess the range, calibration, and consist… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  7. arXiv:2406.03865  [pdf, other]

    cs.CV cs.AI

    Semantic Similarity Score for Measuring Visual Similarity at Semantic Level

    Authors: Senran Fan, Zhicheng Bao, Chen Dong, Haotai Liang, Xiaodong Xu, Ping Zhang

    Abstract: Semantic communication, as a revolutionary communication architecture, is considered a promising novel communication paradigm. Unlike traditional symbol-based error-free communication systems, semantic-based visual communication systems extract, compress, transmit, and reconstruct images at the semantic level. However, widely used image similarity evaluation metrics, whether pixel-based MSE or PSN… ▽ More

    Submitted 10 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  8. arXiv:2405.18842  [pdf, other]

    cs.CV

    Descriptive Image Quality Assessment in the Wild

    Authors: Zhiyuan You, Jinjin Gu, Zheyuan Li, Xin Cai, Kaiwen Zhu, Chao Dong, Tianfan Xue

    Abstract: With the rapid advancement of Vision Language Models (VLMs), VLM-based Image Quality Assessment (IQA) seeks to describe image quality linguistically to align with human expression and capture the multifaceted nature of IQA tasks. However, current methods are still far from practical usage. First, prior works focus narrowly on specific sub-tasks or settings, which do not align with diverse real-wor… ▽ More

    Submitted 12 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  9. arXiv:2405.15734  [pdf, other]

    cs.CV

    LM4LV: A Frozen Large Language Model for Low-level Vision Tasks

    Authors: Boyang Zheng, Jinjin Gu, Shijun Li, Chao Dong

    Abstract: The success of large language models (LLMs) has fostered a new research trend of multi-modality large language models (MLLMs), which changes the paradigm of various fields in computer vision. Though MLLMs have shown promising results in numerous high-level vision and vision-language tasks such as VQA and text-to-image, no works have demonstrated how low-level vision tasks can benefit from MLLMs. W… ▽ More

    Submitted 11 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  10. arXiv:2405.11281  [pdf, other]

    cs.DC cs.AI

    Cooperative Cognitive Dynamic System in UAV Swarms: Reconfigurable Mechanism and Framework

    Authors: Ziye Jia, Jiahao You, Chao Dong, Qihui Wu, Fuhui Zhou, Dusit Niyato, Zhu Han

    Abstract: As the demands for immediate and effective responses increase in both civilian and military domains, the unmanned aerial vehicle (UAV) swarms emerge as effective solutions, in which multiple cooperative UAVs can work together to achieve specific goals. However, how to manage such complex systems to ensure real-time adaptability lack sufficient researches. Hence, in this paper, we propose the coope… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  11. arXiv:2405.04525  [pdf, other]

    cs.GT

    Comparing Ways of Obtaining Candidate Orderings from Approval Ballots

    Authors: Théo Delemazure, Chris Dong, Dominik Peters, Magdaléna Tydrichová

    Abstract: To understand and summarize approval preferences and other binary evaluation data, it is useful to order the items on an axis which explains the data. In a political election using approval voting, this could be an ideological left-right axis such that each voter approves adjacent candidates, an analogue of single-peakedness. In a perfect axis, every approval set would be an interval, which is usu… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 43 pages including appendix, accepted to IJCAI 2024

  12. arXiv:2405.02335  [pdf, other]

    cs.IT cs.LG

    sDAC -- Semantic Digital Analog Converter for Semantic Communications

    Authors: Zhicheng Bao, Chen Dong, Xiaodong Xu

    Abstract: In this paper, we propose a novel semantic digital analog converter (sDAC) for the compatibility of semantic communications and digital communications. Most of the current semantic communication systems are based on the analog modulations, ignoring their incorporation with digital communication systems, which are more common in practice. In fact, quantization methods in traditional communication s… ▽ More

    Submitted 26 April, 2024; originally announced May 2024.

  13. arXiv:2404.19500  [pdf, other]

    cs.CV cs.AI cs.MM eess.IV

    Towards Real-world Video Face Restoration: A New Benchmark

    Authors: Ziyan Chen, Jingwen He, Xinqi Lin, Yu Qiao, Chao Dong

    Abstract: Blind face restoration (BFR) on images has significantly progressed over the last several years, while real-world video face restoration (VFR), which is more challenging for more complex face motions such as moving gaze directions and facial orientations involved, remains unsolved. Typical BFR methods are evaluated on privately synthesized datasets or self-collected real-world low-quality face ima… ▽ More

    Submitted 4 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: Project page: https://ziyannchen.github.io/projects/VFRxBenchmark/

  14. arXiv:2404.14607  [pdf, other]

    cs.CL

    Q-Tuning: Queue-based Prompt Tuning for Lifelong Few-shot Language Learning

    Authors: Yanhui Guo, Shaoyuan Xu, Jinmiao Fu, Jia Liu, Chaosheng Dong, Bryan Wang

    Abstract: This paper introduces \textbf{Q-tuning}, a novel approach for continual prompt tuning that enables the lifelong learning of a pre-trained language model. When learning a new task, Q-tuning trains a task-specific prompt by adding it to a prompt queue consisting of the prompts from older tasks. To better transfer the knowledge of old tasks, we design an adaptive knowledge aggregation technique that… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 findings

  15. arXiv:2403.16248  [pdf, other]

    cs.CL

    Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling

    Authors: Yida Mu, Chun Dong, Kalina Bontcheva, Xingyi Song

    Abstract: Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents. However, classic topic modelling approaches (e.g., LDA) have certain drawbacks, such as the lack of semantic understanding and the presence of overlapping topics. In this work, we investigate the untapped potential of large language mode… ▽ More

    Submitted 26 March, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  16. arXiv:2403.15130  [pdf, ps, other]

    cs.IT eess.SP

    Coexisting Passive RIS and Active Relay Assisted NOMA Systems

    Authors: Ao Huang, Li Guo, Xidong Mu, Chao Dong, Yuanwei Liu

    Abstract: A novel coexisting passive reconfigurable intelligent surface (RIS) and active decode-and-forward (DF) relay assisted non-orthogonal multiple access (NOMA) transmission framework is proposed. In particular, two communication protocols are conceived, namely Hybrid NOMA (H-NOMA) and Full NOMA (F-NOMA). Based on the proposed two protocols, both the sum rate maximization and max-min rate fairness prob… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

  17. arXiv:2403.12363  [pdf, other]

    cs.CR cs.NI

    E-DoH: Elegantly Detecting the Depths of Open DoH Service on the Internet

    Authors: Cong Dong, Jiahai Yang, Yun Li, Yue Wu, Yufan Chen, Chenglong Li, Haoran Jiao, Xia Yin, Yuling Liu

    Abstract: In recent years, DNS over Encrypted (DoE) methods have been regarded as a novel trend within the realm of the DNS ecosystem. In these DoE methods, DNS over HTTPS (DoH) provides encryption to protect data confidentiality while providing better obfuscation to avoid censorship by multiplexing port 443 with web services. This development introduced certain inconveniences in discovering publicly availa… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  18. arXiv:2403.05937  [pdf, other]

    cs.CV eess.IV

    Wavelet-Like Transform-Based Technology in Response to the Call for Proposals on Neural Network-Based Image Coding

    Authors: Cunhui Dong, Haichuan Ma, Haotian Zhang, Changsheng Gao, Li Li, Dong Liu

    Abstract: Neural network-based image coding has been developing rapidly since its birth. Until 2022, its performance has surpassed that of the best-performing traditional image coding framework -- H.266/VVC. Witnessing such success, the IEEE 1857.11 working subgroup initializes a neural network-based image coding standard project and issues a corresponding call for proposals (CfP). In response to the CfP, t… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

  19. arXiv:2403.01497  [pdf, other]

    cs.CV

    Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement

    Authors: Chen Zhao, Chenyu Dong, Weiling Cai

    Abstract: Underwater visuals undergo various complex degradations, inevitably influencing the efficiency of underwater vision tasks. Recently, diffusion models were employed to underwater image enhancement (UIE) tasks, and gained SOTA performance. However, these methods fail to consider the physical properties and underwater imaging mechanisms in the diffusion process, limiting information completion capaci… ▽ More

    Submitted 22 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  20. arXiv:2402.07246  [pdf, other]

    cs.LG

    Towards Generalized Inverse Reinforcement Learning

    Authors: Chaosheng Dong, Yijia Wang

    Abstract: This paper studies generalized inverse reinforcement learning (GIRL) in Markov decision processes (MDPs), that is, the problem of learning the basic components of an MDP given observed behavior (policy) that might not be optimal. These components include not only the reward function and transition probability matrices, but also the action space and state space that are not exactly known but are kn… ▽ More

    Submitted 11 February, 2024; originally announced February 2024.

  21. arXiv:2401.13627  [pdf, other]

    cs.CV

    Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

    Authors: Fanghua Yu, Jinjin Gu, Zheyuan Li, Jinfan Hu, Xiangtao Kong, Xintao Wang, Jingwen He, Yu Qiao, Chao Dong

    Abstract: We introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up. Leveraging multi-modal techniques and advanced generative prior, SUPIR marks a significant advance in intelligent and realistic image restoration. As a pivotal catalyst within SUPIR, model scaling dramatically enhances its capabilities and… ▽ More

    Submitted 3 April, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted by CVPR 2024

  22. arXiv:2401.03379  [pdf, other]

    cs.CV

    Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy

    Authors: Xiangtao Kong, Chao Dong, Lei Zhang

    Abstract: While single task image restoration (IR) has achieved significant successes, it remains a challenging issue to train a single model which can tackle multiple IR tasks. In this work, we investigate in-depth the multiple-in-one (MiO) IR problem, which comprises seven popular IR tasks. We point out that MiO IR faces two pivotal challenges: the optimization of diverse objectives and the adaptation to… ▽ More

    Submitted 20 March, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

  23. arXiv:2312.16057  [pdf, other]

    cs.IT eess.SP

    Semantic Importance-Aware Based for Multi-User Communication Over MIMO Fading Channels

    Authors: Haotai Liang, Zhicheng Bao, Wannian An, Chen Dong, Xiaodong Xu

    Abstract: Semantic communication, as a novel communication paradigm, has attracted the interest of many scholars, with multi-user, multi-input multi-output (MIMO) scenarios being one of the critical contexts. This paper presents a semantic importance-aware based communication system (SIA-SC) over MIMO Rayleigh fading channels. Combining the semantic symbols' inequality and the equivalent subchannels of MIMO… ▽ More

    Submitted 26 December, 2023; originally announced December 2023.

  24. arXiv:2312.08962  [pdf, other]

    cs.CV

    Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models

    Authors: Zhiyuan You, Zheyuan Li, Jinjin Gu, Zhenfei Yin, Tianfan Xue, Chao Dong

    Abstract: We introduce a Depicted image Quality Assessment method (DepictQA), overcoming the constraints of traditional score-based methods. DepictQA allows for detailed, language-based, human-like evaluation of image quality by leveraging Multi-modal Large Language Models (MLLMs). Unlike conventional Image Quality Assessment (IQA) methods relying on scores, DepictQA interprets image content and distortions… ▽ More

    Submitted 14 July, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted to ECCV2024, Camera Ready Version

  25. arXiv:2312.08862  [pdf, other]

    cs.IT eess.SP

    Semantics-Division Duplexing: A Novel Full-Duplex Paradigm

    Authors: Kai Niu, Zijian Liang, Chao Dong, Jincheng Dai, Zhongwei Si, Ping Zhang

    Abstract: In-band full-duplex (IBFD) is a theoretically effective solution to increase the overall throughput for the future wireless communications system by enabling transmission and reception over the same time-frequency resources. However, reliable source reconstruction remains a great challenge in the practical IBFD systems due to the non-ideal elimination of the self-interference and the inherent limi… ▽ More

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: 9 pages, 5 figures, submitted to IEEE Wireless Communications Magazine

  26. arXiv:2312.08799  [pdf, ps, other]

    cs.GT econ.TH

    Refined Characterizations of Approval-based Committee Scoring Rules

    Authors: Chris Dong, Patrick Lederer

    Abstract: In approval-based committee (ABC) elections, the goal is to select a fixed-size subset of the candidates, a so-called committee, based on the voters' approval ballots over the candidates. One of the most popular classes of ABC voting rules are ABC scoring rules, which have recently been characterized by Lackner and Skowron (2021). However, this characterization relies on a model where the output i… ▽ More

    Submitted 4 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Appears at AAAI-24

  27. arXiv:2312.08798  [pdf, ps, other]

    cs.GT econ.TH

    Participation Incentives in Approval-Based Committee Elections

    Authors: Martin Bullinger, Chris Dong, Patrick Lederer, Clara Mehler

    Abstract: In approval-based committee (ABC) voting, the goal is to choose a subset of predefined size of the candidates based on the voters' approval preferences over the candidates. While this problem has attracted significant attention in recent years, the incentives for voters to participate in an election for a given ABC voting rule have been neglected so far. This paper is thus the first to explicitly… ▽ More

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Appears at AAAI-24

  28. arXiv:2312.06739  [pdf, other]

    cs.CV

    SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models

    Authors: Yuzhou Huang, Liangbin Xie, Xintao Wang, Ziyang Yuan, Xiaodong Cun, Yixiao Ge, Jiantao Zhou, Chao Dong, Rui Huang, Ruimao Zhang, Ying Shan

    Abstract: Current instruction-based editing methods, such as InstructPix2Pix, often fail to produce satisfactory results in complex scenarios due to their dependence on the simple CLIP text encoder in diffusion models. To rectify this, this paper introduces SmartEdit, a novel approach to instruction-based image editing that leverages Multimodal Large Language Models (MLLMs) to enhance their understanding an… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Project page: https://yuzhou914.github.io/SmartEdit/

  29. arXiv:2312.05651  [pdf, other]

    math.PR cs.DS

    Set-valued recursions arising from vantage-point trees

    Authors: Congzao Dong, Alexander Marynych, Ilya Molchanov

    Abstract: We study vantage-point trees constructed using an independent sample from the uniform distribution on a fixed convex body $K$ in $(\mathbb{R}^d,\|\cdot\|)$, where $\|\cdot\|$ is an arbitrary homogeneous norm on $\mathbb{R}^d$. We prove that a sequence of sets, associated with the left boundary of a vantage-point tree, forms a recurrent Harris chain on the space of convex bodies in… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 15 pages

    MSC Class: Primary: 60D05; Secondary: 60C05; 60J05; 68P10

  30. arXiv:2311.17696  [pdf]

    cs.CL

    How to Build an AI Tutor that Can Adapt to Any Course and Provide Accurate Answers Using Large Language Model and Retrieval-Augmented Generation

    Authors: Chenxi Dong

    Abstract: This paper proposes a low-code solution to build an AI tutor that leverages advanced AI techniques to provide accurate and contextually relevant responses in a personalized learning environment. The OpenAI Assistants API allows AI Tutor to easily embed, store, retrieve, and manage files and chat history, enabling a low-code solution. Large Language Models (LLMs) and Retrieval-Augmented Generation… ▽ More

    Submitted 21 June, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: 7 pages, 5 figures

  31. arXiv:2311.16845  [pdf, other]

    cs.CV

    Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration

    Authors: Chen Zhao, Weiling Cai, Chenyu Dong, Chengwei Hu

    Abstract: Underwater images are subject to intricate and diverse degradation, inevitably affecting the effectiveness of underwater visual tasks. However, most approaches primarily operate in the raw pixel space of images, which limits the exploration of the frequency characteristics of underwater images, leading to an inadequate utilization of deep models' representational capabilities in producing high-qua… ▽ More

    Submitted 28 November, 2023; originally announced November 2023.

  32. arXiv:2311.16844  [pdf, other]

    cs.CR cs.FL cs.LO

    A Direct Lazy Sampling Proof Technique in Probabilistic Relational Hoare Logic

    Authors: Roberto Metere, Changyu Dong

    Abstract: Programs using random values can either make all choices in advance (eagerly) or sample as needed (lazily). In formal proofs, we focus on indistinguishability between two lazy programs, a common requirement in the random oracle model (ROM). While rearranging sampling instructions often solves this, it gets complex when sampling is spread across procedures. The traditional approach, introduced by B… ▽ More

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 12 pages, 13 figures, 1 table

  33. arXiv:2311.15683  [pdf]

    eess.AS cs.SD eess.SP

    Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency

    Authors: Chenyu Tang, Muzi Xu, Wentian Yi, Zibo Zhang, Edoardo Occhipinti, Chaoqun Dong, Dafydd Ravenscroft, Sung-Min Jung, Sanghyo Lee, Shuo Gao, Jong Min Kim, Luigi G. Occhipinti

    Abstract: Our research presents a wearable Silent Speech Interface (SSI) technology that excels in device comfort, time-energy efficiency, and speech decoding accuracy for real-world use. We developed a biocompatible, durable textile choker with an embedded graphene-based strain sensor, capable of accurately detecting subtle throat movements. This sensor, surpassing other strain sensors in sensitivity by 42… ▽ More

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 5 figures in the article; 11 figures and 4 tables in supplementary information

    Journal ref: npj Flexible Electronics (2024)

  34. arXiv:2311.15593  [pdf, other]

    cs.IT cs.PF eess.SP

    Performance Analysis of MDMA-Based Cooperative MRC Networks with Relays in Dissimilar Rayleigh Fading Channels

    Authors: Lei Teng, Wannian An, Chen Dong, Xiaoqi Qin, Xiaodong Xu

    Abstract: Multiple access technology is a key technology in various generations of wireless communication systems. As a potential multiple access technology for the next generation wireless communication systems, model division multiple access (MDMA) technology improves spectrum efficiency and feasibility regions. This implies that the MDMA scheme can achieve greater performance gains compared to traditiona… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 6 pages, 4 figures, conference

  35. arXiv:2311.14900  [pdf, other]

    cs.CV cs.AI

    Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise

    Authors: Zhenning Shi, Haoshuai Zheng, Chen Xu, Changsheng Dong, Bin Pan, Xueshuo Xie, Along He, Tao Li, Huazhu Fu

    Abstract: Recently, research on denoising diffusion models has expanded its application to the field of image restoration. Traditional diffusion-based image restoration methods utilize degraded images as conditional input to effectively guide the reverse generation process, without modifying the original denoising diffusion process. However, since the degraded images already include low-frequency informatio… ▽ More

    Submitted 20 May, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  36. arXiv:2311.12059  [pdf, other]

    cs.CV cs.CR

    Towards Function Space Mesh Watermarking: Protecting the Copyright of Signed Distance Fields

    Authors: Xingyu Zhu, Guanhui Ye, Chengdong Dong, Xiapu Luo, Xuetao Wei

    Abstract: The signed distance field (SDF) represents 3D geometries in continuous function space. Due to its continuous nature, explicit 3D models (e.g., meshes) can be extracted from it at arbitrary resolution, which means losing the SDF is equivalent to losing the mesh. Recent research has shown meshes can also be extracted from SDF-enhanced neural radiance fields (NeRF). Such a signal raises an alarm that… ▽ More

    Submitted 18 November, 2023; originally announced November 2023.

  37. arXiv:2311.10492  [pdf, other]

    cs.CV

    A Relay System for Semantic Image Transmission based on Shared Feature Extraction and Hyperprior Entropy Compression

    Authors: Wannian An, Zhicheng Bao, Haotai Liang, Chen Dong, Xiaodong

    Abstract: Nowadays, the need for high-quality image reconstruction and restoration is more and more urgent. However, most image transmission systems may suffer from image quality degradation or transmission interruption in the face of interference such as channel noise and link fading. To solve this problem, a relay communication network for semantic image transmission based on shared feature extraction and… ▽ More

    Submitted 17 November, 2023; originally announced November 2023.

  38. arXiv:2311.09932  [pdf, other]

    cs.CY

    The Communication GSC System with Energy Harvesting Nodes aided by Opportunistic Routing

    Authors: Hanyu Liu, Lei Teng, Wannian An, Xiaoqi Qin, Chen Dong, Xiaodong Xu

    Abstract: In this paper, a cooperative communication network based on energy-harvesting (EH) decode-and-forward (DF) relays is proposed. For relay nodes, there is harvest-storage-use (HSU) structure in this system. And energy can be obtained from the surrounding environment through energy buffering. In order to improve the performance of the communication system, the opportunistic routing algorithm and the… ▽ More

    Submitted 16 November, 2023; originally announced November 2023.

  39. arXiv:2311.06968  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Physics-Informed Data Denoising for Real-Life Sensing Systems

    Authors: Xiyuan Zhang, Xiaohan Fu, Diyan Teng, Chengyu Dong, Keerthivasan Vijayakumar, Jiayun Zhang, Ranak Roy Chowdhury, Junsheng Han, Dezhi Hong, Rashmi Kulkarni, Jingbo Shang, Rajesh Gupta

    Abstract: Sensors measuring real-life physical processes are ubiquitous in today's interconnected world. These sensors inherently bear noise that often adversely affects performance and reliability of the systems they support. Classic filtering-based approaches introduce strong assumptions on the time or frequency characteristics of sensory measurements, while learning-based denoising approaches typically r… ▽ More

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: SenSys 2023

  40. arXiv:2311.06805  [pdf, other]

    cs.CL

    Tunable Soft Prompts are Messengers in Federated Learning

    Authors: Chenhe Dong, Yuexiang Xie, Bolin Ding, Ying Shen, Yaliang Li

    Abstract: Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources, alleviating privacy concerns that arise from directly sharing local data. However, the lack of model privacy protection in FL becomes an unneglectable challenge, especially when people want to federally finetune models based on a proprietary large language model.… ▽ More

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted by EMNLP-23

  41. arXiv:2311.04528  [pdf, other]

    cs.LG cs.IR

    Bandit Learning to Rank with Position-Based Click Models: Personalized and Equal Treatments

    Authors: Tianchen Zhou, Jia Liu, Yang Jiao, Chaosheng Dong, Yetian Chen, Yan Gao, Yi Sun

    Abstract: Online learning to rank (ONL2R) is a foundational problem for recommender systems and has received increasing attention in recent years. Among the existing approaches for ONL2R, a natural modeling architecture is the multi-armed bandit framework coupled with the position-based click model. However, developing efficient online learning policies for MAB-based ONL2R with position-based click models i… ▽ More

    Submitted 8 November, 2023; originally announced November 2023.

  42. arXiv:2310.19341  [pdf, other]

    cs.CL cs.AI

    Skywork: A More Open Bilingual Foundation Model

    Authors: Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu, Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, et al. (5 additional authors not shown)

    Abstract: In this technical report, we present Skywork-13B, a family of large language models (LLMs) trained on a corpus of over 3.2 trillion tokens drawn from both English and Chinese texts. This bilingual foundation model is the most extensively trained and openly published LLMs of comparable size to date. We introduce a two-stage training methodology using a segmented corpus, targeting general purpose tr… ▽ More

    Submitted 30 October, 2023; originally announced October 2023.

  43. arXiv:2310.11881  [pdf, other]

    cs.CV

    A Comparative Study of Image Restoration Networks for General Backbone Network Design

    Authors: Xiangyu Chen, Zheyuan Li, Yuandong Pu, Yihao Liu, Jiantao Zhou, Yu Qiao, Chao Dong

    Abstract: Despite the significant progress made by deep models in various image restoration tasks, existing image restoration networks still face challenges in terms of task generality. An intuitive manifestation is that networks which excel in certain tasks often fail to deliver satisfactory results in others. To illustrate this point, we select five representative networks and conduct a comparative study… ▽ More

    Submitted 16 July, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted to ECCV2024

  44. arXiv:2310.10513  [pdf, other]

    cs.CV eess.IV

    Unifying Image Processing as Visual Prompting Question Answering

    Authors: Yihao Liu, Xiangyu Chen, Xianzheng Ma, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

    Abstract: Image processing is a fundamental task in computer vision, which aims at enhancing image quality and extracting essential features for subsequent vision applications. Traditionally, task-specific models are developed for individual tasks and designing such models requires distinct expertise. Building upon the success of large language models (LLMs) in natural language processing (NLP), there is a…

    Submitted 20 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 16 pages, 12 figures

  45. arXiv:2310.09866  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Multi-Objective Learning

    Authors: Haibo Yang, Zhuqing Liu, Jia Liu, Chaosheng Dong, Michinari Momma

    Abstract: In recent years, multi-objective optimization (MOO) emerges as a foundational problem underpinning many multi-agent multi-task learning applications. However, existing algorithms in MOO literature remain limited to centralized learning settings, which do not satisfy the distributed nature and data privacy needs of such multi-agent multi-task learning applications. This motivates us to propose a ne…

    Submitted 8 January, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted in NeurIPS 2023

  46. arXiv:2310.07347  [pdf, other]

    cs.CL cs.AI cs.LG

    Fast-ELECTRA for Efficient Pre-training

    Authors: Chengyu Dong, Liyuan Liu, Hao Cheng, Jingbo Shang, Jianfeng Gao, Xiaodong Liu

    Abstract: ELECTRA pre-trains language models by detecting tokens in a sequence that have been replaced by an auxiliary model. Although ELECTRA offers a significant boost in efficiency, its potential is constrained by the training cost brought by the auxiliary model. Notably, this model, which is jointly trained with the main model, only serves to assist the training of the main model and is discarded post-t…

    Submitted 11 October, 2023; originally announced October 2023.

  47. arXiv:2310.03182  [pdf, other]

    cs.CV cs.CL cs.LG

    Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models

    Authors: An Yan, Yu Wang, Yiwu Zhong, Zexue He, Petros Karypis, Zihan Wang, Chengyu Dong, Amilcare Gentili, Chun-Nan Hsu, Jingbo Shang, Julian McAuley

    Abstract: Medical image classification is a critical problem for healthcare, with the potential to alleviate the workload of doctors and facilitate diagnoses of patients. However, two challenges arise when deploying deep learning models to real-world healthcare applications. First, neural models tend to learn spurious correlations instead of desired features, which could fall short when generalizing to new…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 18 pages, 12 figures

  48. arXiv:2309.11992  [pdf, other]

    eess.SP cs.NI

    UAV Swarm Deployment and Trajectory for 3D Area Coverage via Reinforcement Learning

    Authors: Jia He, Ziye Jia, Chao Dong, Junyu Liu, Qihui Wu, Jingxian Liu

    Abstract: Unmanned aerial vehicles (UAVs) are recognized as promising technologies for area coverage due to the flexibility and adaptability. However, the ability of a single UAV is limited, and as for the large-scale three-dimensional (3D) scenario, UAV swarms can establish seamless wireless communication services. Hence, in this work, we consider a scenario of UAV swarm deployment and trajectory to satisf…

    Submitted 21 September, 2023; originally announced September 2023.

  49. Symbol Detection for Coarsely Quantized OTFS

    Authors: Junwei He, Haochuan Zhang, Chao Dong, Huimin Zhu

    Abstract: This paper explicitly models a coarse and noisy quantization in a communication system empowered by orthogonal time frequency space (OTFS) for cost and power efficiency. We first point out, with coarse quantization, the effective channel is imbalanced and thus no longer able to circularly shift the transmitted symbols along the delay-Doppler domain. Meanwhile, the effective channel is non-isotropi…

    Submitted 20 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

  50. arXiv:2309.05929  [pdf]

    eess.IV cs.CV

    Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation

    Authors: Zhiqing Zhang, Guojia Fan, Tianyong Liu, Nan Li, Yuyang Liu, Ziyu Liu, Canwei Dong, Shoujun Zhou

    Abstract: Medical image segmentation is critical for diagnosing and treating spinal disorders. However, the presence of high noise, ambiguity, and uncertainty makes this task highly challenging. Factors such as unclear anatomical boundaries, inter-class similarities, and irrational annotations contribute to this challenge. Achieving both accurate and diverse segmentation templates is essential to support ra…

    Submitted 11 September, 2023; originally announced September 2023.