Showing 1–50 of 137 results for author: Ding, K

Searching in archive cs.
  1. arXiv:2407.11282  [pdf, other]

    cs.CL

    Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models

    Authors: Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang

    Abstract: Large Language Models (LLMs) are employed across various high-stakes domains, where the reliability of their outputs is crucial. One commonly used method to assess the reliability of LLMs' responses is uncertainty estimation, which gauges the likelihood of their answers being correct. While many studies focus on improving the accuracy of uncertainty estimations for LLMs, our research investigates…

    Submitted 16 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.03937  [pdf, other]

    cs.CL

    TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models

    Authors: Jiahuan Cao, Dezhi Peng, Peirong Zhang, Yongxin Shi, Yang Liu, Kai Ding, Lianwen Jin

    Abstract: Classical Chinese is a gateway to the rich heritage and wisdom of ancient China, yet its complexities pose formidable comprehension barriers for most modern people without specialized knowledge. While Large Language Models (LLMs) have shown remarkable capabilities in Natural Language Processing (NLP), they struggle with Classical Chinese Understanding (CCU), especially in data-demanding and knowle…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2406.18301  [pdf, other]

    eess.AS cs.CL cs.SD

    MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research

    Authors: Song Li, Yongbin You, Xuezhi Wang, Zhengkun Tian, Ke Ding, Guanglu Wan

    Abstract: Recently, multilingual artificial intelligence assistants, exemplified by ChatGPT, have gained immense popularity. As a crucial gateway to human-computer interaction, multilingual automatic speech recognition (ASR) has also garnered significant attention, as evidenced by systems like Whisper. However, the proprietary nature of the training data has impeded researchers' efforts to study multilingua…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by InterSpeech 2024

  4. arXiv:2406.15720  [pdf, other]

    cs.CL

    Scaling Laws for Fact Memorization of Large Language Models

    Authors: Xingyu Lu, Xiaonan Li, Qinyuan Cheng, Kai Ding, Xuanjing Huang, Xipeng Qiu

    Abstract: Fact knowledge memorization is crucial for Large Language Models (LLM) to generate factual and reliable responses. However, the behaviors of LLM fact memorization remain under-explored. In this paper, we analyze the scaling laws for LLM's fact knowledge and LLMs' behaviors of memorizing different types of facts. We find that LLMs' fact knowledge capacity has a linear and negative exponential law r…

    Submitted 21 June, 2024; originally announced June 2024.

  5. arXiv:2406.15523  [pdf, other]

    cs.LG stat.ML

    Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

    Authors: Yili Wang, Yixin Liu, Xu Shen, Chenyu Li, Kaize Ding, Rui Miao, Ying Wang, Shirui Pan, Xin Wang

    Abstract: To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating…

    Submitted 21 June, 2024; originally announced June 2024.

  6. arXiv:2406.12747  [pdf, other]

    cs.LG cs.AI

    TSI-Bench: Benchmarking Time Series Imputation

    Authors: Wenjie Du, Jun Wang, Linglong Qian, Yiyuan Yang, Fanxing Liu, Zepu Wang, Zina Ibrahim, Haoxin Liu, Zhiyuan Zhao, Yingjie Zhou, Wenjia Wang, Kaize Ding, Yuxuan Liang, B. Aditya Prakash, Qingsong Wen

    Abstract: Effective imputation is a crucial preprocessing step for time series analysis. Despite the development of numerous deep learning algorithms for time series imputation, the community lacks standardized and comprehensive benchmark platforms to effectively evaluate imputation performance across different settings. Moreover, although many deep learning forecasting algorithms have demonstrated excellen…

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.10952  [pdf, other]

    cs.CL

    Avoiding Copyright Infringement via Machine Unlearning

    Authors: Guangyao Dou, Zheyuan Liu, Qing Lyu, Kaize Ding, Eric Wong

    Abstract: Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. To address these issues, it is critical for model owners to be able to unlearn copyrighted content at various time steps. We explore the setting of sequential unlearning, where copyrighted content…

    Submitted 16 June, 2024; originally announced June 2024.

  8. arXiv:2406.09098  [pdf, other]

    cs.CL

    SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

    Authors: Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen

    Abstract: The burgeoning utilization of Large Language Models (LLMs) in scientific research necessitates advanced benchmarks capable of evaluating their understanding and application of scientific knowledge comprehensively. To address this need, we introduce the SciKnowEval benchmark, a novel framework that systematically evaluates LLMs across five progressive levels of scientific knowledge: studying extens…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 48 pages, 2 figures

  9. arXiv:2406.00115  [pdf, other]

    cs.PL

    Towards LLM-Powered Verilog RTL Assistant: Self-Verification and Self-Correction

    Authors: Hanxian Huang, Zhenghan Lin, Zixuan Wang, Xin Chen, Ke Ding, Jishen Zhao

    Abstract: We explore the use of Large Language Models (LLMs) to generate high-quality Register-Transfer Level (RTL) code with minimal human interference. The traditional RTL design workflow requires human experts to manually write high-quality RTL code, which is time-consuming and error-prone. With the help of emerging LLMs, developers can describe their requirements to LLMs which then generate correspondin…

    Submitted 31 May, 2024; originally announced June 2024.

  10. arXiv:2405.18790  [pdf, other]

    cs.CV cs.MM eess.IV

    Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics

    Authors: Zhangkai Ni, Yue Liu, Keyan Ding, Wenhan Yang, Hanli Wang, Shiqi Wang

    Abstract: Deep learning-based methods have significantly influenced the blind image quality assessment (BIQA) field, however, these methods often require training using large amounts of human rating data. In contrast, traditional knowledge-based methods are cost-effective for training but face challenges in effectively extracting features aligned with human visual perception. To bridge these gaps, we propos…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to IEEE Transactions on Multimedia 2024

  11. arXiv:2405.15234  [pdf, other]

    cs.CV cs.CR

    Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models

    Authors: Yimeng Zhang, Xin Chen, Jinghan Jia, Yihua Zhang, Chongyu Fan, Jiancheng Liu, Mingyi Hong, Ke Ding, Sijia Liu

    Abstract: Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but they also pose safety risks, such as the potential generation of harmful content and copyright violations. The techniques of machine unlearning, also known as concept erasing, have been developed to address these risks. However, these techniques remain vulnerable to adversarial prompt attacks, which can prompt…

    Submitted 14 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Codes are available at https://github.com/OPTML-Group/AdvUnlearn

  12. arXiv:2405.14743  [pdf, other]

    cs.LG cs.AI

    Iterative Causal Segmentation: Filling the Gap between Market Segmentation and Marketing Strategy

    Authors: Kaihua Ding, Jingsong Cui, Mohammad Soltani, Jing Jin

    Abstract: The field of causal Machine Learning (ML) has made significant strides in recent years. Notable breakthroughs include methods such as meta learners (arXiv:1706.03461v6) and heterogeneous doubly robust estimators (arXiv:2004.14497) introduced in the last five years. Despite these advancements, the field still faces challenges, particularly in managing tightly coupled systems where both the causal t…

    Submitted 23 May, 2024; originally announced May 2024.

  13. arXiv:2405.12244  [pdf]

    physics.soc-ph cs.LG

    Real-Time Go-Around Prediction: A case study of JFK airport

    Authors: Ke Liu, Kaijing Ding, Lu Dai, Mark Hansen, Kennis Chan, John Schade

    Abstract: In this paper, we employ the long-short-term memory model (LSTM) to predict the real-time go-around probability as an arrival flight is approaching JFK airport and within 10 nm of the landing runway threshold. We further develop methods to examine the causes to go-around occurrences both from a global view and an individual flight perspective. According to our results, in-trail spacing, and simult…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: https://www.icrat.org/

    Journal ref: International Conference on Research in Air Transportation (ICRAT2024)

  14. arXiv:2405.08293  [pdf, other]

    cs.LG

    Airport Delay Prediction with Temporal Fusion Transformers

    Authors: Ke Liu, Kaijing Ding, Xi Cheng, Jianan Chen, Siyuan Feng, Hui Lin, Jilin Song, Chen Zhu

    Abstract: Since flight delay hurts passengers, airlines, and airports, its prediction becomes crucial for the decision-making of all stakeholders in the aviation industry and thus has been attempted by various previous research. However, previous delay predictions are often categorical and at a highly aggregated level. To improve that, this study proposes to apply the novel Temporal Fusion Transformer model…

    Submitted 13 May, 2024; originally announced May 2024.

  15. arXiv:2405.04757  [pdf, other]

    eess.SY cs.GT

    Communication-efficient and Differentially-private Distributed Nash Equilibrium Seeking with Linear Convergence

    Authors: Xiaomeng Chen, Wei Huo, Kemi Ding, Subhrakanti Dey, Ling Shi

    Abstract: The distributed computation of a Nash equilibrium (NE) for non-cooperative games is gaining increased attention recently. Due to the nature of distributed systems, privacy and communication efficiency are two critical concerns. Traditional approaches often address these critical concerns in isolation. This work introduces a unified framework, named CDP-NES, designed to improve communication effici…

    Submitted 7 May, 2024; originally announced May 2024.

  16. arXiv:2405.03106  [pdf, other]

    eess.SY cs.GT

    Compression-based Privacy Preservation for Distributed Nash Equilibrium Seeking in Aggregative Games

    Authors: Wei Huo, Xiaomeng Chen, Kemi Ding, Subhrakanti Dey, Ling Shi

    Abstract: This paper explores distributed aggregative games in multi-agent systems. Current methods for finding distributed Nash equilibrium require players to send original messages to their neighbors, leading to communication burden and privacy issues. To jointly address these issues, we propose an algorithm that uses stochastic compression to save communication resources and conceal information through r…

    Submitted 5 May, 2024; originally announced May 2024.

  17. arXiv:2404.17642  [pdf, other]

    cs.CL cs.AI

    Empowering Large Language Models for Textual Data Augmentation

    Authors: Yichuan Li, Kaize Ding, Jianling Wang, Kyumin Lee

    Abstract: With the capabilities of understanding and executing natural language instructions, Large language models (LLMs) can potentially act as a powerful tool for textual data augmentation. However, the quality of augmented data depends heavily on the augmentation instructions provided, and the effectiveness can fluctuate across different downstream tasks. While manually crafting and selecting instructio…

    Submitted 26 April, 2024; originally announced April 2024.

  18. arXiv:2404.09438  [pdf, other]

    math.OC cs.LG stat.ML

    Developing Lagrangian-based Methods for Nonsmooth Nonconvex Optimization

    Authors: Nachuan Xiao, Kuangyu Ding, Xiaoyin Hu, Kim-Chuan Toh

    Abstract: In this paper, we consider the minimization of a nonsmooth nonconvex objective function $f(x)$ over a closed convex subset $\mathcal{X}$ of $\mathbb{R}^n$, with additional nonsmooth nonconvex constraints $c(x) = 0$. We develop a unified framework for developing Lagrangian-based methods, which takes a single-step update to the primal variables by some subgradient methods in each iteration. These su…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 30 pages, 4 figures

  19. arXiv:2404.08008  [pdf, other]

    cs.LG cs.CL cs.HC

    Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition

    Authors: Kehua Feng, Keyan Ding, Kede Ma, Zhihua Wang, Qiang Zhang, Huajun Chen

    Abstract: The past years have witnessed a proliferation of large language models (LLMs). Yet, automated and unbiased evaluation of LLMs is challenging due to the inaccuracy of standard metrics in reflecting human preferences and the inefficiency in sampling informative and diverse test examples. While human evaluation remains the gold standard, it is expensive and time-consuming, especially when dealing wit…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 32 pages, 6 figures

  20. arXiv:2404.07066  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

    Authors: Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang

    Abstract: Large language models (LLMs) have shown remarkable performances across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexities remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexities in different layers, introducing the idea of "Concept Depth" to suggest that more complex concepts are ty…

    Submitted 30 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages

  21. arXiv:2404.03634  [pdf, other]

    cs.RO cs.CV

    PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

    Authors: Kairui Ding, Boyuan Chen, Ruihai Wu, Yuyang Li, Zongzheng Zhang, Huan-ang Gao, Siqi Li, Guyue Zhou, Yixin Zhu, Hao Dong, Hao Zhao

    Abstract: Robotic manipulation with two-finger grippers is challenged by objects lacking distinct graspable features. Traditional pre-grasping methods, which typically involve repositioning objects or utilizing external aids like table edges, are limited in their adaptability across different object categories and environments. To overcome these limitations, we introduce PreAfford, a novel pre-grasping plan…

    Submitted 4 July, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Project Page: https://air-discover.github.io/PreAfford/

  22. arXiv:2404.00603  [pdf, other]

    cs.CV

    Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning

    Authors: Kun Ding, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, Chunhong Pan

    Abstract: We propose a generalized method for boosting the generalization ability of pre-trained vision-language models (VLMs) while fine-tuning on downstream few-shot tasks. The idea is realized by exploiting out-of-distribution (OOD) detection to predict whether a sample belongs to a base distribution or a novel distribution and then using the score generated by a dedicated competition based scoring funct…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted by AAAI2024

  23. arXiv:2403.11631  [pdf, other]

    cs.CV

    Compositional Kronecker Context Optimization for Vision-Language Models

    Authors: Kun Ding, Xiaohui Li, Qiang Yu, Ying Wang, Haojian Zhang, Shiming Xiang

    Abstract: Context Optimization (CoOp) has emerged as a simple yet effective technique for adapting CLIP-like vision-language models to downstream image recognition tasks. Nevertheless, learning compact context with satisfactory base-to-new, domain and cross-task generalization ability while adapting to new tasks is still a challenge. To tackle such a challenge, we propose a lightweight yet generalizable app…

    Submitted 18 March, 2024; originally announced March 2024.

  24. arXiv:2403.03348  [pdf, other]

    cs.CL cs.AI

    Learning to Maximize Mutual Information for Chain-of-Thought Distillation

    Authors: Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding

    Abstract: Knowledge distillation, the technique of transferring knowledge from large, complex models to smaller ones, marks a pivotal step towards efficient AI deployment. Distilling Step-by-Step (DSS), a novel method utilizing chain-of-thought (CoT) distillation, has demonstrated promise by imbuing smaller models with the superior reasoning capabilities of their larger counterparts. In DSS, the distilled m…

    Submitted 9 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to ACL 2024 Findings

  25. arXiv:2403.01680  [pdf, other]

    cs.CV

    Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection

    Authors: Jieren Deng, Haojian Zhang, Kun Ding, Jianhua Hu, Xingxuan Zhang, Yunkuan Wang

    Abstract: This paper presents Incremental Vision-Language Object Detection (IVLOD), a novel learning task designed to incrementally adapt pre-trained Vision-Language Object Detection Models (VLODMs) to various specialized domains, while simultaneously preserving their zero-shot generalization capabilities for the generalized domain. To address this new challenge, we present the Zero-interference Reparameter…

    Submitted 22 May, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  26. arXiv:2402.18041  [pdf, other]

    cs.CL cs.AI

    Datasets for Large Language Models: A Comprehensive Survey

    Authors: Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin

    Abstract: This paper embarks on an exploration into the Large Language Model (LLM) datasets, which play a crucial role in the remarkable advancements of LLMs. The datasets serve as the foundational infrastructure analogous to a root system that sustains and nurtures the development of LLMs. Consequently, examination of these datasets emerges as a critical topic in research. In order to address the current l…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 181 pages, 21 figures

  27. arXiv:2402.16699  [pdf, other]

    cs.RO

    SwarmPRM: Probabilistic Roadmap Motion Planning for Large-Scale Swarm Robotic Systems

    Authors: Yunze Hu, Xuru Yang, Kangjie Zhou, Qinghang Liu, Kang Ding, Han Gao, Pingping Zhu, Chang Liu

    Abstract: Large-scale swarm robotic systems consisting of numerous cooperative agents show considerable promise for performing autonomous tasks across various sectors. Nonetheless, traditional motion planning approaches often face a trade-off between scalability and solution quality due to the exponential growth of the joint state space of robots. In response, this work proposes SwarmPRM, a hierarchical, sc…

    Submitted 24 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Submitted to IROS 2024

  28. arXiv:2402.16690  [pdf, other]

    cs.RO

    Risk-Aware Non-Myopic Motion Planner for Large-Scale Robotic Swarm Using CVaR Constraints

    Authors: Xuru Yang, Yunze Hu, Han Gao, Kang Ding, Zhaoyang Li, Pingping Zhu, Ying Sun, Chang Liu

    Abstract: Swarm robotics has garnered significant attention due to its ability to accomplish elaborate and synchronized tasks. Existing methodologies for motion planning of swarm robotic systems mainly encounter difficulties in scalability and safety guarantee. To address these limitations, we propose a Risk-aware swarm mOtion planner using conditional ValuE at Risk (ROVER) that systematically navigates lar…

    Submitted 15 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures

  29. arXiv:2402.14099  [pdf, other]

    eess.IV cs.CV physics.med-ph

    EXACT-Net:EHR-guided lung tumor auto-segmentation for non-small cell lung cancer radiotherapy

    Authors: Hamed Hooshangnejad, Xue Feng, Gaofeng Huang, Rui Zhang, Quan Chen, Kai Ding

    Abstract: Lung cancer is a devastating disease with the highest mortality rate among cancer types. Over 60% of non-small cell lung cancer (NSCLC) patients, which accounts for 87% of diagnoses, require radiation therapy. Rapid treatment initiation significantly increases the patient's survival rate and reduces the mortality rate. Accurate tumor segmentation is a critical step in the diagnosis and treatment o…

    Submitted 21 February, 2024; originally announced February 2024.

  30. arXiv:2402.11153  [pdf, other]

    cs.LG

    Beyond Generalization: A Survey of Out-Of-Distribution Adaptation on Graphs

    Authors: Shuhan Liu, Kaize Ding

    Abstract: Distribution shifts on graphs -- the data distribution discrepancies between training and testing a graph machine learning model, are often ubiquitous and unavoidable in real-world scenarios. Such shifts may severely deteriorate the performance of the model, posing significant challenges for reliable graph machine learning. Consequently, there has been a surge in research on graph Out-Of-Distribut…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: under review

  31. arXiv:2401.14656  [pdf, other]

    cs.CL

    Scientific Large Language Models: A Survey on Biological & Chemical Domains

    Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

    Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent o…

    Submitted 26 January, 2024; originally announced January 2024.

  32. arXiv:2401.13210  [pdf, other]

    cs.LG cs.SI

    Multitask Active Learning for Graph Anomaly Detection

    Authors: Wenjing Chang, Kay Liu, Kaize Ding, Philip S. Yu, Jianjun Yu

    Abstract: In the web era, graph machine learning has been widely used on ubiquitous graph-structured data. As a pivotal component for bolstering web security and enhancing the robustness of graph-based applications, the significance of graph anomaly detection is continually increasing. While Graph Neural Networks (GNNs) have demonstrated efficacy in supervised and semi-supervised graph anomaly detection, th…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Preprint. Under review. Code available at https://github.com/AhaChang/MITIGATE

  33. arXiv:2401.08107  [pdf, other]

    cs.CV cs.MM

    Deep Shape-Texture Statistics for Completely Blind Image Quality Evaluation

    Authors: Yixuan Li, Peilin Chen, Hanwei Zhu, Keyan Ding, Leida Li, Shiqi Wang

    Abstract: Opinion-Unaware Blind Image Quality Assessment (OU-BIQA) models aim to predict image quality without training on reference images and subjective quality scores. Thereinto, image statistical comparison is a classic paradigm, while the performance is limited by the representation ability of visual descriptors. Deep features as visual descriptors have advanced IQA in recent research, but they are dis…

    Submitted 15 January, 2024; originally announced January 2024.

  34. arXiv:2401.05425  [pdf, other]

    eess.SP cs.LG

    An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

    Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

    Abstract: Epilepsy is one of the most common neurological diseases globally, affecting around 50 million people worldwide. Fortunately, up to 70 percent of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scal…

    Submitted 1 January, 2024; originally announced January 2024.

  35. arXiv:2401.03163  [pdf, other]

    cs.LG

    An Empirical Investigation of Value-Based Multi-objective Reinforcement Learning for Stochastic Environments

    Authors: Kewen Ding, Peter Vamplew, Cameron Foale, Richard Dazeley

    Abstract: One common approach to solve multi-objective reinforcement learning (MORL) problems is to extend conventional Q-learning by using vector Q-values in combination with a utility function. However issues can arise with this approach in the context of stochastic environments, particularly when optimising for the Scalarised Expected Reward (SER) criterion. This paper extends prior research, providing a…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2211.08669

  36. arXiv:2401.02458  [pdf, other]

    cs.LG cs.AI

    Data-Centric Foundation Models in Computational Healthcare: A Survey

    Authors: Yunkun Zhang, Jin Gao, Zheling Tan, Lingfeng Zhou, Kexin Ding, Mu Zhou, Shaoting Zhang, Dequan Wang

    Abstract: The advent of foundation models (FMs) as an emerging suite of AI techniques has struck a wave of opportunities in computational healthcare. The interactive nature of these models, guided by pre-training data and human instructions, has ignited a data-centric AI paradigm that emphasizes better data characterization, quality, and scale. In healthcare AI, obtaining and processing high-quality clinica…

    Submitted 4 January, 2024; originally announced January 2024.

  37. arXiv:2312.12587  [pdf, other]

    eess.SP cs.DC q-bio.TO

    Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression

    Authors: Neel R Vora, Amir Hajighasemi, Cody T. Reynolds, Amirmohammad Radmehr, Mohamed Mohamed, Jillur Rahman Saurav, Abdul Aziz, Jai Prakash Veerla, Mohammad S Nasr, Hayden Lotspeich, Partha Sai Guttikonda, Thuong Pham, Aarti Darji, Parisa Boodaghi Malidarreh, Helen H Shang, Jay Harvey, Kan Ding, Phuc Nguyen, Jacob M Luber

    Abstract: Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in clinical diagnosis, monitoring, and treatment of important brain disorder diseases. However, the real-time transmission of the significant corpus physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monit…

    Submitted 4 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  38. arXiv:2312.08192  [pdf, other]

    cs.CV

    PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images

    Authors: Tao Zhang, Kun Ding, Jinyong Wen, Yu Xiong, Zeyu Zhang, Shiming Xiang, Chunhong Pan

    Abstract: Self-supervised learning (SSL) for RGB images has achieved significant success, yet there is still limited research on SSL for infrared images, primarily due to three prominent challenges: 1) the lack of a suitable large-scale infrared pre-training dataset, 2) the distinctiveness of non-iconic infrared images rendering common pre-training tasks like masked image modeling (MIM) less effective, and…

    Submitted 13 December, 2023; originally announced December 2023.

  39. arXiv:2312.02694  [pdf, other]

    cs.CV

    UPOCR: Towards Unified Pixel-Level OCR Interface

    Authors: Dezhi Peng, Zhenhua Yang, Jiaxin Zhang, Chongyu Liu, Yongxin Shi, Kai Ding, Fengjun Guo, Lianwen Jin

    Abstract: In recent years, the optical character recognition (OCR) field has been proliferating with plentiful cutting-edge approaches for a wide spectrum of tasks. However, these approaches are task-specifically designed with divergent paradigms, architectures, and training strategies, which significantly increases the complexity of research and maintenance and hinders the fast deployment in applications.…

    Submitted 5 December, 2023; originally announced December 2023.

  40. arXiv:2311.01166  [pdf, other]

    cs.CL cs.AI

    Generative Input: Towards Next-Generation Input Methods Paradigm

    Authors: Keyu Ding, Yongcan Wang, Zihang Xu, Zhenzhen Jia, Shijin Wang, Cong Liu, Enhong Chen

    Abstract: Since the release of ChatGPT, generative models have achieved tremendous success and become the de facto approach for various NLP tasks. However, its application in the field of input methods remains under-explored. Many neural network approaches have been applied to the construction of Chinese input method engines (IMEs). Previous research often assumed that the input pinyin was correct and focused…

    Submitted 2 November, 2023; originally announced November 2023.

  41. arXiv:2310.16520  [pdf, other]

    cs.LG

    Towards Self-Interpretable Graph-Level Anomaly Detection

    Authors: Yixin Liu, Kaize Ding, Qinghua Lu, Fuyi Li, Leo Yu Zhang, Shirui Pan

    Abstract: Graph-level anomaly detection (GLAD) aims to identify graphs that exhibit notable dissimilarity compared to the majority in a collection. However, current works primarily focus on evaluating graph-level abnormality while failing to provide meaningful explanations for the predictions, which largely limits their reliability and application scope. In this paper, we investigate a new challenging probl…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 23 pages; accepted to NeurIPS 2023

  42. arXiv:2310.15109  [pdf, other]

    cs.CL

    GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs

    Authors: Yichuan Li, Kaize Ding, Kyumin Lee

    Abstract: Self-supervised representation learning on text-attributed graphs, which aims to create expressive and generalizable representations for various downstream tasks, has received increasing research attention lately. However, existing methods either struggle to capture the full extent of structural context information or rely on task-specific training labels, which largely hampers their effectiveness…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  43. arXiv:2310.14170  [pdf, other]

    cs.LG

    Learning Invariant Molecular Representation in Latent Discrete Space

    Authors: Xiang Zhuang, Qiang Zhang, Keyan Ding, Yatao Bian, Xiao Wang, Jingsong Lv, Hongyang Chen, Huajun Chen

    Abstract: Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when data for training and testing originate from different environments. To address this issue, we propose a new framework for learning molecular representations that exhibit invariance and robustness against distribution shift…

    Submitted 22 October, 2023; originally announced October 2023.

  44. arXiv:2310.11868  [pdf, other]

    cs.CV

    To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now

    Authors: Yimeng Zhang, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, Sijia Liu

    Abstract: The recent advances in diffusion models (DMs) have revolutionized the generation of realistic and complex images. However, these models also introduce potential safety hazards, such as producing harmful content and infringing data copyrights. Despite the development of safety-driven unlearning techniques to counteract these challenges, doubts about their efficacy persist. To tackle this issue, we…

    Submitted 7 July, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted by ECCV'24. Codes are available at https://github.com/OPTML-Group/Diffusion-MU-Attack

  45. arXiv:2310.08858  [pdf, other]

    math.OC cs.AI cs.LG stat.ML

    Adam-family Methods with Decoupled Weight Decay in Deep Learning

    Authors: Kuangyu Ding, Nachuan Xiao, Kim-Chuan Toh

    Abstract: In this paper, we investigate the convergence properties of a wide class of Adam-family methods for minimizing quadratically regularized nonsmooth nonconvex optimization problems, especially in the context of training nonsmooth neural networks with weight decay. Motivated by the AdamW method, we propose a novel framework for Adam-family methods with decoupled weight decay. Within our framework, th…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: 26 pages

  46. arXiv:2310.03269  [pdf, other]

    q-bio.BM cs.CL

    InstructProtein: Aligning Human and Protein Language via Knowledge Instruction

    Authors: Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen

    Abstract: Large Language Models (LLMs) have revolutionized the field of natural language processing, but they fall short in comprehending biological sequences such as proteins. To address this challenge, we propose InstructProtein, an innovative LLM that possesses bidirectional generation capabilities in both human and protein languages: (i) taking a protein sequence as input to predict its textual function…

    Submitted 4 October, 2023; originally announced October 2023.

  47. arXiv:2310.01680  [pdf, other]

    cs.CV cs.AI

    Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation

    Authors: Zhangsihao Yang, Mengwei Ren, Kaize Ding, Guido Gerig, Yalin Wang

    Abstract: Pretraining CNN models (i.e., UNet) through self-supervision has become a powerful approach to facilitate medical image segmentation under low annotation regimes. Recent contrastive learning methods encourage similar global representations when the same image undergoes different transformations, or enforce invariance across different image/patch features that are intrinsically correlated. However,…

    Submitted 18 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Camera ready for NeurIPS 2023. Code available at https://github.com/zshyang/kaf.git

  48. arXiv:2309.09443  [pdf, other]

    eess.AS cs.CL cs.SD

    Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

    Authors: Song Li, Yongbin You, Xuezhi Wang, Ke Ding, Guanglu Wan

    Abstract: Multilingual intelligent assistants, such as ChatGPT, have recently gained popularity. To further expand the applications of multilingual artificial intelligence assistants and facilitate international communication, it is essential to enhance the performance of multilingual speech recognition, which is a crucial component of speech interaction. In this paper, we propose two simple and parameter-e…

    Submitted 19 September, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP2024

  49. arXiv:2309.07413  [pdf, other]

    cs.CL cs.SD eess.AS

    CPPF: A contextual and post-processing-free model for automatic speech recognition

    Authors: Lei Zhang, Zhengkun Tian, Xiang Chen, Jiaming Sun, Hongyu Xiang, Ke Ding, Guanglu Wan

    Abstract: ASR systems have become increasingly widespread in recent years. However, their textual outputs often require post-processing tasks before they can be practically utilized. To address this issue, we draw inspiration from the multifaceted capabilities of LLMs and Whisper, and focus on integrating multiple ASR text processing tasks related to speech recognition into the ASR model. This integration n…

    Submitted 20 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP2024

  50. arXiv:2308.16437  [pdf, other]

    cs.IR cs.LG

    AntM$^{2}$C: A Large Scale Dataset For Multi-Scenario Multi-Modal CTR Prediction

    Authors: Zhaoxin Huan, Ke Ding, Ang Li, Xiaolu Zhang, Xu Min, Yong He, Liang Zhang, Jun Zhou, Linjian Mo, Jinjie Gu, Zhongyi Liu, Wenliang Zhong, Guannan Zhang

    Abstract: Click-through rate (CTR) prediction is a crucial issue in recommendation systems. There has been an emergence of various public CTR datasets. However, existing datasets primarily suffer from the following limitations. Firstly, users generally click different types of items from multiple scenarios, and modeling from multiple scenarios can provide a more comprehensive understanding of users. Existin…

    Submitted 30 August, 2023; originally announced August 2023.