Showing 1–50 of 106 results for author: Lian, D

Searching in archive cs.
  1. arXiv:2407.06964  [pdf, other]

    cs.CV

    Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach

    Authors: Taolin Zhang, Jiawang Bai, Zhihe Lu, Dongze Lian, Genping Wang, Xinchao Wang, Shu-Tao Xia

    Abstract: Recent works on parameter-efficient transfer learning (PETL) show the potential to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters. However, since they usually insert new structures into the pre-trained model, entire intermediate features of that model are changed and thus need to be stored to be involved in back-propagation, resulting in…

    Submitted 14 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  2. arXiv:2407.06645  [pdf, other]

    cs.LG cs.CL

    Entropy Law: The Story Behind Data Compression and LLM Performance

    Authors: Mingjia Yin, Chuhan Wu, Yufei Wang, Hao Wang, Wei Guo, Yasheng Wang, Yong Liu, Ruiming Tang, Defu Lian, Enhong Chen

    Abstract: Data is the cornerstone of large language models (LLMs), but not all data is useful for model learning. Carefully selected data can better elicit the capabilities of LLMs with much less computational overhead. Most methods concentrate on evaluating the quality of individual samples in data selection, while the combinatorial effects among samples are neglected. Even if each sample is of perfect qua…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  3. arXiv:2407.03125  [pdf, other]

    cs.LG cs.AI

    Foundations and Frontiers of Graph Learning Theory

    Authors: Yu Huang, Min Zhou, Menglin Yang, Zhen Wang, Muhan Zhang, Jie Wang, Hong Xie, Hao Wang, Defu Lian, Enhong Chen

    Abstract: Recent advancements in graph learning have revolutionized the way to understand and analyze data with complex structures. Notably, Graph Neural Networks (GNNs), i.e. neural network architectures designed for learning graph representations, have become a popular paradigm. With these models being usually characterized by intuition-driven design or highly intricate components, placing them within the…

    Submitted 7 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: 35 pages, 273 references. GitHub link: https://github.com/minehly/awesome-paper-for-graph-learning-theory

  4. arXiv:2406.12251  [pdf, other]

    cs.CL cs.AI cs.LG

    Mitigate Negative Transfer with Similarity Heuristic Lifelong Prompt Tuning

    Authors: Chenyuan Wu, Gangwei Jiang, Defu Lian

    Abstract: Lifelong prompt tuning has significantly advanced parameter-efficient lifelong learning with its efficiency and minimal storage demands on various tasks. Our empirical studies, however, highlight certain transferability constraints in the current methodologies: a universal algorithm that guarantees consistent positive transfer across all tasks is currently unattainable, especially when dealing di…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  5. arXiv:2406.12227  [pdf, other]

    cs.AI

    Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector

    Authors: Gangwei Jiang, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei

    Abstract: Fine-tuning large language models (LLMs) can cause them to lose their general capabilities. However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper, we begin by examining this phenomenon by focusing on knowledge understanding and instruction following, with the latter identified as the main contributor to forgetting during fine-tuning. Consequently, we propose the…

    Submitted 24 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.12178  [pdf, other]

    cs.CV

    FCA-RAC: First Cycle Annotated Repetitive Action Counting

    Authors: Jiada Lu, WeiWei Zhou, Xiang Qian, Dongze Lian, Yanyu Xu, Weifeng Wang, Lina Cao, Shenghua Gao

    Abstract: Repetitive action counting quantifies the frequency of specific actions performed by individuals. However, existing action-counting datasets have limited action diversity, potentially hampering model performance on unseen actions. To address this issue, we propose a framework called First Cycle Annotated Repetitive Action Counting (FCA-RAC). This framework contains 4 parts: 1) a labeling technique…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.03085  [pdf, other]

    cs.LG cs.IR

    Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

    Authors: Tingjia Shen, Hao Wang, Jiaqing Zhang, Sirui Zhao, Liangyue Li, Zulong Chen, Defu Lian, Enhong Chen

    Abstract: Cross-Domain Sequential Recommendation (CDSR) aims to mine and transfer users' sequential preferences across different domains to alleviate the long-standing cold-start issue. Traditional CDSR models capture collaborative information through user and item modeling while overlooking valuable semantic information. Recently, Large Language Model (LLM) has demonstrated powerful semantic reasoning capa…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

    ACM Class: I.2.7

  8. arXiv:2406.01027  [pdf, other]

    cs.DB cs.LG

    PRICE: A Pretrained Model for Cross-Database Cardinality Estimation

    Authors: Tianjing Zeng, Junwei Lan, Jiahong Ma, Wenqing Wei, Rong Zhu, Pengfei Li, Bolin Ding, Defu Lian, Zhewei Wei, Jingren Zhou

    Abstract: Cardinality estimation (CardEst) is essential for optimizing query execution plans. Recent ML-based CardEst methods achieve high accuracy but face deployment challenges due to high preparation costs and lack of transferability across databases. In this paper, we propose PRICE, a PRetrained multI-table CardEst model, which addresses these limitations. PRICE takes low-level but transferable features…

    Submitted 3 June, 2024; originally announced June 2024.

  9. arXiv:2405.17795  [pdf, other]

    cs.IR

    Dataset Regeneration for Sequential Recommendation

    Authors: Mingjia Yin, Hao Wang, Wei Guo, Yong Liu, Suojuan Zhang, Sirui Zhao, Defu Lian, Enhong Chen

    Abstract: The sequential recommender (SR) system is a crucial component of modern recommender systems, as it aims to capture the evolving preferences of users. Significant efforts have been made to enhance the capabilities of SR systems. These methods typically follow the model-centric paradigm, which involves developing effective models based on fixed datasets. However, this approach often overlooks potent…

    Submitted 3 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  10. arXiv:2405.12473  [pdf, other]

    cs.IR cs.AI

    Learning Partially Aligned Item Representation for Cross-Domain Sequential Recommendation

    Authors: Mingjia Yin, Hao Wang, Wei Guo, Yong Liu, Zhi Li, Sirui Zhao, Defu Lian, Enhong Chen

    Abstract: Cross-domain sequential recommendation (CDSR) aims to uncover and transfer users' sequential preferences across multiple recommendation domains. While significant endeavors have been made, they primarily concentrated on developing advanced transfer modules and aligning user representations using self-supervised learning techniques. However, the problem of aligning item representations has received…

    Submitted 3 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  11. arXiv:2405.10596  [pdf, other]

    cs.IR

    CELA: Cost-Efficient Language Model Alignment for CTR Prediction

    Authors: Xingmei Wang, Weiwen Liu, Xiaolong Chen, Qi Liu, Xu Huang, Defu Lian, Xiangyang Li, Yasheng Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. The prevailing ID-based paradigm underperforms in cold-start scenarios due to the skewed distribution of feature frequency. Additionally, the utilization of a single modality fails to exploit the knowledge contained within textual features. Recent efforts have sought to mitigate these challenges by integrating P…

    Submitted 17 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 5 figures

    MSC Class: 68T07

  12. arXiv:2405.06510  [pdf, other]

    cs.AI

    UniDM: A Unified Framework for Data Manipulation with Large Language Models

    Authors: Yichen Qian, Yongyi He, Rong Zhu, Jintao Huang, Zhijian Ma, Haibin Wang, Yaohua Wang, Xiuyu Sun, Defu Lian, Bolin Ding, Jingren Zhou

    Abstract: Designing effective data manipulation methods is a long-standing problem in data lakes. Traditional methods, which rely on rules or machine learning models, require extensive human efforts on training data collection and tuning models. Recent methods apply Large Language Models (LLMs) to resolve multiple data manipulation tasks. They exhibit bright benefits in terms of performance but still requir…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: MLSys24

  13. arXiv:2404.18533  [pdf, other]

    cs.AI cs.HC

    Evaluating Concept-based Explanations of Language Models: A Study on Faithfulness and Readability

    Authors: Meng Li, Haoran Jin, Ruixuan Huang, Zhihao Xu, Defu Lian, Zijia Lin, Di Zhang, Xiting Wang

    Abstract: Despite the surprisingly high intelligence exhibited by Large Language Models (LLMs), we are somehow intimidated to fully deploy them into real-life applications considering their black-box nature. Concept-based explanations arise as a promising avenue for explaining what the LLMs have learned, making them more transparent to humans. However, current evaluations for concepts tend to be heuristic a…

    Submitted 29 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  14. arXiv:2404.16587  [pdf, other]

    cs.CL cs.AI

    Understanding Privacy Risks of Embeddings Induced by Large Language Models

    Authors: Zhihao Zhu, Ninglu Shao, Defu Lian, Chenwang Wu, Zheng Liu, Yi Yang, Enhong Chen

    Abstract: Large language models (LLMs) show early signs of artificial general intelligence but struggle with hallucinations. One promising solution to mitigate these hallucinations is to store external knowledge as embeddings, aiding LLMs in retrieval-augmented generation. However, such a solution risks compromising privacy, as recent studies experimentally showed that the original text can be partially rec…

    Submitted 25 April, 2024; originally announced April 2024.

  15. arXiv:2404.07456  [pdf, other]

    cs.AI cs.MA

    WESE: Weak Exploration to Strong Exploitation for LLM Agents

    Authors: Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

    Abstract: Recently, large language models (LLMs) have demonstrated remarkable potential as intelligent agents. However, existing research mainly focuses on enhancing the agent's reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the procedure of exploration and exploitation. When addressing complex tasks within open-world interactive envi…

    Submitted 10 April, 2024; originally announced April 2024.

  16. arXiv:2404.04232  [pdf, other]

    cs.CL

    Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

    Authors: Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao

    Abstract: Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark of MCTG is still lacking. We propose CompMCTG, a benchmark…

    Submitted 3 June, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted to ACL 2024 (Main); 32 pages

  17. arXiv:2404.00268  [pdf, other]

    cs.IR

    A Unified Framework for Adaptive Representation Enhancement and Inversed Learning in Cross-Domain Recommendation

    Authors: Luankang Zhang, Hao Wang, Suojuan Zhang, Mingjia Yin, Yongqiang Han, Jiaqing Zhang, Defu Lian, Enhong Chen

    Abstract: Cross-domain recommendation (CDR), aiming to extract and transfer knowledge across domains, has attracted wide attention for its efficacy in addressing data sparsity and cold-start problems. Despite significant advances in representation disentanglement to capture diverse user preferences, existing methods usually neglect representation enhancement and lack rigorous decoupling constraints, thereby…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by DASFAA 2024

  18. arXiv:2403.17603  [pdf, other]

    cs.IR

    END4Rec: Efficient Noise-Decoupling for Multi-Behavior Sequential Recommendation

    Authors: Yongqiang Han, Hao Wang, Kefan Wang, Likang Wu, Zhi Li, Wei Guo, Yong Liu, Defu Lian, Enhong Chen

    Abstract: In recommendation systems, users frequently engage in multiple types of behaviors, such as clicking, adding to a cart, and purchasing. However, with diversified behavior data, user behavior sequences will become very long in the short term, which brings challenges to the efficiency of the sequence recommendation model. Meanwhile, some behavior data will also bring inevitable noise to the modeling…

    Submitted 26 March, 2024; originally announced March 2024.

  19. arXiv:2403.09747  [pdf, other]

    cs.CL cs.AI

    Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

    Authors: Guanghua Li, Wensheng Lu, Wei Zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

    Abstract: The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from st…

    Submitted 13 March, 2024; originally announced March 2024.

  20. arXiv:2403.08840  [pdf, other]

    cs.CV cs.AI

    NoiseDiffusion: Correcting Noise for Image Interpolation with Diffusion Models beyond Spherical Linear Interpolation

    Authors: PengFei Zheng, Yonggang Zhang, Zhen Fang, Tongliang Liu, Defu Lian, Bo Han

    Abstract: Image interpolation based on diffusion models is promising in creating fresh and interesting images. Advanced interpolation methods mainly focus on spherical linear interpolation, where images are encoded into the noise space and then interpolated for denoising to images. However, existing methods face challenges in effectively interpolating natural images (not generated by diffusion models), ther…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: ICLR 2024

  21. arXiv:2402.18899  [pdf, other]

    cs.IR

    Aligning Language Models for Versatile Text-based Item Retrieval

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Mingqi Wu, Defu Lian, Xing Xie

    Abstract: This paper addresses the gap between general-purpose text embeddings and the specific demands of item retrieval tasks. We demonstrate the shortcomings of existing models in capturing the nuances necessary for zero-shot performance on item retrieval tasks. To overcome these limitations, we propose to generate an in-domain dataset from ten tasks tailored to unlocking models' representation ability for ite…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 4 pages, 1 figure, 4 tables

  22. arXiv:2402.16312  [pdf, other]

    cs.LG cs.AI

    Federated Contextual Cascading Bandits with Asynchronous Communication and Heterogeneous Users

    Authors: Hantao Yang, Xutong Liu, Zhiyong Wang, Hong Xie, John C. S. Lui, Defu Lian, Enhong Chen

    Abstract: We study the problem of federated contextual combinatorial cascading bandits, where $|\mathcal{U}|$ agents collaborate under the coordination of a central server to provide tailored recommendations to the $|\mathcal{U}|$ corresponding users. Existing works consider either a synchronous framework, necessitating full agent participation and global synchronization, or assume user homogeneity with ide…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI 2024

  23. arXiv:2402.14328  [pdf, other]

    cs.CL

    Understanding and Patching Compositional Reasoning in LLMs

    Authors: Zhaoyi Li, Gangwei Jiang, Hong Xie, Linqi Song, Defu Lian, Ying Wei

    Abstract: LLMs have marked a revolutionary shift, yet they falter when faced with compositional reasoning tasks. Our research embarks on a quest to uncover the root causes of compositional reasoning failures of LLMs, uncovering that most of them stem from the improperly generated or leveraged implicit reasoning results. Inspired by our empirical findings, we resort to Logit Lens and an intervention experimen…

    Submitted 6 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL'2024 Findings

  24. arXiv:2402.03216  [pdf, other]

    cs.CL cs.AI cs.LG

    BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

    Authors: Jianlv Chen, Shitao Xiao, Peitian Zhang, Kun Luo, Defu Lian, Zheng Liu

    Abstract: In this paper, we present a new embedding model, called M3-Embedding, which is distinguished for its versatility in Multi-Linguality, Multi-Functionality, and Multi-Granularity. It can support more than 100 working languages, leading to new state-of-the-art performances on multi-lingual and cross-lingual retrieval tasks. It can simultaneously perform the three common retrieval functionalities of e…

    Submitted 28 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  25. arXiv:2402.02716  [pdf, other]

    cs.AI cs.CL cs.LG

    Understanding the planning of LLM agents: A survey

    Authors: Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Hao Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

    Abstract: As Large Language Models (LLMs) have shown significant intelligence, the progress to leverage LLMs as planning modules of autonomous agents has attracted more attention. This survey provides the first systematic view of LLM-based agent planning, covering recent works aiming to improve planning ability. We provide a taxonomy of existing works on LLM-Agent planning, which can be categorized into Ta…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 9 pages, 2 tables, 2 figures

  26. arXiv:2401.12700  [pdf, other]

    cs.AI

    Securing Recommender System via Cooperative Training

    Authors: Qingyang Wang, Chenwang Wu, Defu Lian, Enhong Chen

    Abstract: Recommender systems are often susceptible to well-crafted fake profiles, leading to biased recommendations. Among existing defense methods, data-processing-based methods inevitably exclude normal samples, while model-based methods struggle to enjoy both generalization and robustness. To this end, we suggest integrating data processing and the robust model to propose a general framework, Triple Coo…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2210.13762

  27. arXiv:2312.11571  [pdf, other]

    cs.CR cs.AI cs.LG

    Model Stealing Attack against Recommender System

    Authors: Zhihao Zhu, Rui Fan, Chenwang Wu, Yi Yang, Defu Lian, Enhong Chen

    Abstract: Recent studies have demonstrated the vulnerability of recommender systems to data privacy attacks. However, research on the threat to model privacy in recommender systems, such as model stealing attacks, is still in its infancy. Some adversarial attacks have achieved model stealing attacks against recommender systems, to some extent, by collecting abundant training data of the target model (target…

    Submitted 26 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

  28. arXiv:2312.10943  [pdf, other]

    cs.LG cs.CR

    Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity

    Authors: Zhihao Zhu, Chenwang Wu, Rui Fan, Yi Yang, Defu Lian, Enhong Chen

    Abstract: Recent research demonstrates that GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions. However, they mainly focus on node classification tasks, neglecting the potential threats entailed within the domain of graph classification tasks. Furthermore, their practicality is questionable due to unreasonable assumptions,…

    Submitted 26 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

  29. arXiv:2312.08746  [pdf, other]

    cs.CV

    DreamDrone

    Authors: Hanyang Kong, Dongze Lian, Michael Bi Mi, Xinchao Wang

    Abstract: We introduce DreamDrone, an innovative method for generating unbounded flythrough scenes from textual prompts. Central to our method is a novel feature-correspondence-guidance diffusion process, which utilizes the strong correspondence of intermediate features in the diffusion model. Leveraging this guidance strategy, we further propose an advanced technique for editing the intermediate latent cod…

    Submitted 17 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 16 pages, 12 figures, project page: https://hyokong.github.io/dreamdrone-page/

  30. arXiv:2312.06683  [pdf, other]

    cs.IR

    AT4CTR: Auxiliary Match Tasks for Enhancing Click-Through Rate Prediction

    Authors: Qi Liu, Xuyang Hou, Defu Lian, Zhe Wang, Haoran Jin, Jia Cheng, Jun Lei

    Abstract: Click-through rate (CTR) prediction is a vital task in industrial recommendation systems. Most existing methods focus on the network architecture design of the CTR model for better accuracy and suffer from the data sparsity problem. Especially in industrial recommendation systems, the widely applied negative sample down-sampling technique due to resource limitation worsens the problem, resulting i…

    Submitted 18 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

  31. arXiv:2312.06226  [pdf, other]

    cs.CV cs.AI

    Invariant Representation via Decoupling Style and Spurious Features from Images

    Authors: Ruimeng Li, Yuanhao Pu, Zhaoyi Li, Hong Xie, Defu Lian

    Abstract: This paper considers the out-of-distribution (OOD) generalization problem under the setting that both style distribution shift and spurious features exist and domain labels are missing. This setting frequently arises in real-world applications and is overlooked because previous approaches mainly handle either of these two factors. The critical challenge is decoupling style and spurious features i…

    Submitted 1 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: 10 pages, 12 figures

    ACM Class: I.2.6; I.2.10

  32. RecExplainer: Aligning Large Language Models for Explaining Recommendation Models

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Xu Huang, Defu Lian, Xing Xie

    Abstract: Recommender systems are widely used in online services, with embedding-based models being particularly popular due to their expressiveness in representing complex signals. However, these models often function as a black box, making them less transparent and reliable for both users and developers. Recently, large language models (LLMs) have demonstrated remarkable intelligence in understanding, rea…

    Submitted 22 June, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 12 pages, 9 figures, 5 tables

  33. arXiv:2311.10764  [pdf, other]

    cs.IR cs.AI

    Deep Group Interest Modeling of Full Lifelong User Behaviors for CTR Prediction

    Authors: Qi Liu, Xuyang Hou, Haoran Jin, Jin Chen, Zhe Wang, Defu Lian, Tan Qu, Jia Cheng, Jun Lei

    Abstract: Extracting users' interests from their lifelong behavior sequence is crucial for predicting Click-Through Rate (CTR). Most current methods employ a two-stage process for efficiency: they first select historical behaviors related to the candidate item and then deduce the user's interest from this narrowed-down behavior sub-sequence. This two-stage paradigm, though effective, leads to information lo…

    Submitted 15 November, 2023; originally announced November 2023.

  34. arXiv:2311.06184  [pdf, other]

    cs.LG cs.AI

    Frequency-domain MLPs are More Effective Learners in Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Defu Lian, Ning An, Longbing Cao, Zhendong Niu

    Abstract: Time series forecasting has played a key role in different industrial domains, including finance, traffic, energy, and healthcare. While the existing literature has designed many sophisticated architectures based on RNNs, GNNs, or Transformers, another kind of approach based on multi-layer perceptrons (MLPs) has been proposed with simple structure, low complexity, and superior performance. However,…

    Submitted 10 November, 2023; originally announced November 2023.

  35. arXiv:2311.03427  [pdf, other]

    cs.CV

    TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

    Authors: Shuo Wang, Jing Li, Zibo Zhao, Dongze Lian, Binbin Huang, Xiaomei Wang, Zhengxin Li, Shenghua Gao

    Abstract: Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc. The key aspect of this problem is to learn representation effectively, as each subtask builds upon not only correlated but also distinct attributes. Inspired by visual-prompt tuning, we propose a Task-Specific Prompts Transformer, dubbed TSP-Transformer, for hol…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: WACV 2024

  36. APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation

    Authors: Mingjia Yin, Hao Wang, Xiang Xu, Likang Wu, Sirui Zhao, Wei Guo, Yong Liu, Ruiming Tang, Defu Lian, Enhong Chen

    Abstract: The sequential recommendation system has been widely studied for its promising effectiveness in capturing dynamic preferences buried in users' sequential behaviors. Despite the considerable achievements, existing methods usually focus on intra-sequence modeling while overlooking exploiting global collaborative information by inter-sequence modeling, resulting in inferior recommendation performance…

    Submitted 5 November, 2023; originally announced November 2023.

  37. arXiv:2310.13260  [pdf, other]

    cs.IR

    A Data-Centric Multi-Objective Learning Framework for Responsible Recommendation Systems

    Authors: Xu Huang, Jianxun Lian, Hao Wang, Defu Lian, Xing Xie

    Abstract: Recommendation systems effectively guide users in locating their desired information within extensive content repositories. Generally, a recommendation model is optimized to enhance accuracy metrics from a user utility standpoint, such as click-through rate or matching relevance. However, a responsible industrial recommendation system must address not only user utility (responsibility to users) bu…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 10 pages

  38. arXiv:2310.13024  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompt

    Authors: Gangwei Jiang, Caigao Jiang, Siqiao Xue, James Y. Zhang, Jun Zhou, Defu Lian, Ying Wei

    Abstract: Continual pre-training has been urgent for adapting a pre-trained model to a multitude of domains and tasks in the fast-evolving world. In practice, a continually pre-trained model is expected to demonstrate not only greater capacity when fine-tuned on pre-trained domains but also a non-decreasing performance on unseen ones. In this work, we first investigate such anytime fine-tuning effectiveness…

    Submitted 19 October, 2023; originally announced October 2023.

  39. arXiv:2310.05753  [pdf, other]

    cs.AI

    Large-Scale OD Matrix Estimation with A Deep Learning Method

    Authors: Zheli Xiong, Defu Lian, Enhong Chen, Gang Chen, Xiaomin Cheng

    Abstract: The estimation of origin-destination (OD) matrices is a crucial aspect of Intelligent Transport Systems (ITS). It involves adjusting an initial OD matrix by regressing the current observations like traffic counts of road sections (e.g., using least squares). However, the OD estimation problem lacks sufficient constraints and is mathematically underdetermined. To alleviate this problem, some resear…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 12 pages, 25 figures

    MSC Class: I.2.1

  40. arXiv:2309.17278  [pdf, other]

    cs.LG cs.CR cs.IR

    Toward Robust Recommendation via Real-time Vicinal Defense

    Authors: Yichang Xu, Chenwang Wu, Defu Lian

    Abstract: Recommender systems have been shown to be vulnerable to poisoning attacks, where malicious data is injected into the dataset to cause the recommender system to provide biased recommendations. To defend against such attacks, various robust learning methods have been proposed. However, most methods are model-specific or attack-specific, making them lack generality, while other methods, such as adver…

    Submitted 29 September, 2023; originally announced September 2023.

  41. arXiv:2309.14907  [pdf, other]

    cs.LG cs.AI

    Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias

    Authors: Zhihao Shi, Jie Wang, Fanghua Lu, Hanzhu Chen, Defu Lian, Zheng Wang, Jieping Ye, Feng Wu

    Abstract: Node representation learning on attributed graphs -- whose nodes are associated with rich attributes (e.g., texts and protein sequences) -- plays a crucial role in many important downstream tasks. To encode the attributes and graph structures simultaneously, recent studies integrate pre-trained models with graph neural networks (GNNs), where pre-trained models serve as node encoders (NEs) to encod…

    Submitted 26 September, 2023; originally announced September 2023.

  42. arXiv:2309.13625  [pdf, other]

    cs.CV cs.AI

    GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph

    Authors: Xin Li, Dongze Lian, Zhihe Lu, Jiawang Bai, Zhibo Chen, Xinchao Wang

    Abstract: Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning of vision-language models (VLMs) under the low-data regime, where only a few additional parameters are introduced to excavate the task-specific knowledge based on the general and powerful representation of VLMs. However, most adapter-style works face two limitations: (i) modeling task-specific knowledge w…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS 2023. The manuscript will be further revised based on the reviews

  43. arXiv:2309.07597  [pdf, other]

    cs.CL cs.AI cs.IR

    C-Pack: Packaged Resources To Advance General Chinese Embedding

    Authors: Shitao Xiao, Zheng Liu, Peitian Zhang, Niklas Muennighoff, Defu Lian, Jian-Yun Nie

    Abstract: We introduce C-Pack, a package of resources that significantly advance the field of general Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a comprehensive benchmark for Chinese text embeddings covering 6 tasks and 35 datasets. 2) C-MTP is a massive text embedding dataset curated from labeled and unlabeled Chinese corpora for training embedding models. 3) C-TEM is a fami…

    Submitted 11 May, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted by SIGIR 2024

  44. arXiv:2309.01453  [pdf, ps, other]

    cs.IR cs.AI

    Interactive Graph Convolutional Filtering

    Authors: Jin Zhang, Defu Lian, Hong Xie, Yawen Li, Enhong Chen

    Abstract: Interactive Recommender Systems (IRS) have been increasingly used in various domains, including personalized article recommendation, social media, and online advertising. However, IRS faces significant challenges in providing accurate recommendations under limited observations, especially in the context of interactive collaborative filtering. These problems are exacerbated by the cold start proble…

    Submitted 4 September, 2023; originally announced September 2023.

  45. arXiv:2308.16505  [pdf, other]

    cs.IR cs.AI

    Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations

    Authors: Xu Huang, Jianxun Lian, Yuxuan Lei, Jing Yao, Defu Lian, Xing Xie

    Abstract: Recommender models excel at providing domain-specific item recommendations by leveraging extensive user behavior data. Despite their ability to act as lightweight domain experts, they struggle to perform versatile tasks such as providing explanations and engaging in conversations. On the other hand, large language models (LLMs) represent a significant step towards artificial general intelligence,…

    Submitted 29 January, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: 18 pages, 17 figures, 7 tables

  46. arXiv:2308.14480  [pdf, other]

    cs.CV cs.MM

    Priority-Centric Human Motion Generation in Discrete Latent Space

    Authors: Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang

    Abstract: Text-to-motion generation is a formidable task, aiming to produce human motions that align with the input text while also adhering to human capabilities and physical laws. While there have been advancements in diffusion models, their application in discrete spaces remains underexplored. Current methods often overlook the varying significance of different motions, treating them uniformly. It is ess…

    Submitted 30 August, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  47. arXiv:2308.10524  [pdf, other]

    cs.CV cs.AI

    Dataset Quantization

    Authors: Daquan Zhou, Kai Wang, Jianyang Gu, Xiangyu Peng, Dongze Lian, Yifan Zhang, Yang You, Jiashi Feng

    Abstract: State-of-the-art deep neural networks are trained with large amounts (millions or even billions) of data. The expensive computation and memory costs make it difficult to train them on limited hardware resources, especially for recent popular large language models (LLM) and computer vision models (CV). Recent popular dataset distillation methods are thus developed, aiming to reduce the number of tr…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 9 pages

  48. arXiv:2308.08563  [pdf, other]

    cs.LG cs.AI cs.IR

    KMF: Knowledge-Aware Multi-Faceted Representation Learning for Zero-Shot Node Classification

    Authors: Likang Wu, Junji Jiang, Hongke Zhao, Hao Wang, Defu Lian, Mengdi Zhang, Enhong Chen

    Abstract: Recently, Zero-Shot Node Classification (ZNC) has been an emerging and crucial task in graph data analysis. This task aims to predict nodes from unseen classes which are unobserved in the training process. Existing work mainly utilizes Graph Neural Networks (GNNs) to associate features' prototypes and labels' semantics thus enabling knowledge transfer from seen to unseen classes. However, the mult…

    Submitted 14 August, 2023; originally announced August 2023.

  49. arXiv:2308.05996  [pdf, other]

    cs.AI

    Deep Task-specific Bottom Representation Network for Multi-Task Recommendation

    Authors: Qi Liu, Zhilong Zhou, Gangwei Jiang, Tiezheng Ge, Defu Lian

    Abstract: Neural-based multi-task learning (MTL) has gained significant improvement, and it has been successfully applied to recommendation system (RS). Recent deep MTL methods for RS (e.g. MMoE, PLE) focus on designing soft gating-based parameter-sharing networks that implicitly learn a generalized representation for each task. However, MTL methods may suffer from performance degeneration when dealing with…

    Submitted 17 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: CIKM'23

  50. arXiv:2307.16376  [pdf, other]

    cs.IR cs.AI cs.CL

    When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities

    Authors: Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang, Defu Lian, Enhong Chen

    Abstract: The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large language models has been dramatically improved, leading to human-like performances in understanding, language synthesizing, and common-sense reasoning, etc. Such a major leap-forward in general AI capacity will cha…

    Submitted 30 July, 2023; originally announced July 2023.