Skip to main content

Showing 1–50 of 121 results for author: Shi, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.06083  [pdf, other

    cs.LG cs.IR

    A Survey of Controllable Learning: Methods and Applications in Information Retrieval

    Authors: Chenglei Shen, Xiao Zhang, Teng Shi, Changshuo Zhang, Guofu Xie, Jun Xu

    Abstract: Controllable learning (CL) emerges as a critical component in trustworthy machine learning, ensuring that learners meet predefined targets and can adaptively adjust without retraining according to the changes in those targets. We provide a formal definition of CL, and discuss its applications in information retrieval (IR) where information needs are often complex and dynamic. The survey categorize… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2407.03332  [pdf, other

    cs.CV cs.AI

    DDPM-MoCo: Advancing Industrial Surface Defect Generation and Detection with Generative and Contrastive Learning

    Authors: Yangfan He, Xinyan Wang, Tianyu Shi

    Abstract: The task of industrial detection based on deep learning often involves solving two problems: (1) obtaining sufficient and effective data samples, (2) and using efficient and convenient model training methods. In this paper, we introduce a novel defect-generation method, named DDPM-MoCo, to address these issues. Firstly, we utilize the Denoising Diffusion Probabilistic Model (DDPM) to generate high… ▽ More

    Submitted 9 May, 2024; originally announced July 2024.

  3. arXiv:2407.01219  [pdf, other

    cs.CL

    Searching for Best Practices in Retrieval-Augmented Generation

    Authors: Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang

    Abstract: Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolong… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.17807  [pdf, other

    cs.CL cs.AI

    Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary

    Authors: Meiling Tao, Xuechen Liang, Yiling Tao, Tianyu Shi

    Abstract: Recent advancements in large language models (LLMs) have unlocked the potential for generating high-quality game commentary. However, producing insightful and engaging commentary for complex games with incomplete information remains a significant challenge. In this paper, we introduce a novel commentary method that combine Reinforcement Learning (RL) and LLMs, tailored specifically for the Chinese… ▽ More

    Submitted 4 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  5. arXiv:2406.16942  [pdf, other

    eess.IV cs.AI cs.CV

    Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images

    Authors: Yuanyuan Peng, Aidi Lin, Meng Wang, Tian Lin, Ke Zou, Yinglin Cheng, Tingkun Shi, Xulong Liao, Lixia Feng, Zhen Liang, Xinjian Chen, Huazhu Fu, Haoyu Chen

    Abstract: Inability to express the confidence level and detect unseen classes has limited the clinical implementation of artificial intelligence in the real-world. We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT). In the internal test set, FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RE… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: All codes are available at https://github.com/yuanyuanpeng0129/FMUE

  6. arXiv:2406.16062  [pdf, other

    cs.NE

    Towards Biologically Plausible Computing: A Comprehensive Comparison

    Authors: Changze Lv, Yufei Gu, Zhengkang Guo, Zhibo Xu, Yixin Wu, Feiran Zhang, Tianyuan Shi, Zhenghua Wang, Ruicheng Yin, Yu Shang, Siqi Zhong, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Jianhao Zhu, Cenyuan Zhang, Zixuan Ling, Xiaoqing Zheng

    Abstract: Backpropagation is a cornerstone algorithm in training neural networks for supervised learning, which uses a gradient descent method to update network weights by minimizing the discrepancy between actual and desired outputs. Despite its pivotal role in propelling deep learning advancements, the biological plausibility of backpropagation is questioned due to its requirements for weight symmetry, gl… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  7. arXiv:2406.09317  [pdf, other

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham , et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. To RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources… ▽ More

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.07590  [pdf, other

    cs.LG cs.AI

    StreamPrompt: Learnable Prompt-guided Data Selection for Efficient Stream Learning

    Authors: Tongjun Shi, Shuhao Zhang

    Abstract: Stream Learning (SL) requires models to rapidly adapt to continuous data streams, setting it apart from traditional Continual Learning (CL). Recent SL methods emphasize efficiency by selecting data subsets for training, but they often struggle due to their reliance on static, rule-based selection algorithms that cannot effectively adapt to the changing importance of data. In this work, we introduc… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  9. arXiv:2406.05628  [pdf, other

    cs.LG

    Domain Generalization Guided by Large-Scale Pre-Trained Priors

    Authors: Zongbin Wang, Bin Pan, Shiyu Shen, Tianyang Shi, Zhenwei Shi

    Abstract: Domain generalization (DG) aims to train a model from limited source domains, allowing it to generalize to unknown target domains. Typically, DG models only employ large-scale pre-trained models during the initialization of fine-tuning. However, large-scale pre-trained models already possess the ability to resist domain shift. If we reference pre-trained models continuously during fine-tuning to m… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  10. arXiv:2406.04828  [pdf, other

    cs.IR

    QAGCF: Graph Collaborative Filtering for Q&A Recommendation

    Authors: Changshuo Zhang, Teng Shi, Xiao Zhang, Yanping Zheng, Ruobing Xie, Qi Liu, Jun Xu, Ji-Rong Wen

    Abstract: Question and answer (Q&A) platforms usually recommend question-answer pairs to meet users' knowledge acquisition needs, unlike traditional recommendations that recommend only one item. This makes user behaviors more complex, and presents two challenges for Q&A recommendation, including: the collaborative information entanglement, which means user feedback is influenced by either the question or th… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  11. arXiv:2406.04712  [pdf, other

    cs.CL

    AICoderEval: Improving AI Domain Code Generation of Large Language Models

    Authors: Yinghui Xia, Yuyan Chen, Tianyu Shi, Jun Wang, Jinsong Yang

    Abstract: Automated code generation is a pivotal capability of large language models (LLMs). However, assessing this capability in real-world scenarios remains challenging. Previous methods focus more on low-level code generation, such as model loading, instead of generating high-level codes catering for real-world tasks, such as image-to-text, text classification, in various domains. Therefore, we construc… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  12. arXiv:2405.16847  [pdf, other

    cs.CV cs.AI

    TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction

    Authors: Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu

    Abstract: Autoregressive next-token prediction is a standard pretraining method for large-scale language models, but its application to vision tasks is hindered by the non-sequential nature of image data, leading to cumulative errors. Most vision models employ masked autoencoder (MAE) based pretraining, which faces scalability issues. To address these challenges, we introduce \textbf{TokenUnify}, a novel pr… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  13. arXiv:2405.16701  [pdf, other

    cs.CV

    Detail-Enhanced Intra- and Inter-modal Interaction for Audio-Visual Emotion Recognition

    Authors: Tong Shi, Xuri Ge, Joemon M. Jose, Nicolas Pugeault, Paul Henderson

    Abstract: Capturing complex temporal relationships between video and audio modalities is vital for Audio-Visual Emotion Recognition (AVER). However, existing methods lack attention to local details, such as facial state changes between video frames, which can reduce the discriminability of features and thus lower recognition accuracy. In this paper, we propose a Detail-Enhanced Intra- and Inter-modal Intera… ▽ More

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Submitted to 27th International Conference of Pattern Recognition (ICPR 2024)

  14. arXiv:2405.04135  [pdf, other

    cs.AI

    In-context Learning for Automated Driving Scenarios

    Authors: Ziqi Zhou, Jingyue Zhang, Jingyuan Zhang, Boyue Wang, Tianyu Shi, Alaa Khamis

    Abstract: One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively. This paper introduces an innovative approach utilizing Large Language Models (LLMs) to intuitively and effectively optimize RL reward functions in a human-centric way. We developed a framework where instructions and dynamic e… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 7 pages, 6 figures, 35 references

  15. arXiv:2405.02289  [pdf, other

    cs.RO

    TSDiT: Traffic Scene Diffusion Models With Transformers

    Authors: Chen Yang, Tianyu Shi

    Abstract: In this paper, we introduce a novel approach to trajectory generation for autonomous driving, combining the strengths of Diffusion models and Transformers. First, we use the historical trajectory data for efficient preprocessing and generate action latent using a diffusion model with DiT(Diffusion with Transformers) Blocks to increase scene diversity and stochasticity of agent actions. Then, we co… ▽ More

    Submitted 21 December, 2023; originally announced May 2024.

  16. arXiv:2405.01063  [pdf, other

    cs.IR cs.CY cs.LG

    Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

    Authors: Tianhao Shi, Yang Zhang, Jizhi Zhang, Fuli Feng, Xiangnan He

    Abstract: As recommender systems are indispensable in various domains such as job searching and e-commerce, providing equitable recommendations to users with different sensitive attributes becomes an imperative requirement. Prior approaches for enhancing fairness in recommender systems presume the availability of all sensitive attributes, which can be difficult to obtain due to privacy concerns or inadequat… ▽ More

    Submitted 27 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures, accepted by SIGIR'24

  17. arXiv:2404.12529  [pdf, other

    cs.NI cs.HC

    A Survey of Bluetooth Indoor Localization

    Authors: Taolei Shi, Wei Gong

    Abstract: Nowadays, indoor localization has received extensive research interest due to more and more applications' needs for location information to provide a more precise and effective service [1], [2]. There are various wireless techniques and mechanisms that have been proposed; some of them have been studied in depth and come into use, such as Wi-Fi, RFID, and sensor networks. In comparison, the develop… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 8 pages, 2 figures

  18. arXiv:2404.09520  [pdf, other

    cs.IR

    UniSAR: Modeling User Transition Behaviors between Search and Recommendation

    Authors: Teng Shi, Zihua Si, Jun Xu, Xiao Zhang, Xiaoxue Zang, Kai Zheng, Dewei Leng, Yanan Niu, Yang Song

    Abstract: Nowadays, many platforms provide users with both search and recommendation services as important tools for accessing information. The phenomenon has led to a correlation between user search and recommendation behaviors, providing an opportunity to model user interests in a fine-grained way. Existing approaches either model user search and recommendation behaviors separately or overlook the differe… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024

  19. arXiv:2404.08226  [pdf, other

    cs.CV

    Improving Continuous Sign Language Recognition with Adapted Image Models

    Authors: Lianyu Hu, Tongkai Shi, Liqing Gao, Zekang Liu, Wei Feng

    Abstract: The increase of web-scale weakly labelled image-text pairs have greatly facilitated the development of large-scale vision-language models (e.g., CLIP), which have shown impressive generalization performance over a series of downstream tasks. However, the massive model size and scarcity of available data limit their applications to fine-tune the whole model in downstream tasks. Besides, fully fine-… ▽ More

    Submitted 11 April, 2024; originally announced April 2024.

  20. arXiv:2404.05680  [pdf, other

    cs.CV

    SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

    Authors: Heyuan Li, Ce Chen, Tianhao Shi, Yuda Qiu, Sizhe An, Guanying Chen, Xiaoguang Han

    Abstract: While recent advances in 3D-aware Generative Adversarial Networks (GANs) have aided the development of near-frontal view human face synthesis, the challenge of comprehensively synthesizing a full 3D head viewable from all angles still persists. Although PanoHead proves the possibilities of using a large-scale dataset with images of both frontal and back views for full-head synthesis, it often caus… ▽ More

    Submitted 16 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted by ECCV 2024. Project page: https://lhyfst.github.io/spherehead

  21. arXiv:2404.02082  [pdf, other

    cs.CV

    WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

    Authors: Chen Yang, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian

    Abstract: In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (a.k.a., diffusion models) and transformers. Our proposed framework, termed the "World-Centric Diffusion Transformer" (WcDT), optimizes the entire trajectory generation process, from feature extraction to model inference. To enhance t… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 12 pages, 6 figures

  22. arXiv:2404.01663  [pdf, other

    cs.CL

    CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models

    Authors: Xuechen Liang, Meiling Tao, Tianyu Shi, Yiting Xie

    Abstract: Open large language models (LLMs) have significantly advanced the field of natural language processing, showcasing impressive performance across various tasks.Despite the significant advancements in LLMs, their effective operation still relies heavily on human input to accurately guide the dialogue flow, with agent tuning being a crucial optimization technique that involves human adjustments to th… ▽ More

    Submitted 4 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  23. arXiv:2402.11725  [pdf, other

    cs.CL cs.CR cs.CY

    How Susceptible are Large Language Models to Ideological Manipulation?

    Authors: Kai Chen, Zihao He, Jun Yan, Taiwei Shi, Kristina Lerman

    Abstract: Large Language Models (LLMs) possess the potential to exert substantial influence on public perceptions and interactions with information. This raises concerns about the societal impact that could arise if the ideologies within these models can be easily manipulated. In this work, we investigate how effectively LLMs can learn and generalize ideological biases from their instruction-tuning data. Ou… ▽ More

    Submitted 18 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  24. arXiv:2401.16501  [pdf

    cs.LG cond-mat.mtrl-sci cs.AI

    AFSD-Physics: Exploring the governing equations of temperature evolution during additive friction stir deposition by a human-AI teaming approach

    Authors: Tony Shi, Mason Ma, Jiajie Wu, Chase Post, Elijah Charles, Tony Schmitz

    Abstract: This paper presents a modeling effort to explore the underlying physics of temperature evolution during additive friction stir deposition (AFSD) by a human-AI teaming approach. AFSD is an emerging solid-state additive manufacturing technology that deposits materials without melting. However, both process modeling and modeling of the AFSD tool are at an early stage. In this paper, a human-AI teamin… ▽ More

    Submitted 29 January, 2024; originally announced January 2024.

  25. arXiv:2401.09432  [pdf, other

    cs.CL cs.AI cs.LG

    RoleCraft-GLM: Advancing Personalized Role-Playing in Large Language Models

    Authors: Meiling Tao, Xuechen Liang, Tianyu Shi, Lei Yu, Yiting Xie

    Abstract: This study presents RoleCraft-GLM, an innovative framework aimed at enhancing personalized role-playing with Large Language Models (LLMs). RoleCraft-GLM addresses the key issue of lacking personalized interactions in conversational AI, and offers a solution with detailed and emotionally nuanced character portrayals. We contribute a unique conversational dataset that shifts from conventional celebr… ▽ More

    Submitted 4 April, 2024; v1 submitted 17 December, 2023; originally announced January 2024.

  26. arXiv:2401.00625  [pdf, ps, other

    cs.LG

    Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

    Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

    Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims t… ▽ More

    Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

  27. arXiv:2312.15599  [pdf, other

    cs.IR

    Preliminary Study on Incremental Learning for Large Language Model-based Recommender Systems

    Authors: Tianhao Shi, Yang Zhang, Zhijian Xu, Chong Chen, Fuli Feng, Xiangnan He, Qi Tian

    Abstract: Adapting Large Language Models for recommendation (LLM4Rec)has garnered substantial attention and demonstrated promising results. However, the challenges of practically deploying LLM4Rec are largely unexplored, with the need for incremental adaptation to evolving user preferences being a critical concern. Nevertheless, the suitability of traditional incremental learning within LLM4Rec remains ambi… ▽ More

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: 8 pages, 8 figures

  28. arXiv:2311.15920  [pdf, other

    cs.AI

    A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning

    Authors: Jianxiong Li, Shichao Lin, Tianyu Shi, Chujie Tian, Yu Mei, Jian Song, Xianyuan Zhan, Ruimin Li

    Abstract: The optimization of traffic signal control (TSC) is critical for an efficient transportation system. In recent years, reinforcement learning (RL) techniques have emerged as a popular approach for TSC and show promising results for highly adaptive control. However, existing RL-based methods suffer from notably poor real-world applicability and hardly have any successful deployments. The reasons for… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 15 pages, 6 figures

  29. arXiv:2311.10781  [pdf, other

    cs.CL cs.AI

    Can Language Model Moderators Improve the Health of Online Discourse?

    Authors: Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

    Abstract: Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establis… ▽ More

    Submitted 6 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, NAACL 2024 Main

  30. arXiv:2311.09632  [pdf, other

    cs.CL cs.AI

    Online Continual Knowledge Learning for Language Models

    Authors: Yuhao Wu, Tongjun Shi, Karthick Sharma, Chun Wei Seah, Shuhao Zhang

    Abstract: Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking. However, this knowledge can become obsolete as global contexts change. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dyn… ▽ More

    Submitted 16 November, 2023; originally announced November 2023.

  31. arXiv:2311.08685  [pdf, other

    cs.CL cs.AI

    Safer-Instruct: Aligning Language Models with Automated Preference Data

    Authors: Taiwei Shi, Kai Chen, Jieyu Zhao

    Abstract: Reinforcement learning from human feedback (RLHF) is a vital strategy for enhancing model capability in language models. However, annotating preference data for RLHF is a resource-intensive and creativity-demanding process, while existing automatic generation methods face limitations in data diversity and quality. In response, we present Safer-Instruct, a novel pipeline for automatically construct… ▽ More

    Submitted 31 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 16 pages. NAACL 2024 Camera-ready

  32. arXiv:2311.07613  [pdf

    eess.SY cs.LG math.DS

    A Physics-informed Machine Learning-based Control Method for Nonlinear Dynamic Systems with Highly Noisy Measurements

    Authors: Mason Ma, Jiajie Wu, Chase Post, Tony Shi, Jingang Yi, Tony Schmitz, Hong Wang

    Abstract: This study presents a physics-informed machine learning-based control method for nonlinear dynamic systems with highly noisy measurements. Existing data-driven control methods that use machine learning for system identification cannot effectively cope with highly noisy measurements, resulting in unstable control performance. To address this challenge, the present study extends current physics-info… ▽ More

    Submitted 11 November, 2023; originally announced November 2023.

  33. arXiv:2311.00706  [pdf, other

    cs.HC cs.AI

    Can AI Mitigate Human Perceptual Biases? A Pilot Study

    Authors: Ross Geuy, Nate Rising, Tiancheng Shi, Meng Ling, Jian Chen

    Abstract: We present results from a pilot experiment to measure if machine recommendations can debias human perceptual biases in visualization tasks. We specifically studied the ``pull-down'' effect, i.e., people underestimate the average position of lines, for the task of estimating the ensemble average of data points in line charts. These line charts can show for example temperature or precipitation in 12… ▽ More

    Submitted 10 October, 2023; originally announced November 2023.

    Comments: This paper was accepted IEEE VIS 2023 VISxVISION Workshop

  34. arXiv:2310.20256  [pdf, other

    cs.CL

    PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection

    Authors: Tao Yang, Tianyuan Shi, Fanqi Wan, Xiaojun Quan, Qifan Wang, Bingzhe Wu, Jiaxiang Wu

    Abstract: Recent advances in large language models (LLMs), such as ChatGPT, have showcased remarkable zero-shot performance across various NLP tasks. However, the potential of LLMs in personality detection, which involves identifying an individual's personality from their written texts, remains largely unexplored. Drawing inspiration from Psychological Questionnaires, which are carefully designed by psychol… ▽ More

    Submitted 4 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  35. arXiv:2310.17189  [pdf, other

    cs.CV

    Exploring Iterative Refinement with Diffusion Models for Video Grounding

    Authors: Xiao Liang, Tao Shi, Yaoyuan Liang, Te Tao, Shao-Lun Huang

    Abstract: Video grounding aims to localize the target moment in an untrimmed video corresponding to a given sentence query. Existing methods typically select the best prediction from a set of predefined proposals or directly regress the target span in a single-shot manner, resulting in the absence of a systematical prediction refinement process. In this paper, we propose DiffusionVG, a novel framework with… ▽ More

    Submitted 29 December, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

  36. arXiv:2310.16676  [pdf, other

    cs.CL

    SSLCL: An Efficient Model-Agnostic Supervised Contrastive Learning Framework for Emotion Recognition in Conversations

    Authors: Tao Shi, Xiao Liang, Yaoyuan Liang, Xinyi Tong, Shao-Lun Huang

    Abstract: Emotion recognition in conversations (ERC) is a rapidly evolving task within the natural language processing community, which aims to detect the emotions expressed by speakers during a conversation. Recently, a growing number of ERC methods have focused on leveraging supervised contrastive learning (SCL) to enhance the robustness and generalizability of learned features. However, current SCL-based… ▽ More

    Submitted 10 December, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  37. arXiv:2310.16277  [pdf, other

    cs.LG cs.AI

    Bayesian Domain Invariant Learning via Posterior Generalization of Parameter Distributions

    Authors: Shiyu Shen, Bin Pan, Tianyang Shi, Tao Li, Zhenwei Shi

    Abstract: Domain invariant learning aims to learn models that extract invariant features over various training domains, resulting in better generalization to unseen target domains. Recently, Bayesian Neural Networks have achieved promising results in domain invariant learning, but most works concentrate on aligning features distributions rather than parameter distributions. Inspired by the principle of Baye… ▽ More

    Submitted 24 October, 2023; originally announced October 2023.

  38. CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation

    Authors: Minzhi Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang

    Abstract: Annotated data plays a critical role in Natural Language Processing (NLP) in training models and evaluating their performance. Given recent developments in Large Language Models (LLMs), models such as ChatGPT demonstrate zero-shot capability on many text-annotation tasks, comparable with or even exceeding human annotators. Such LLMs can serve as alternatives for manual annotation, due to lower cos… ▽ More

    Submitted 24 October, 2023; originally announced October 2023.

  39. arXiv:2310.14528  [pdf, other

    cs.CL

    Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems

    Authors: Tianyuan Shi, Liangzhi Li, Zijian Lin, Tao Yang, Xiaojun Quan, Qifan Wang

    Abstract: Efficient knowledge retrieval plays a pivotal role in ensuring the success of end-to-end task-oriented dialogue systems by facilitating the selection of relevant information necessary to fulfill user requests. However, current approaches generally integrate knowledge retrieval and response generation, which poses scalability challenges when dealing with extensive knowledge bases. Taking inspiratio… ▽ More

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Main Conference)

  40. arXiv:2310.13027  [pdf, other

    cs.LG cs.AI

    Be Bayesian by Attachments to Catch More Uncertainty

    Authors: Shiyu Shen, Bin Pan, Tianyang Shi, Tao Li, Zhenwei Shi

    Abstract: Bayesian Neural Networks (BNNs) have become one of the promising approaches for uncertainty estimation due to the solid theorical foundations. However, the performance of BNNs is affected by the ability of catching uncertainty. Instead of only seeking the distribution of neural network weights by in-distribution (ID) data, in this paper, we propose a new Bayesian Neural Network with an Attached st… ▽ More

    Submitted 12 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  41. arXiv:2308.12069  [pdf, ps, other

    cs.RO cs.AI cs.LG

    Identifying Reaction-Aware Driving Styles of Stochastic Model Predictive Controlled Vehicles by Inverse Reinforcement Learning

    Authors: Ni Dang, Tao Shi, Zengjie Zhang, Wanxin Jin, Marion Leibold, Martin Buss

    Abstract: The driving style of an Autonomous Vehicle (AV) refers to how it behaves and interacts with other AVs. In a multi-vehicle autonomous driving system, an AV capable of identifying the driving styles of its nearby AVs can reliably evaluate the risk of collisions and make more reasonable driving decisions. However, there has not been a consistent definition of driving styles for an AV in the literatur… ▽ More

    Submitted 23 August, 2023; originally announced August 2023.

  42. arXiv:2307.09729  [pdf, other

    cs.CV cs.MM eess.IV

    NTIRE 2023 Quality Assessment of Video Enhancement Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu , et al. (47 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual… ▽ More

    Submitted 18 July, 2023; originally announced July 2023.

  43. Alioth: A Machine Learning Based Interference-Aware Performance Monitor for Multi-Tenancy Applications in Public Cloud

    Authors: Tianyao Shi, Yingxuan Yang, Yunlong Cheng, Xiaofeng Gao, Zhen Fang, Yongqiang Yang

    Abstract: Multi-tenancy in public clouds may lead to co-location interference on shared resources, which possibly results in performance degradation of cloud applications. Cloud providers want to know when such events happen and how serious the degradation is, to perform interference-aware migrations and alleviate the problem. However, virtual machines (VM) in Infrastructure-as-a-Service public clouds are b… ▽ More

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted by 2023 IEEE International Parallel & Distributed Processing Symposium (IPDPS)

    Journal ref: 2023 IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 908-917

  44. arXiv:2307.07245  [pdf, other

    cs.CV

    FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation

    Authors: Tianyi Shi, Xiaohuan Ding, Liang Zhang, Xin Yang

    Abstract: Curvilinear object segmentation is critical for many applications. However, manually annotating curvilinear objects is very time-consuming and error-prone, yielding insufficiently available annotated datasets for existing supervised methods and domain adaptation methods. This paper proposes a self-supervised curvilinear object segmentation method that learns robust and distinctive features from fr… ▽ More

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  45. arXiv:2306.02121  [pdf, other

    cs.LG cs.AI cs.CY math.OC

    Identifying Subgroups of ICU Patients Using End-to-End Multivariate Time-Series Clustering Algorithm Based on Real-World Vital Signs Data

    Authors: Tongyue Shi, Zhilong Zhang, Wentie Liu, Junhua Fang, Jianguo Hao, Shuai Jin, Huiying Zhao, Guilan Kong

    Abstract: This study employed the MIMIC-IV database as data source to investigate the use of dynamic, high-frequency, multivariate time-series vital signs data, including temperature, heart rate, mean blood pressure, respiratory rate, and SpO2, monitored first 8 hours data in the ICU stay. Various clustering algorithms were compared, and an end-to-end multivariate time series clustering system called Time2F… ▽ More

    Submitted 11 July, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  46. arXiv:2306.01925  [pdf, other

    cs.LG

    Improving the generalizability and robustness of large-scale traffic signal control

    Authors: Tianyu Shi, Francois-Xavier Devailly, Denis Larocque, Laurent Charlin

    Abstract: A number of deep reinforcement-learning (RL) approaches propose to control traffic signals. In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL meth… ▽ More

    Submitted 7 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  47. Reformulating CTR Prediction: Learning Invariant Feature Interactions for Recommendation

    Authors: Yang Zhang, Tianhao Shi, Fuli Feng, Wenjie Wang, Dingxian Wang, Xiangnan He, Yongdong Zhang

    Abstract: Click-Through Rate (CTR) prediction plays a core role in recommender systems, serving as the final-stage filter to rank items for a user. The key to addressing the CTR task is learning feature interactions that are useful for prediction, which is typically achieved by fitting historical click data with the Empirical Risk Minimization (ERM) paradigm. Representative methods include Factorization Mac… ▽ More

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: 11 pages, 6 Postscript figures, to be published in SIGIR2023

  48. arXiv:2304.09423  [pdf, other

    cs.CV

    ASM: Adaptive Skinning Model for High-Quality 3D Face Modeling

    Authors: Kai Yang, Hong Shang, Tianyang Shi, Xinghan Chen, Jingkai Zhou, Zhongqian Sun, Wei Yang

    Abstract: The research fields of parametric face model and 3D face reconstruction have been extensively studied. However, a critical question remains unanswered: how to tailor the face model for specific reconstruction settings. We argue that reconstruction with multi-view uncalibrated images demands a new model with stronger capacity. Our study shifts attention from data-dependent 3D Morphable Models (3DMM… ▽ More

    Submitted 8 October, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 18 pages

  49. arXiv:2304.04926  [pdf, other

    cs.CV

    Life Regression based Patch Slimming for Vision Transformers

    Authors: Jiawei Chen, Lin Chen, Jiang Yang, Tianqi Shi, Lechao Cheng, Zunlei Feng, Mingli Song

    Abstract: Vision transformers have achieved remarkable success in computer vision tasks by using multi-head self-attention modules to capture long-range dependencies within images. However, the high inference computation cost poses a new challenge. Several methods have been proposed to address this problem, mainly by slimming patches. In the inference stage, these methods classify patches into two classes,… ▽ More

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: 8 pages,4 figures

  50. arXiv:2301.10371  [pdf, other

    cs.CL cs.AI cs.LG

    Weakly Supervised Headline Dependency Parsing

    Authors: Adrian Benton, Tianze Shi, Ozan İrsoy, Igor Malioutov

    Abstract: English news headlines form a register with unique syntactic properties that have been documented in linguistics literature since the 1930s. However, headlines have received surprisingly little attention from the NLP syntactic parsing community. We aim to bridge this gap by providing the first news headline corpus of Universal Dependencies annotated syntactic dependency trees, which enables us to… ▽ More

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: Findings of EMNLP 2022

    ACM Class: I.2.7

    Journal ref: In Proceedings of Findings of EMNLP 2022