
Showing 51–100 of 508 results for author: Wen, Z

  1. arXiv:2404.02589  [pdf, other]

    cs.CL cs.AI

    Affective-NLI: Towards Accurate and Interpretable Personality Recognition in Conversation

    Authors: Zhiyuan Wen, Jiannong Cao, Yu Yang, Ruosong Yang, Shuaiqi Liu

    Abstract: Personality Recognition in Conversation (PRC) aims to identify the personality traits of speakers through textual dialogue content. It is essential for providing personalized services in various applications of Human-Computer Interaction (HCI), such as AI-based mental therapy and companion robots for the elderly. Most recent studies analyze the dialog content for personality classification yet ove…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE PerCom 2024

  2. arXiv:2404.02002  [pdf, ps, other]

    astro-ph.CO astro-ph.GA

    A catalog of 1.58 million clusters of galaxies identified from the DESI Legacy Imaging Surveys

    Authors: Z. L. Wen, J. L. Han

    Abstract: Based on the DESI Legacy Imaging Surveys released data and available spectroscopic redshifts, we identify 1.58 million clusters of galaxies by searching for the overdensity of stellar mass distribution of galaxies within redshift slices around pre-selected massive galaxies, among which 877,806 clusters are found for the first time. The identified clusters have an equivalent mass of M_{500}> 0.47*1…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 17 pages, 14 figures, 2 tables, re-submitted to ApJS after referee's comments being incorporated

  3. arXiv:2404.00321  [pdf, other]

    astro-ph.CO astro-ph.GA

    Intrinsic mass-richness relation of clusters from THE THREE HUNDRED hydrodynamic simulations

    Authors: Mingjing Chen, Weiguang Cui, Wenjuan Fang, Zhonglue Wen

    Abstract: The main systematics in cluster cosmology is the uncertainty in the mass-observable relation. In this paper, we focus on the most direct cluster observable in optical surveys, i.e. richness, and constrain the intrinsic mass-richness (MR) relation of clusters in THE THREE HUNDRED hydrodynamic simulations with two runs: GIZMO-SIMBA and GADGET-X. We find that modeling the richness at fixed halo mass…

    Submitted 2 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: accepted to ApJ

    Journal ref: The Astrophysical Journal (2024)

  4. ChatGPT v.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models

    Authors: Zehao Wen, Rabih Younes

    Abstract: In our rapidly evolving digital sphere, the ability to discern media bias becomes crucial as it can shape public sentiment and influence pivotal decisions. The advent of large language models (LLMs), such as ChatGPT, noted for their broad utility in various natural language processing (NLP) tasks, invites exploration of their efficacy in media bias detection. Can ChatGPT detect media bias? This st…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 9 pages, 1 figure, published on Applied and Computational Engineering

    Journal ref: ACE (2023) Vol. 21: 249-257.

  5. arXiv:2403.16557  [pdf, ps, other]

    cs.LG cs.DC

    Accelerating Federated Learning by Selecting Beneficial Herd of Local Gradients

    Authors: Ping Luo, Xiaoge Deng, Ziqing Wen, Tao Sun, Dongsheng Li

    Abstract: Federated Learning (FL) is a distributed machine learning framework in communication network systems. However, the systems' Non-Independent and Identically Distributed (Non-IID) data negatively affect the convergence efficiency of the global model, since only a subset of these data samples are beneficial for model convergence. In pursuit of this subset, a reliable approach involves determining a m…

    Submitted 25 March, 2024; originally announced March 2024.

  6. arXiv:2403.15044  [pdf, other]

    cs.CV cs.AI

    Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild

    Authors: Zhuofan Wen, Fengyu Zhang, Siyuan Zhang, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao

    Abstract: Multimodal fusion is a significant method for most multimodal tasks. With the recent surge in the number of large pre-trained models, combining both multimodal fusion methods and pre-trained model features can achieve outstanding performance in many multimodal tasks. In this paper, we present our approach, which leverages both advantages for addressing the task of Expression (Expr) Recognition and…

    Submitted 22 March, 2024; originally announced March 2024.

  7. arXiv:2403.11437  [pdf, other]

    math.OC math.NA

    Formalization of Complexity Analysis of the First-order Algorithms for Convex Optimization

    Authors: Chenyi Li, Ziyu Wang, Wanyi He, Yuxuan Wu, Shengyang Xu, Zaiwen Wen

    Abstract: The convergence rate of various first-order optimization algorithms is a pivotal concern within the numerical optimization community, as it directly reflects the efficiency of these algorithms across different optimization problems. Our goal is to make a significant step forward in the formal mathematical representation of optimization techniques using the Lean4 theorem prover. We first formalize t…

    Submitted 21 July, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    ACM Class: G.1.6

  8. arXiv:2403.06769  [pdf, other]

    cs.CL

    Strength Lies in Differences! Towards Effective Non-collaborative Dialogues via Tailored Strategy Planning

    Authors: Tong Zhang, Chen Huang, Yang Deng, Hongru Liang, Jia Liu, Zujie Wen, Wenqiang Lei, Tat-Seng Chua

    Abstract: We investigate non-collaborative dialogue agents, which are expected to engage in strategic conversations with diverse users, for securing a mutual agreement that leans favorably towards the system's objectives. This poses two main challenges for existing dialogue agents: 1) The inability to integrate user-specific characteristics into the strategic planning, and 2) The difficulty of training stra…

    Submitted 6 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: V2: 20 pages, 8 figures, and 20 tables

  9. arXiv:2403.04454  [pdf, other]

    cs.CL cs.AI

    Low-Resource Court Judgment Summarization for Common Law Systems

    Authors: Shuaiqi Liu, Jiannong Cao, Yicong Li, Ruosong Yang, Zhiyuan Wen

    Abstract: Common law courts need to refer to similar precedents' judgments to inform their current decisions. Generating high-quality summaries of court judgment documents can facilitate legal practitioners to efficiently review previous cases and assist the general public in accessing how the courts operate and how the law is applied. Previous court judgment summarization research focuses on civil law or a…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: First submitted to Information Processing and Management on Oct. 29, 2023. Major Revision submitted on Mar.6, 2024

    ACM Class: I.2.7; I.7

  10. arXiv:2403.02233  [pdf, other]

    cs.LG math.OC stat.ML

    How Transformers Learn Diverse Attention Correlations in Masked Vision Pretraining

    Authors: Yu Huang, Zixin Wen, Yuejie Chi, Yingbin Liang

    Abstract: Masked reconstruction, which predicts randomly masked patches from unmasked ones, has emerged as an important approach in self-supervised pretraining. However, the theoretical understanding of masked pretraining is rather limited, especially for the foundational architecture of transformers. In this paper, to the best of our knowledge, we provide the first end-to-end theoretical guarantee of learn…

    Submitted 4 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: v2 polishes writing

  11. arXiv:2403.01450  [pdf, other]

    cs.RO

    Collision-Free Robot Navigation in Crowded Environments using Learning based Convex Model Predictive Control

    Authors: Zhuanglei Wen, Mingze Dong, Xiai Chen

    Abstract: Navigating robots safely and efficiently in crowded and complex environments remains a significant challenge. However, due to the dynamic and intricate nature of these settings, planning efficient and collision-free paths for robots to track is particularly difficult. In this paper, we uniquely bridge the robot's perception, decision-making and control processes by utilizing the convex obstacle-fr…

    Submitted 14 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  12. arXiv:2403.01101  [pdf, other]

    cs.LG cs.AI

    Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models

    Authors: Ziting Wen, Oscar Pizarro, Stefan Williams

    Abstract: Fine-tuning the pre-trained model with active learning holds promise for reducing annotation costs. However, this combination introduces significant computational costs, particularly with the growing scale of pre-trained models. Recent research has proposed proxy-based active learning, which pre-computes features to reduce computational costs. Yet, this approach often incurs a significant loss in…

    Submitted 2 March, 2024; originally announced March 2024.

  13. arXiv:2403.00352  [pdf, other]

    cs.CV cs.LG

    Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning

    Authors: Ruiqian Nai, Zixin Wen, Ji Li, Yuanzhi Li, Yang Gao

    Abstract: In representation learning, a disentangled representation is highly desirable as it encodes generative factors of data in a separable and compact pattern. Researchers have advocated leveraging disentangled representations to complete downstream tasks with encouraging empirical evidence. This paper further investigates the necessity of disentangled representation in downstream applications. Specifi…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted to AAAI-2024

  14. A Search for Radio Pulsars in Supernova Remnants Using FAST with One Pulsar Discovered

    Authors: Zhen Zhang, Wen-Ming Yan, Jian-Ping Yuan, Na Wang, Jun-Tao Bai, Zhi-Gang Wen, Bao-Da Li, Jin-Tao Xie, De Zhao, Yu-Bin Wang, Nan-Nan Zhai

    Abstract: We report on the results of a search for radio pulsars in five supernova remnants (SNRs) with FAST. The observations were made using the 19-beam receiver in the Snapshot mode. The integration time for each pointing is 10 min. We discovered a new pulsar PSR J1845$-$0306 which has a spin period of 983.6 ms and a dispersion measure of 444.6$\pm$2.0 cm$^{-3}$ pc in observations of SNR G29.6+0.1. To ju…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures, 2 tables published in CPL

    Journal ref: Chin. Phys. Lett. 2024, 41 (2): 029701 February 2024

  15. arXiv:2402.16568  [pdf, other]

    cs.CL

    Two-stage Generative Question Answering on Temporal Knowledge Graph Using Large Language Models

    Authors: Yifu Gao, Linbo Qiao, Zhigang Kan, Zhihua Wen, Yongquan He, Dongsheng Li

    Abstract: Temporal knowledge graph question answering (TKGQA) is a significantly challenging task, due to the temporal constraints hidden in questions and the answers sought from dynamic structured knowledge. Although large language models (LLMs) have made considerable progress in their reasoning ability over structured data, their application to the TKGQA task is a relatively unexplored area. This paper fir…

    Submitted 23 July, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL(Findings) 2024

  16. arXiv:2402.15069  [pdf, other]

    astro-ph.HE

    Investigation of profile shifting and subpulse movement in PSR J0344-0901 with FAST

    Authors: H. M. Tedila, R. Yuen, N. Wang, D. Li, Z. G. Wen, W. M. Yan, J. P. Yuan, X. H. Han, P. Wang, W. W. Zhu, S. J. Dang, S. Q. Wang, J. T. Xie, Q. D. Wu, Sh. Khasanov, FAST Collaboration

    Abstract: We report two phenomena detected in PSR J0344$-$0901 from two observations conducted at a frequency centered at 1.25 GHz using the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The first phenomenon manifests as a shift of the pulse emission to later longitudinal phases, followed by a gradual return to its original location. The event lasts for about 216 pulse periods, with an average s…

    Submitted 22 February, 2024; originally announced February 2024.

  17. arXiv:2402.11896  [pdf, other]

    cs.CL

    SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning

    Authors: Zhihao Wen, Jie Zhang, Yuan Fang

    Abstract: Fine-tuning all parameters of large language models (LLMs) necessitates substantial computational power and extended time. Latest advancements in parameter-efficient fine-tuning (PEFT) techniques, such as Adapter tuning and LoRA, allow for adjustments to only a minor fraction of the parameters of these LLMs. Concurrently, it has been noted that the issue of over-smoothing diminishes the effectiven…

    Submitted 1 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024, 17 pages

  18. arXiv:2402.09543  [pdf, other]

    cs.IR

    Rethinking Large Language Model Architectures for Sequential Recommendations

    Authors: Hanbing Wang, Xiaorui Liu, Wenqi Fan, Xiangyu Zhao, Venkataramana Kini, Devendra Yadav, Fei Wang, Zhen Wen, Jiliang Tang, Hui Liu

    Abstract: Recently, sequential recommendation has been adapted to the LLM paradigm to enjoy the power of LLMs. LLM-based methods usually formulate recommendation information into natural language and the model is trained to predict the next item in an auto-regressive manner. Despite their notable success, the substantial computational overhead of inference poses a significant obstacle to their real-world ap…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 8 pages, 5 figures, conference

  19. arXiv:2402.06880  [pdf, other]

    q-bio.QM

    A single-snapshot inverse solver for two-species graph model of tau pathology spreading in human Alzheimer disease

    Authors: Zheyu Wen, Ali Ghafouri, George Biros

    Abstract: We propose a method that uses a two-species ordinary differential equation (ODE) biophysical model to characterize misfolded tau (or simply tau) protein spreading in Alzheimer disease (AD) and calibrates it from clinical data. The unknown model parameters are the initial condition (IC) for tau and three scalar parameters representing the migration, proliferation, and clearance of tau proteins. Dri…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 10 pages, 6 figures

  20. arXiv:2402.01469  [pdf, other]

    cs.CL

    AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

    Authors: Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang

    Abstract: The notable success of large language models (LLMs) has sparked an upsurge in building language agents to complete various complex tasks. We present AMOR, an agent framework based on open-source LLMs, which reasons with external knowledge bases and adapts to specific domains through human supervision to the reasoning process. AMOR builds reasoning logic over a finite state machine (FSM) that solve…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Work in progress

  21. arXiv:2402.01440  [pdf, other]

    cs.LG cs.AI cs.SI

    Few-Shot Learning on Graphs: from Meta-learning to Pre-training and Prompting

    Authors: Xingtong Yu, Yuan Fang, Zemin Liu, Yuxia Wu, Zhihao Wen, Jianyuan Bo, Xinming Zhang, Steven C. H. Hoi

    Abstract: Graph representation learning, a critical step in graph-centric tasks, has seen significant advancements. Earlier techniques often operate in an end-to-end setting, where performance heavily relies on the availability of ample labeled data. This constraint has spurred the emergence of few-shot learning on graphs, where only a few task-specific labels are available for each task. Given the extensiv…

    Submitted 2 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  22. arXiv:2401.16702  [pdf, other]

    cs.CV

    Multi-granularity Correspondence Learning from Long-term Noisy Videos

    Authors: Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng

    Abstract: Existing video-language studies mainly focus on learning short video clips, leaving long-term temporal dependencies rarely explored due to over-high computational cost of modeling long videos. To address this issue, one feasible solution is learning the correspondence between video clips and captions, which however inevitably encounters the multi-granularity noisy correspondence (MNC) problem. To…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR 2024 (oral)

  23. arXiv:2401.13503  [pdf, other]

    cs.CV

    Learning Representations for Clustering via Partial Information Discrimination and Cross-Level Interaction

    Authors: Hai-Xin Zhang, Dong Huang, Hua-Bao Ling, Guang-Yu Zhang, Wei-jun Sun, Zi-hao Wen

    Abstract: In this paper, we present a novel deep image clustering approach termed PICI, which enforces the partial information discrimination and the cross-level interaction in a joint learning framework. In particular, we leverage a Transformer encoder as the backbone, through which the masked image modeling with two paralleled augmented views is formulated. After deriving the class tokens from the masked…

    Submitted 24 January, 2024; originally announced January 2024.

  24. arXiv:2401.10296  [pdf, other]

    astro-ph.HE astro-ph.SR

    The Study of Mode Switching behavior of PSR J0614+2229 Using the Parkes Ultra-wideband Receiver Observations

    Authors: Yanqing Cai, Shijun Dang, Rai Yuen, Lunhua Shang, Feifei Kou, Jianping Yuan, Lei Zhang, Zurong Zhou, Na Wang, Qingying Li, Zhigang Wen, Wenming Yan, Shuangqiang Wang, Shengnan Sun, Habtamu Menberu Tedila, Shuo Xiao, Xin Xu, Rushuang Zhao, Qijun Zhi, Aijun Dong, Bing Zhang, Wei Li, Yingying Ren, Yujia Liu

    Abstract: In this paper, we present a detailed single pulse and polarization study of PSR J0614+2229 based on the archived data observed on 2019 August 15 (MJD 58710) and September 12 (MJD 58738) using the Ultra-wideband Low-frequency Receiver on the Parkes radio telescope. The single-pulse sequences show that this pulsar switches between two emission states, in which the emission of state A occurs earlie…

    Submitted 17 January, 2024; originally announced January 2024.

  25. arXiv:2401.09085  [pdf]

    physics.optics

    3D orientation super-resolution spatial-frequency-shift microscopy

    Authors: Xiaowei Liu, Mingwei Tang, Ning Zhou, Chenlei Pang, Zhong Wen, Xu Liu, Qing Yang

    Abstract: Super-resolution mapping of the 3D orientation of fluorophores reveals the alignment of biological structures where the fluorophores are tightly attached, and thus plays a vital role in studying the organization and dynamics of bio-complexes. However, current super-resolution imaging techniques are either limited to 2D orientation mapping or suffer from slow speed and the requirement of special la…

    Submitted 22 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: 22 pages, 5 figures

  26. arXiv:2401.05778  [pdf, other]

    cs.CL cs.AI

    Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems

    Authors: Tianyu Cui, Yanling Wang, Chuanpu Fu, Yong Xiao, Sijia Li, Xinhao Deng, Yunpeng Liu, Qinglin Zhang, Ziyi Qiu, Peiyang Li, Zhixing Tan, Junwu Xiong, Xinyu Kong, Zujie Wen, Ke Xu, Qi Li

    Abstract: Large language models (LLMs) have strong capabilities in solving diverse natural language processing tasks. However, the safety and security issues of LLM systems have become the major obstacle to their widespread application. Many studies have extensively investigated risks in LLM systems and developed the corresponding mitigation strategies. Leading-edge enterprises such as OpenAI, Google, Meta,…

    Submitted 11 January, 2024; originally announced January 2024.

  27. arXiv:2401.05596  [pdf]

    cs.CL cs.AI

    POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation

    Authors: Shilong Pan, Zhiliang Tian, Liang Ding, Zhen Huang, Zhihua Wen, Dongsheng Li

    Abstract: Low-resource languages (LRLs) face challenges in supervised neural machine translation due to limited parallel data, prompting research into unsupervised methods. Unsupervised neural machine translation (UNMT) methods, including back-translation, transfer learning, and pivot-based translation, offer practical solutions for LRL translation, but they are hindered by issues like synthetic data noise,…

    Submitted 16 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

  28. arXiv:2401.02682  [pdf, other]

    cs.LG cs.SI

    Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering

    Authors: Zichen Wen, Yawen Ling, Yazhou Ren, Tianyi Wu, Jianpeng Chen, Xiaorong Pu, Zhifeng Hao, Lifang He

    Abstract: Recently, there has been a growing focus on graph data, and multi-view graph clustering has become a popular area of research interest. Most of the existing methods are only applicable to homophilous graphs, yet the extensive real-world graph data can hardly fulfill the homophily assumption, where the connected nodes tend to belong to the same class. Several studies have pointed out that the poor perform…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI2024

  29. arXiv:2312.16998  [pdf, other]

    eess.IV cs.CV

    Deep Unfolding Network with Spatial Alignment for multi-modal MRI reconstruction

    Authors: Hao Zhang, Qi Wang, Jun Shi, Shihui Ying, Zhijie Wen

    Abstract: Multi-modal Magnetic Resonance Imaging (MRI) offers complementary diagnostic information, but some modalities are limited by the long scanning time. To accelerate the whole acquisition process, MRI reconstruction of one modality from highly undersampled k-space data with another fully-sampled reference modality is an efficient solution. However, the misalignment between modalities, which is common…

    Submitted 28 December, 2023; originally announced December 2023.

  30. arXiv:2312.12693  [pdf, other]

    math.NA

    Anderson Accelerated Gauss-Newton-guided deep learning for nonlinear inverse problems with Application to Electrical Impedance Tomography

    Authors: Qingping Zhou, Guixian Xu, Zhexin Wen, Hongqiao Wang

    Abstract: Physics-guided deep learning is an important prevalent research topic in scientific machine learning, which has tremendous potential in various complex applications including science and engineering. In these applications, data is expensive to acquire and high accuracy is required for making decisions. In this work, we introduce an efficient physics-guided deep learning framework for the variation…

    Submitted 19 December, 2023; originally announced December 2023.

    MSC Class: 78A46; 68U10; 68T07

  31. arXiv:2312.07889  [pdf, other]

    math.OC cs.CE cs.CG cs.GR

    Adaptive Isogeometric Topology Optimization of Shell Structures based on PHT-splines

    Authors: Zepeng Wen, Qiong Pan, Xiaoya Zhai, Hongmei Kang, Falai Chen

    Abstract: This paper proposes an Adaptive Isogeometric Topology Optimization framework for shell structures based on PHT-splines (PHT-AITO). In this framework, the design domain, displacement, and density are represented by PHT-splines. Leveraging the local refinement capability of PHT-splines, mesh elements defining the density function are adaptively refined to achieve a suitable resolution at the interfa…

    Submitted 12 December, 2023; originally announced December 2023.

  32. arXiv:2312.06993  [pdf]

    cs.LG

    Dynamically configured physics-informed neural network in topology optimization applications

    Authors: Jichao Yin, Ziming Wen, Shuhao Li, Yaya Zhanga, Hu Wang

    Abstract: Integration of machine learning (ML) into the topology optimization (TO) framework is attracting increasing attention, but data acquisition in data-driven models is prohibitive. Compared with popular ML methods, the physics-informed neural network (PINN) can avoid generating enormous amounts of data when solving forward problems and additionally provide better inference. To this end, a dynamically…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 31 pages, 22 figures

  33. arXiv:2312.06644  [pdf, other]

    cs.CV cs.AI cs.GR

    AnyHome: Open-Vocabulary Generation of Structured and Textured 3D Homes

    Authors: Rao Fu, Zehao Wen, Zichen Liu, Srinath Sridhar

    Abstract: Inspired by cognitive theories, we introduce AnyHome, a framework that translates any text into well-structured and textured indoor scenes at a house-scale. By prompting Large Language Models (LLMs) with designed templates, our approach converts provided textual narratives into amodal structured representations. These representations guarantee consistent and realistic spatial layouts by directing…

    Submitted 28 July, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: accepted by ECCV 2024

  34. arXiv:2312.04293  [pdf, other]

    cs.CV cs.MM

    GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition

    Authors: Zheng Lian, Licai Sun, Haiyang Sun, Kang Chen, Zhuofan Wen, Hao Gu, Bin Liu, Jianhua Tao

    Abstract: Recently, GPT-4 with Vision (GPT-4V) has demonstrated remarkable visual capabilities across various tasks, but its performance in emotion recognition has not been fully evaluated. To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion re…

    Submitted 17 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  35. arXiv:2312.01801  [pdf, other]

    cs.HC cs.SE

    SPROUT: Authoring Programming Tutorials with Interactive Visualization of Large Language Model Generation Process

    Authors: Yihan Liu, Zhen Wen, Luoxuan Weng, Ollie Woodman, Yi Yang, Wei Chen

    Abstract: The rapid development of large language models (LLMs), such as ChatGPT, has revolutionized the efficiency of creating programming tutorials. LLMs can be instructed with text prompts to generate comprehensive text descriptions of code snippets. However, the lack of transparency in the end-to-end generation process has hindered the understanding of model behavior and limited user control over the ge…

    Submitted 4 December, 2023; originally announced December 2023.

  36. arXiv:2312.01273  [pdf, other]

    math.OC

    An Augmented Lagrangian Primal-Dual Semismooth Newton Method for Multi-Block Composite Optimization

    Authors: Zhanwang Deng, Kangkang Deng, Jiang Hu, Zaiwen Wen

    Abstract: In this paper, we develop a novel primal-dual semismooth Newton method for solving linearly constrained multi-block convex composite optimization problems. First, a differentiable augmented Lagrangian (AL) function is constructed by utilizing the Moreau envelopes of the nonsmooth functions. It enables us to derive an equivalent saddle point problem and establish the strong AL duality under the Sla…

    Submitted 15 May, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

    Comments: 27 pages

  37. arXiv:2312.01057  [pdf, other]

    cs.LG cs.AI cs.CL

    RLHF and IIA: Perverse Incentives

    Authors: Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy

    Abstract: Existing algorithms for reinforcement learning from human feedback (RLHF) can incentivize responses at odds with preferences because they are based on models that assume independence of irrelevant alternatives (IIA). The perverse incentives induced by IIA hinder innovations on query formats and learning algorithms.

    Submitted 1 February, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

  38. arXiv:2310.18894  [pdf, other]

    cs.CV

    Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

    Authors: Tianqin Li, Ziqi Wen, Yangfan Li, Tai Sing Lee

    Abstract: Current deep-learning models for object recognition are known to be heavily biased toward texture. In contrast, human visual systems are known to be biased toward shape and structure. What could be the design principles in human visual systems that led to this difference? How could we introduce more shape bias into the deep learning models? In this paper, we report that sparse coding, a ubiquitous…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Published as NeurIPS 2023 (Oral)

  39. arXiv:2310.11531  [pdf, ps, other]

    cs.LG cs.AI eess.SY stat.ML

    Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

    Authors: Dengwang Tang, Rahul Jain, Botao Hao, Zheng Wen

    Abstract: In this paper, we study the problem of efficient online reinforcement learning in the infinite horizon setting when there is an offline dataset to start with. We assume that the offline dataset is generated by an expert but with an unknown level of competence, i.e., it is not perfect and not necessarily using the optimal policy. We show that if the learning agent models the behavioral policy (paramet…

    Submitted 1 February, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 22 pages

    MSC Class: 93E35

  40. Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection

    Authors: Cunhang Fan, Mingming Ding, Jianhua Tao, Ruibo Fu, Jiangyan Yi, Zhengqi Wen, Zhao Lv

    Abstract: Most research in synthetic speech detection (SSD) focuses on improving performance on standard noise-free datasets. However, in actual situations, noise interference is usually present, causing significant performance degradation in SSD systems. To improve noise robustness, this paper proposes a dual-branch knowledge distillation synthetic speech detection (DKDSSD) method. Specifically, a parallel…

    Submitted 16 April, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  41. arXiv:2310.07555  [pdf, other]

    cs.CV

    Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration

    Authors: Ziqi Wen, Tianqin Li, Zhi Jing, Tai Sing Lee

    Abstract: Deep learning models are known to exhibit a strong texture bias, while humans tend to rely heavily on global shape structure for object recognition. The current benchmark for evaluating a model's global shape bias is a set of style-transferred images with the assumption that resistance to the attack of style transfer is related to the development of global structure sensitivity in the model. In th…

    Submitted 29 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  42. arXiv:2310.06713  [pdf, other]

    cs.LG stat.AP

    Interpretable Traffic Event Analysis with Bayesian Networks

    Authors: Tong Yuan, Jian Yang, Zeyi Wen

    Abstract: Although existing machine learning-based methods for traffic accident analysis can provide good quality results to downstream tasks, they lack interpretability which is crucial for this critical problem. This paper proposes an interpretable framework based on Bayesian Networks for traffic accident prediction. To enable the ease of interpretability, we design a dataset construction pipeline to feed…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 11 pages, 7 figures

    MSC Class: 62F15; ACM Class: G.3

  43. arXiv:2310.05388  [pdf, other]

    cs.CL

    GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence

    Authors: Zhihua Wen, Zhiliang Tian, Wei Wu, Yuxin Yang, Yanqi Shi, Zhen Huang, Dongsheng Li

    Abstract: Conditional story generation is significant in human-machine interaction, particularly in producing stories with complex plots. While Large language models (LLMs) perform well on multiple NLP tasks, including story generation, it is challenging to generate stories with both complex and creative plots. Existing methods often rely on detailed prompts to guide LLMs to meet target conditions, which in…

    Submitted 23 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  44. arXiv:2310.01419  [pdf, other]

    cs.IR cs.LG

    Design Principles of Robust Multi-Armed Bandit Framework in Video Recommendations

    Authors: Belhassen Bayar, Phanideep Gampa, Ainur Yessenalina, Zhen Wen

    Abstract: Current multi-armed bandit approaches in recommender systems (RS) have focused more on devising effective exploration techniques, while not adequately addressing common exploitation challenges related to distributional changes and item cannibalization. Little work exists to guide the design of robust bandit frameworks that can address these frequent challenges in RS. In this paper, we propose a ne…

    Submitted 24 September, 2023; originally announced October 2023.

    Comments: RecSys CARS 2023 Workshop paper

  45. arXiv:2310.00212  [pdf, other]

    cs.LG cs.AI cs.CL

    Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment

    Authors: Tianhao Wu, Banghua Zhu, Ruoyu Zhang, Zhaojin Wen, Kannan Ramchandran, Jiantao Jiao

    Abstract: Large Language Models (LLMs) can acquire extensive world knowledge through pre-training on large corpora. However, due to exposure to low-quality data, LLMs may exhibit harmful behavior without aligning with human values. The dominant approach for steering LLMs towards beneficial behavior involves Reinforcement Learning with Human Feedback (RLHF), with Proximal Policy Optimization (PPO) serving as…

    Submitted 9 October, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 19 pages, 5 figures

  46. arXiv:2309.17409  [pdf, ps, other]

    math.OC

    Sharper Convergence Guarantees for Federated Learning with Partial Model Personalization

    Authors: Yiming Chen, Liyuan Cao, Kun Yuan, Zaiwen Wen

    Abstract: Partial model personalization, which encompasses both shared and personal variables in its formulation, is a critical optimization problem in federated learning. It balances individual client needs with collective knowledge utilization, and serves as a general formulation covering various key scenarios, ranging from fully shared to fully personalized federated learning. This paper introduces two e…

    Submitted 29 September, 2023; originally announced September 2023.

  47. arXiv:2309.08166  [pdf, other]

    cs.SD eess.AS

    Residual Speaker Representation for One-Shot Voice Conversion

    Authors: Le Xu, Jiangyan Yi, Tao Wang, Yong Ren, Rongxiu Zhong, Zhengqi Wen, Jianhua Tao

    Abstract: Recently, there have been significant advancements in voice conversion, resulting in high-quality performance. However, there are still two critical challenges in this field. Firstly, current voice conversion methods have limited robustness when encountering unseen speakers. Secondly, they also have limited ability to control timbre representation. To address these challenges, this paper presents…

    Submitted 11 August, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted by INTERSPEECH2024

  48. arXiv:2308.13295  [pdf]

    math.NA

    Resolution-independent generative models based on operator learning for physics-constrained Bayesian inverse problems

    Authors: Xinchao Jiang, Xin Wang, Ziming Wen, Hu Wang

    Abstract: The Bayesian inference approach is widely used to tackle inverse problems due to its versatile and natural ability to handle ill-posedness. However, it often faces challenges when dealing with situations involving continuous fields or large-resolution discrete representations (high-dimensional). Moreover, the prior distribution of unknown parameters is commonly difficult to be determined. In this…

    Submitted 25 August, 2023; originally announced August 2023.

  49. Voucher Abuse Detection with Prompt-based Fine-tuning on Graph Neural Networks

    Authors: Zhihao Wen, Yuan Fang, Yihan Liu, Yang Guo, Shuji Hao

    Abstract: Voucher abuse detection is an important anomaly detection problem in E-commerce. While many GNN-based solutions have emerged, the supervised paradigm depends on a large quantity of labeled data. A popular alternative is to adopt self-supervised pre-training using label-free data, and further fine-tune on a downstream task with limited labels. Nevertheless, the "pre-train, fine-tune" paradigm is of…

    Submitted 30 August, 2023; v1 submitted 19 August, 2023; originally announced August 2023.

    Comments: 7 pages, Accepted by CIKM23 Applied Research Track

  50. arXiv:2308.06470  [pdf, ps, other]

    math.OC

    On the Optimal Lower and Upper Complexity Bounds for a Class of Composite Optimization Problems

    Authors: Zhenyuan Zhu, Fan Chen, Junyu Zhang, Zaiwen Wen

    Abstract: We study the optimal lower and upper complexity bounds for finding approximate solutions to the composite problem $\min_x\ f(x)+h(Ax-b)$, where $f$ is smooth and $h$ is convex. Given access to the proximal operator of $h$, for strongly convex, convex, and nonconvex $f$, we design efficient first order algorithms with complexities $\tilde{O}\left(κ_A\sqrt{κ_f}\log\left(1/ε\right)\right)$,…

    Submitted 12 August, 2023; originally announced August 2023.

    MSC Class: 90C25; 90C26; 90C46; 90C60