Showing 1–50 of 355 results for author: Xie, Q

  1. arXiv:2408.13518  [pdf, other]

    cs.CL cs.AI cs.LG

    Selective Preference Optimization via Token-Level Reward Function Estimation

    Authors: Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Erxue Min, Sophia Ananiadou

    Abstract: Recent advancements in large language model alignment leverage token-level supervisions to perform fine-grained preference optimization. However, existing token-level alignment methods either optimize on all available tokens, which can be noisy and inefficient, or perform selective training with complex and expensive key token selection strategies. In this work, we propose Selective Preference Opt…

    Submitted 24 August, 2024; originally announced August 2024.

    Comments: Work in progress

  2. arXiv:2408.11878  [pdf, other]

    cs.CL cs.CE q-fin.CP

    Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

    Authors: Qianqian Xie, Dong Li, Mengxi Xiao, Zihao Jiang, Ruoyu Xiang, Xiao Zhang, Zhengyu Chen, Yueru He, Weiguang Han, Yuzhe Yang, Shunian Chen, Yifei Zhang, Lihang Shen, Daniel Kim, Zhiwei Liu, Zheheng Luo, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Zhiyuan Yao, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, Peng Lu, et al. (14 additional authors not shown)

    Abstract: Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multi-modal inputs like tables and time series data. To address these limitations, we introduce Open-FinLLMs, a series of Financial LLMs. We begin with FinLLaMA, pre-trained on a 52 billion token financial corpus, incorporating text, table…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 33 pages, 13 figures

  3. arXiv:2408.07482  [pdf, other]

    cs.DC cs.AI

    Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems

    Authors: Ning Lu, Qian Xie, Hao Zhang, Wenyi Fang, Yang Zheng, Jiantao Ma

    Abstract: Large Language Models (LLMs) are revolutionizing the AI industry with their superior capabilities. Training these models requires large-scale GPU clusters and significant computing time, leading to frequent failures that significantly increase training costs. Despite its significance, this field lacks a metric for evaluating reliability. In this work, we introduce a novel reliability metric called…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: preprint, under review

  4. arXiv:2408.06197  [pdf, other]

    cs.CR cs.DC

    Lancelot: Towards Efficient and Privacy-Preserving Byzantine-Robust Federated Learning within Fully Homomorphic Encryption

    Authors: Siyang Jiang, Hao Yang, Qipeng Xie, Chuan Ma, Sen Wang, Guoliang Xing

    Abstract: In sectors such as finance and healthcare, where data governance is subject to rigorous regulatory requirements, the exchange and utilization of data are particularly challenging. Federated Learning (FL) has risen as a pioneering distributed machine learning paradigm that enables collaborative model training across multiple institutions while maintaining data decentralization. Despite its advantag…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 26 pages

  5. arXiv:2408.02927  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection

    Authors: Yuxin Wang, Duanyu Feng, Yongfu Dai, Zhengyu Chen, Jimin Huang, Sophia Ananiadou, Qianqian Xie, Hao Wang

    Abstract: Data serves as the fundamental foundation for advancing deep learning, particularly tabular data presented in a structured format, which is highly conducive to modeling. However, even in the era of LLM, obtaining tabular data from sensitive domains remains a challenge due to privacy or copyright concerns. Hence, exploring how to effectively use models like LLMs to generate realistic and privacy-pr…

    Submitted 5 August, 2024; originally announced August 2024.

  6. arXiv:2407.16541  [pdf, other]

    cs.CV cs.MM

    QPT V2: Masked Image Modeling Advances Visual Scoring

    Authors: Qizhi Xie, Kun Yuan, Yunpeng Qu, Mingda Wu, Ming Sun, Chao Zhou, Jihong Zhu

    Abstract: Quality assessment and aesthetics assessment aim to evaluate the perceived quality and aesthetics of visual content. Current learning-based methods suffer greatly from the scarcity of labeled data and usually perform sub-optimally in terms of generalization. Although masked image modeling (MIM) has achieved noteworthy advancements across various high-level tasks (e.g., classification, detection et…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 8 pages, 6 figures

  7. arXiv:2407.08986  [pdf]

    cs.CY

    Exploring Generative AI Policies in Higher Education: A Comparative Perspective from China, Japan, Mongolia, and the USA

    Authors: Qin Xie, Ming Li, Ariunaa Enkhtur

    Abstract: This study conducts a comparative analysis of national policies on Generative AI across four countries: China, Japan, Mongolia, and the USA. Employing the Qualitative Comparative Analysis (QCA) method, it examines the responses of these nations to Generative AI in higher education settings, scrutinizing the diversity in their approaches within this group. While all four countries exhibit a positiv…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 14 pages, 1 table

  8. arXiv:2407.06567  [pdf, other]

    cs.CL

    FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making

    Authors: Yangyang Yu, Zhiyuan Yao, Haohang Li, Zhiyang Deng, Yupeng Cao, Zhi Chen, Jordan W. Suchow, Rong Liu, Zhenyu Cui, Denghui Zhang, Koduvayur Subbalakshmi, Guojun Xiong, Yueru He, Jimin Huang, Dong Li, Qianqian Xie

    Abstract: Large language models (LLMs) have demonstrated notable potential in conducting complex tasks and are increasingly utilized in various financial applications. However, high-quality sequential financial investment decision-making remains challenging. These tasks require multiple interactions with a volatile environment for every decision, demanding sufficient intelligence to maximize returns and man…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: LLM Applications, LLM Agents, Financial Technology, Quantitative Finance, Algorithmic Trading, Cognitive Science

  9. arXiv:2407.00559  [pdf]

    physics.optics physics.app-ph

    Neural Network-Assisted End-to-End Design for Dispersive Full-Parameter Control of Meta-Optics

    Authors: Hanbin Chi, Yueqiang Hu, Xiangnian Ou, Yuting Jiang, Dian Yu, Shaozhen Lou, Quan Wang, Qiong Xie, Cheng-Wei Qiu, Huigao Duan

    Abstract: Flexible control light field across multiple parameters is the cornerstone of versatile and miniaturized optical devices. Metasurfaces, comprising subwavelength scatterers, offer a potent platform for executing such precise manipulations. However, the inherent mutual constraints between parameters of metasurfaces make it challenging for traditional approaches to achieve full-parameter control acro…

    Submitted 29 June, 2024; originally announced July 2024.

  10. arXiv:2406.20062  [pdf, other]

    cs.LG stat.ML

    Cost-aware Bayesian optimization via the Pandora's Box Gittins index

    Authors: Qian Xie, Raul Astudillo, Peter Frazier, Ziv Scully, Alexander Terenin

    Abstract: Bayesian optimization is a technique for efficiently optimizing unknown functions in a black-box manner. To handle practical settings where gathering data requires use of finite resources, it is desirable to explicitly incorporate function evaluation costs into Bayesian optimization policies. To understand how to do so, we develop a previously-unexplored connection between cost-aware Bayesian opti…

    Submitted 28 June, 2024; originally announced June 2024.

  11. arXiv:2406.18884  [pdf, other]

    cs.AI

    Sequential three-way group decision-making for double hierarchy hesitant fuzzy linguistic term set

    Authors: Nanfang Luo, Qinghua Zhang, Qin Xie, Yutai Wang, Longjun Yin, Guoyin Wang

    Abstract: Group decision-making (GDM) characterized by complexity and uncertainty is an essential part of various life scenarios. Most existing researches lack tools to fuse information quickly and interpret decision results for partially formed decisions. This limitation is particularly noticeable when there is a need to improve the efficiency of GDM. To address this issue, a novel multi-level sequential t…

    Submitted 27 June, 2024; originally announced June 2024.

  12. arXiv:2406.17114  [pdf, other]

    cs.LG cs.CR cs.GT

    Inception: Efficiently Computable Misinformation Attacks on Markov Games

    Authors: Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie

    Abstract: We study security threats to Markov games due to information asymmetry and misinformation. We consider an attacker player who can spread misinformation about its reward function to influence the robust victim player's behavior. Given a fixed fake reward function, we derive the victim's policy under worst-case rationality and present polynomial-time algorithms to compute the attacker's optimal wors…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted to Reinforcement Learning Conference (RLC) 2024

  13. arXiv:2406.17100  [pdf, other]

    cs.CV

    Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation

    Authors: Zhenyi Liao, Qingsong Xie, Chen Chen, Hannan Lu, Zhijie Deng

    Abstract: Diffusion models (DMs) have achieved significant success in generating imaginative images given textual descriptions. However, they are likely to fall short when it comes to real-life scenarios with intricate details. The low-quality, unrealistic human faces in text-to-image generation are one of the most prominent issues, hindering the wide application of DMs in practice. Targeting addressing such…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under review

  14. arXiv:2406.11328  [pdf, other]

    cs.CL

    Are Large Language Models True Healthcare Jacks-of-All-Trades? Benchmarking Across Health Professions Beyond Physician Exams

    Authors: Zheheng Luo, Chenhan Yuan, Qianqian Xie, Sophia Ananiadou

    Abstract: Recent advancements in Large Language Models (LLMs) have demonstrated their potential in delivering accurate answers to questions about world knowledge. Despite this, existing benchmarks for evaluating LLMs in healthcare predominantly focus on medical doctors, leaving other critical healthcare professions underrepresented. To fill this research gap, we introduce the Examinations for Medical Person…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 15 pages, 4 figures

  15. arXiv:2406.11093  [pdf, other]

    cs.CL

    RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning based on Emotional Information

    Authors: Zhiwei Liu, Kailai Yang, Qianqian Xie, Christine de Kock, Sophia Ananiadou, Eduard Hovy

    Abstract: Misinformation is prevalent in various fields such as education, politics, health, etc., causing significant harm to society. However, current methods for cross-domain misinformation detection rely on time and resources consuming fine-tuning and complex model structures. With the outstanding performance of LLMs, many studies have employed them for misinformation detection. Unfortunately, they focu…

    Submitted 16 June, 2024; originally announced June 2024.

  16. arXiv:2406.10816  [pdf, ps, other]

    cs.PL cs.AI cs.AR cs.PF

    Optimization of Armv9 architecture general large language model inference performance based on Llama.cpp

    Authors: Longhao Chen, Yina Zhao, Qiangjun Xie, Qinghua Sheng

    Abstract: This article optimizes the inference performance of the Qwen-1.8B model by performing Int8 quantization, vectorizing some operators in llama.cpp, and modifying the compilation script to improve the compiler optimization level. On the Yitian 710 experimental platform, the prefill performance is increased by 1.6 times, the decoding performance is increased by 24 times, the memory usage is reduced to…

    Submitted 16 June, 2024; originally announced June 2024.

  17. arXiv:2406.08847  [pdf, other]

    cs.GT cs.DS cs.LG

    Roping in Uncertainty: Robustness and Regularization in Markov Games

    Authors: Jeremy McMahan, Giovanni Artiglio, Qiaomin Xie

    Abstract: We study robust Markov games (RMG) with $s$-rectangular uncertainty. We show a general equivalence between computing a robust Nash equilibrium (RNE) of a $s$-rectangular RMG and computing a Nash equilibrium (NE) of an appropriately constructed regularized MG. The equivalence result yields a planning algorithm for solving $s$-rectangular RMGs, as well as provable robustness guarantees for policies…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  18. arXiv:2406.05768  [pdf, other]

    cs.CV cs.AI

    MLCM: Multistep Consistency Distillation of Latent Diffusion Model

    Authors: Qingsong Xie, Zhenyi Liao, Chen Chen, Zhijie Deng, Shixiang Tang, Haonan Lu

    Abstract: Distilling large latent diffusion models (LDMs) into ones that are fast to sample from is attracting growing research interest. However, the majority of existing methods face a dilemma where they either (i) depend on multiple individual distilled models for different sampling budgets, or (ii) sacrifice generation quality with limited (e.g., 2-4) and/or moderate (e.g., 5-8) sampling steps. To addre…

    Submitted 11 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  19. arXiv:2406.05064  [pdf, other]

    cs.LG

    Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning

    Authors: Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert Nowak

    Abstract: In this paper, we study multi-task structured bandit problem where the goal is to learn a near-optimal algorithm that minimizes cumulative regret. The tasks share a common structure and the algorithm exploits the shared structure to minimize the cumulative regret for an unseen but related test task. We use a transformer as a decision-making algorithm to learn this shared structure so as to general…

    Submitted 7 June, 2024; originally announced June 2024.

  20. arXiv:2406.01007  [pdf, other]

    hep-ex

    Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

    Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, et al. (177 additional authors not shown)

    Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive…

    Submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2406.00341  [pdf, other]

    eess.IV cs.CV

    DSCA: A Digital Subtraction Angiography Sequence Dataset and Spatio-Temporal Model for Cerebral Artery Segmentation

    Authors: Qihang Xie, Mengguo Guo, Lei Mou, Dan Zhang, Da Chen, Caifeng Shan, Yitian Zhao, Ruisheng Su, Jiong Zhang

    Abstract: Cerebrovascular diseases (CVDs) remain a leading cause of global disability and mortality. Digital Subtraction Angiography (DSA) sequences, recognized as the golden standard for diagnosing CVDs, can clearly visualize the dynamic flow and reveal pathological conditions within the cerebrovasculature. Therefore, precise segmentation of cerebral arteries (CAs) and classification between their main tru…

    Submitted 1 June, 2024; originally announced June 2024.

  22. arXiv:2406.00296  [pdf]

    quant-ph physics.comp-ph

    A Novel Quantum-Classical Hybrid Algorithm for Determining Eigenstate Energies in Quantum Systems

    Authors: Qing-Xing Xie, Yan Zhao

    Abstract: Developing efficient quantum computing algorithms is crucial for addressing computationally challenging problems across various fields. In this paper, we introduce a novel quantum XZ24 algorithm, designed for efficiently computing the eigen-energy spectra of any quantum systems. The algorithm employs an auxiliary qubit as a control qubit to execute a pair of time-reversing real-time evolutions of…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 20 pages, 5 figures

  23. arXiv:2405.21013  [pdf, other]

    cs.CV

    StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond

    Authors: Pengyuan Lyu, Yulin Li, Hao Zhou, Weihong Ma, Xingyu Wan, Qunyi Xie, Liang Wu, Chengquan Zhang, Kun Yao, Errui Ding, Jingdong Wang

    Abstract: Text-rich images have significant and extensive value, deeply integrated into various aspects of human life. Notably, both visual cues and linguistic symbols in text-rich images play crucial roles in information transmission but are accompanied by diverse challenges. Therefore, the efficient and effective understanding of text-rich images is a crucial litmus test for the capability of Vision-Langu…

    Submitted 4 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  24. arXiv:2405.17882  [pdf, ps, other]

    cs.LG math.OC math.PR

    When is exponential asymptotic optimality achievable in average-reward restless bandits?

    Authors: Yige Hong, Qiaomin Xie, Yudong Chen, Weina Wang

    Abstract: We consider the discrete-time infinite-horizon average-reward restless bandit problem. We propose a novel policy that maintains two dynamic subsets of arms: one subset of arms has a nearly optimal state distribution and takes actions according to an Optimal Local Control routine; the other subset of arms is driven towards the optimal state distribution and gradually merged into the first subset. W…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 46 pages, 1 figure

    MSC Class: 90C40 ACM Class: G.3; I.6

  25. arXiv:2405.17790  [pdf, other]

    cs.CV

    Instruct-ReID++: Towards Universal Purpose Instruction-Guided Person Re-identification

    Authors: Weizhen He, Yiheng Deng, Yunfeng Yan, Feng Zhu, Yizhou Wang, Lei Bai, Qingsong Xie, Donglian Qi, Wanli Ouyang, Shixiang Tang

    Abstract: Human intelligence can retrieve any person according to both visual and language descriptions. However, the current computer vision community studies specific person re-identification (ReID) tasks in different scenarios separately, which limits the applications in the real world. This paper strives to resolve this problem by proposing a novel instruct-ReID task that requires the model to retrieve…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2306.07520

  26. arXiv:2405.16732  [pdf, ps, other]

    stat.ML cs.LG math.OC math.ST

    The Collusion of Memory and Nonlinearity in Stochastic Approximation With Constant Stepsize

    Authors: Dongyan Huo, Yixuan Zhang, Yudong Chen, Qiaomin Xie

    Abstract: In this work, we investigate stochastic approximation (SA) with Markovian data and nonlinear updates under constant stepsize $α>0$. Existing work has primarily focused on either i.i.d. data or linear update rules. We take a new perspective and carefully examine the simultaneous presence of Markovian dependency of data and nonlinear update rules, delineating how the interplay between these two stru…

    Submitted 26 May, 2024; originally announced May 2024.

  27. arXiv:2405.12408  [pdf, other]

    cs.RO eess.SY

    Flexible Active Safety Motion Control for Robotic Obstacle Avoidance: A CBF-Guided MPC Approach

    Authors: Jinhao Liu, Jun Yang, Jianliang Mao, Tianqi Zhu, Qihang Xie, Yimeng Li, Xiangyu Wang, Shihua Li

    Abstract: A flexible active safety motion (FASM) control approach is proposed for the avoidance of dynamic obstacles and the reference tracking in robot manipulators. The distinctive feature of the proposed method lies in its utilization of control barrier functions (CBF) to design flexible CBF-guided safety criteria (CBFSC) with dynamically optimized decay rates, thereby offering flexibility and active saf…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages, 11 figures

  28. arXiv:2404.11098  [pdf, other]

    cs.CV

    LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

    Authors: Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, Haonan Lu

    Abstract: In the era of AIGC, the demand for low-budget or even on-device applications of diffusion models emerged. In terms of compressing the Stable Diffusion models (SDMs), several approaches have been proposed, and most of them leveraged the handcrafted layer removal methods to obtain smaller U-Nets, along with knowledge distillation to recover the network performance. However, such a handcrafting manne…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  29. arXiv:2404.06756  [pdf, other]

    cs.LG cs.AI

    CrimeAlarm: Towards Intensive Intent Dynamics in Fine-grained Crime Prediction

    Authors: Kaixi Hu, Lin Li, Qing Xie, Xiaohui Tao, Guandong Xu

    Abstract: Granularity and accuracy are two crucial factors for crime event prediction. Within fine-grained event classification, multiple criminal intents may alternately exhibit in preceding sequential events, and progress differently in next. Such intensive intent dynamics makes training models hard to capture unobserved intents, and thus leads to sub-optimal generalization performance, especially in the…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by DASFAA 2024

  30. arXiv:2404.06023  [pdf, other]

    stat.ML cs.LG math.OC math.PR

    Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA

    Authors: Yixuan Zhang, Dongyan Huo, Yudong Chen, Qiaomin Xie

    Abstract: Motivated by Q-learning, we study nonsmooth contractive stochastic approximation (SA) with constant stepsize. We focus on two important classes of dynamics: 1) nonsmooth contractive SA with additive noise, and 2) synchronous and asynchronous Q-learning, which features both additive and multiplicative noise. For both dynamics, we establish weak convergence of the iterates to a stationary limit dist…

    Submitted 24 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: ACM SIGMETRICS 2024. 71 pages, 3 figures

  31. arXiv:2404.01687  [pdf, other]

    hep-ex

    Search for a sub-eV sterile neutrino using Daya Bay's full dataset

    Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding, et al. (176 additional authors not shown)

    Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\bar{\nu}_e$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis…

    Submitted 20 August, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 7 pages, 4 figures, 1 table

  32. arXiv:2404.00236  [pdf, other]

    cs.IR cs.CL

    Enhancing Content-based Recommendation via Large Language Model

    Authors: Wentao Xu, Qianqian Xie, Shuo Yang, Jiangxia Cao, Shuchao Pang

    Abstract: In real-world applications, users express different behaviors when they interact with different items, including implicit click/like interactions, and explicit comments/reviews interactions. Nevertheless, almost all recommender works are focused on how to describe user preferences by the implicit click/like interactions, to find the synergy of people. For the content-based explicit comments/review…

    Submitted 27 July, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: Accepted at CIKM 2024

  33. arXiv:2403.20041  [pdf]

    cs.CL

    Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

    Authors: Luchang Li, Sheng Qian, Jie Lu, Lunxi Yuan, Rui Wang, Qin Xie

    Abstract: The Large Language Model (LLM) is widely employed for tasks such as intelligent assistants, text summarization, translation, and multi-modality on mobile phones. However, the current methods for on-device LLM deployment maintain slow inference speed, which causes poor user experience. To facilitate high-efficiency LLM deployment on device GPUs, we propose four optimization techniques: (a) a symbol…

    Submitted 5 July, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 21 pages, 6 figures, fix "E0M4" spell mistake, fix FLOPS to TFLOPS

  34. arXiv:2403.17141  [pdf, other]

    cs.CL cs.AI

    MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models

    Authors: Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Tianlin Zhang, Sophia Ananiadou

    Abstract: Recent advancements in large language models (LLMs) aim to tackle heterogeneous human expectations and values via multi-objective preference alignment. However, existing methods are parameter-adherent to the policy model, leading to two key limitations: (1) the high-cost repetition of their alignment algorithms for each new target model; (2) they cannot expand to unseen objectives due to their sta…

    Submitted 6 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Work in progress

  35. arXiv:2403.09993  [pdf, other]

    cs.CV eess.IV

    TRG-Net: An Interpretable and Controllable Rain Generator

    Authors: Zhiqiang Pang, Hong Wang, Qi Xie, Deyu Meng, Zongben Xu

    Abstract: Exploring and modeling rain generation mechanism is critical for augmenting paired data to ease training of rainy image processing models. Against this task, this study proposes a novel deep learning based rain generator, which fully takes the physical generation mechanism underlying rains into consideration and well encodes the learning of the fundamental rain factors (i.e., shape, orientation, l…

    Submitted 29 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  36. arXiv:2403.06249  [pdf, other]

    cs.CE cs.CL

    No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks

    Authors: Gang Hu, Ke Qin, Chenhan Yuan, Min Peng, Alejandro Lopez-Lira, Benyou Wang, Sophia Ananiadou, Jimin Huang, Qianqian Xie

    Abstract: While the progression of Large Language Models (LLMs) has notably propelled financial analysis, their application has largely been confined to singular language realms, leaving untapped the potential of bilingual Chinese-English capacity. To bridge this chasm, we introduce ICE-PIXIU, seamlessly amalgamating the ICE-INTENT model and ICE-FLARE benchmark for bilingual financial analysis. ICE-PIXIU un…

    Submitted 16 August, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 19 pages, 3 figures, 12 tables, including Appendix

  37. arXiv:2403.05574  [pdf, other]

    cs.HC cs.AI cs.CL

    HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy

    Authors: Mengxi Xiao, Qianqian Xie, Ziyan Kuang, Zhicheng Liu, Kailai Yang, Min Peng, Weiguang Han, Jimin Huang

    Abstract: Large Language Models (LLMs) can play a vital role in psychotherapy by adeptly handling the crucial task of cognitive reframing and overcoming challenges such as shame, distrust, therapist skill variability, and resource scarcity. Previous LLMs in cognitive reframing mainly converted negative emotions to positive ones, but these approaches have limited efficacy, often not promoting clients' self-d…

    Submitted 29 July, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

    Comments: 19 pages, 4 figures

    ACM Class: J.4

  38. arXiv:2403.05049  [pdf, other]

    cs.CV

    XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

    Authors: Yunpeng Qu, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou

    Abstract: Diffusion-based methods, endowed with a formidable generative prior, have received increasing attention in Image Super-Resolution (ISR) recently. However, as low-resolution (LR) images often undergo severe degradation, it is challenging for ISR models to perceive the semantic and degradation information, resulting in restoration images with incorrect content or unrealistic artifacts. To address th…

    Submitted 19 July, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 19 pages, 7 figures; including supplementary material

  39. arXiv:2403.01505  [pdf, other]

    cs.CV

    SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

    Authors: Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-jun Zha, Haonan Lu

    Abstract: The iterative sampling procedure employed by diffusion models (DMs) often leads to significant inference latency. To address this, we propose Stochastic Consistency Distillation (SCott) to enable accelerated text-to-image generation, where high-quality generations can be achieved with just 1-2 sampling steps, and further improvements can be obtained by adding additional steps. In contrast to vanil…

    Submitted 15 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 22 pages, 16 figures

  40. arXiv:2402.18180  [pdf, other]

    cs.CY

    Human Simulacra: Benchmarking the Personification of Large Language Models

    Authors: Qiuejie Xie, Qiming Feng, Tianqi Zhang, Qingqiu Li, Linyi Yang, Yuejie Zhang, Rui Feng, Liang He, Shang Gao, Yue Zhang

    Abstract: Large language models (LLMs) are recognized as systems that closely mimic aspects of human intelligence. This capability has attracted attention from the social science community, who see the potential in leveraging LLMs to replace human participants in experiments, thereby reducing research costs and complexity. In this paper, we introduce a framework for large language models personification, in…

    Submitted 9 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  41. arXiv:2402.13758  [pdf, other]

    cs.CL

    Factual Consistency Evaluation of Summarisation in the Era of Large Language Models

    Authors: Zheheng Luo, Qianqian Xie, Sophia Ananiadou

    Abstract: Factual inconsistency with source documents in automatically generated summaries can lead to misinformation or pose risks. Existing factual consistency (FC) metrics are constrained by their performance, efficiency, and explainability. Recent advances in Large language models (LLMs) have demonstrated remarkable potential in text evaluation but their effectiveness in assessing FC in summarisation rem…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 5 figures

  42. arXiv:2402.13498  [pdf, other]

    cs.CL

    The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models

    Authors: Zheheng Luo, Qianqian Xie, Sophia Ananiadou

    Abstract: Automated lay summarisation (LS) aims to simplify complex technical documents into a more accessible format to non-experts. Existing approaches using pre-trained language models, possibly augmented with external background knowledge, tend to struggle with effective simplification and explanation. Moreover, automated methods that can effectively assess the `layness' of generated summaries are lacki…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 18 pages, 4 figures

  43. arXiv:2402.12749  [pdf]

    cs.CL cs.AI

    Me LLaMA: Foundation Large Language Models for Medical Applications

    Authors: Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

    Abstract: Recent advancements in large language models (LLMs) such as ChatGPT and LLaMA have hinted at their potential to revolutionize medical applications, yet their application in clinical settings often reveals limitations due to a lack of specialized training on medical-specific data. In response to this challenge, this study introduces Me-LLaMA, a novel medical LLM family that includes foundation mode…

    Submitted 11 April, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 21 pages, 3 figures, 8 tables

  44. arXiv:2402.12659  [pdf, other]

    cs.CL cs.AI cs.CE

    FinBen: A Holistic Financial Benchmark for Large Language Models

    Authors: Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu , et al. (9 additional authors not shown)

    Abstract: LLMs have transformed NLP and shown promise in various fields, yet their potential in finance is underexplored due to a lack of comprehensive evaluation benchmarks, the rapid development of LLMs, and the complexity of financial tasks. In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical…

    Submitted 18 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 26 pages, 11 figures

  45. arXiv:2402.07405  [pdf, other]

    cs.CL

    Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English

    Authors: Xiao Zhang, Ruoyu Xiang, Chenhan Yuan, Duanyu Feng, Weiguang Han, Alejandro Lopez-Lira, Xiao-Yang Liu, Sophia Ananiadou, Min Peng, Jimin Huang, Qianqian Xie

    Abstract: Despite Spanish's pivotal role in the global finance industry, a pronounced gap exists in Spanish financial natural language processing (NLP) and application studies compared to English, especially in the era of large language models (LLMs). To bridge this gap, we unveil Toisón de Oro, the first bilingual framework that establishes instruction datasets, finetuned LLMs, and evaluation benchmark for…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: 10 pages, 2 figures

  46. arXiv:2402.07220  [pdf, other]

    eess.IV cs.CV

    KVQ: Kwai Video Quality Assessment for Short-form Videos

    Authors: Yiting Lu, Xin Li, Yajing Pei, Kun Yuan, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen

    Abstract: Short-form UGC video platforms, like Kwai and TikTok, have been an emerging and irreplaceable mainstream media form, thriving on user-friendly engagement, and kaleidoscope creation, etc. However, the advancing content-generation modes, e.g., special effects, and sophisticated processing workflows, e.g., de-artifacts, have introduced significant challenges to recent UGC video quality assessment: (i…

    Submitted 20 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: 19 pages

  47. arXiv:2402.05689  [pdf, other]

    cs.LG math.OC math.PR

    Unichain and Aperiodicity are Sufficient for Asymptotic Optimality of Average-Reward Restless Bandits

    Authors: Yige Hong, Qiaomin Xie, Yudong Chen, Weina Wang

    Abstract: We consider the infinite-horizon, average-reward restless bandit problem in discrete time. We propose a new class of policies that are designed to drive a progressively larger subset of arms toward the optimal distribution. We show that our policies are asymptotically optimal with an $O(1/\sqrt{N})$ optimality gap for an $N$-armed problem, provided that the single-armed MDP is unichain and aperiod…

    Submitted 13 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 49 pages, 3 figures. This version adds details on the unichain condition, stationary distribution, and long-run time average

    MSC Class: 90C40 ACM Class: G.3; I.6

  48. arXiv:2402.05383  [pdf, other]

    nucl-ex hep-ex

    First measurement of the yield of $^8$He isotopes produced in liquid scintillator by cosmic-ray muons at Daya Bay

    Authors: Daya Bay Collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding , et al. (177 additional authors not shown)

    Abstract: Daya Bay presents the first measurement of cosmogenic $^8$He isotope production in liquid scintillator, using an innovative method for identifying cascade decays of $^8$He and its child isotope, $^8$Li. We also measure the production yield of $^9$Li isotopes using well-established methodology. The results, in units of 10$^{-8}\mu^{-1}$g$^{-1}$cm$^{2}$, are 0.307$\pm$0.042, 0.341$\pm$0.040, and 0.546…

    Submitted 7 February, 2024; originally announced February 2024.

  49. arXiv:2402.03193  [pdf, other]

    hep-th gr-qc hep-ph

    Spinning $Q$-ball Superradiance in 3+1D

    Authors: Guo-Dong Zhang, Fu-Ming Chang, Paul M. Saffin, Qi-Xin Xie, Shuang-Yong Zhou

    Abstract: Recently, it has been found that a $Q$-ball can amplify waves incident upon it, due to rotation in the internal space and the interaction of the two modes in the complex scalar field. While the spherically symmetric 3D case has been investigated previously, here we explore the 3D axi-symmetric case, which is numerically much more challenging. The difficulty comes because a partial wave expansion i…

    Submitted 26 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 20 pages, 14 figures

    Report number: USTC-ICTS/PCFT-23-17

  50. arXiv:2401.14758  [pdf, other]

    cs.LG

    Off-Policy Primal-Dual Safe Reinforcement Learning

    Authors: Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

    Abstract: Primal-dual safe RL methods commonly perform iterations between the primal update of the policy and the dual update of the Lagrange Multiplier. Such a training paradigm is highly susceptible to the error in cumulative cost estimation since this estimation serves as the key bond connecting the primal and dual update processes. We show that this problem causes significant underestimation of cost whe…

    Submitted 15 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 Poster