Showing 1–50 of 713 results for author: Wei, F

  1. arXiv:2408.17224  [pdf, other]

    hep-ex

    Hadronic cross section measurements with the DAMPE space mission using 20GeV-10TeV cosmic-ray protons and $^4$He

    Authors: F. Alemanno, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De Benedittis, I. De Mitri, F. de Palma, A. Di Giovanni, Q. Ding, T. K. Dong, et al. (126 additional authors not shown)

    Abstract: Precise direct cosmic-ray (CR) measurements provide an important probe to study the energetic particle sources in our Galaxy, and the interstellar environment through which these particles propagate. Uncertainties on hadronic models, ion-nucleon cross sections in particular, are currently the limiting factor towards obtaining more accurate CR ion flux measurements with calorimetric space-based exp…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: 17 pages, submitted to PRD

  2. arXiv:2408.16229  [pdf, ps, other]

    hep-ex

    Upgrading the existing Haloscope-type detector for sensitive axion detection

    Authors: L. Gao, H. Zheng, X. N. Feng, L. B. Zhao, L. F. Wei

    Abstract: The haloscope is one of the typical installations used to detect the electromagnetic responses (EMRs) of the axion field in the radio-frequency (rf) band. Given that the detection by the existing Haloscope-type detector (HTD), biased only by a high stationary magnetic field, is just the second axion-photon energy converted effect and thus the detectable signal is still significantly weak, here we propose a feasible…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 22 pages, 3 figures

  3. arXiv:2408.12082  [pdf, other]

    math.CO

    Extremal number of cliques of given orders in graphs with a forbidden clique minor

    Authors: Ruilin Shi, Fan Wei

    Abstract: Alon and Shikhelman initiated the systematic study of a generalization of the extremal function. Motivated by algorithmic applications, the study of the extremal function $\text{ex}(n, K_k, K_t\text{-minor})$, i.e., the number of cliques of order $k$ in $K_t$-minor free graphs on $n$ vertices, has received much attention. In this paper, we determine essentially sharp bounds on the maximum possible…

    Submitted 21 August, 2024; originally announced August 2024.

  4. arXiv:2408.11144  [pdf, other]

    hep-ex nucl-ex

    Measurement of inclusive jet cross section and substructure in $p$$+$$p$ collisions at $\sqrt{s_{_{NN}}}=200$ GeV

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, J. Alexander, M. Alfred, V. Andrieux, S. Antsupov, K. Aoki, N. Apadula, H. Asano, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, X. Bai, N. S. Bandara, B. Bannier, E. Bannikov, K. N. Barish, S. Bathe, et al. (422 additional authors not shown)

    Abstract: The jet cross-section and jet-substructure observables in $p$$+$$p$ collisions at $\sqrt{s}=200$ GeV were measured by the PHENIX Collaboration at the Relativistic Heavy Ion Collider (RHIC). Jets are reconstructed from charged-particle tracks and electromagnetic-calorimeter clusters using the anti-$k_{t}$ algorithm with a jet radius $R=0.3$ for jets with transverse momentum within $8.0<p_T<40.0$ Ge…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 446 authors from 77 institutions, 11 pages, 8 figures. v1 is version submitted to Physical Review D. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  5. arXiv:2408.10926  [pdf, other]

    astro-ph.IM hep-ex hep-ph

    GRANDlib: A simulation pipeline for the Giant Radio Array for Neutrino Detection (GRAND)

    Authors: GRAND Collaboration, Rafael Alves Batista, Aurélien Benoit-Lévy, Teresa Bister, Martina Bohacova, Mauricio Bustamante, Washington Carvalho, Yiren Chen, LingMei Cheng, Simon Chiche, Jean-Marc Colley, Pablo Correa, Nicoleta Cucu Laurenciu, Zigao Dai, Rogerio M. de Almeida, Beatriz de Errico, Sijbrand de Jong, João R. T. de Mello Neto, Krijn D. de Vries, Valentin Decoene, Peter B. Denton, Bohao Duan, Kaikai Duan, Ralph Engel, William Erba, et al. (90 additional authors not shown)

    Abstract: The operation of upcoming ultra-high-energy cosmic-ray, gamma-ray, and neutrino radio-detection experiments, like the Giant Radio Array for Neutrino Detection (GRAND), poses significant computational challenges involving the production of numerous simulations of particle showers and their detection, and a high data throughput. GRANDlib is an open-source software tool designed to meet these challen…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 11 pages, 9 figures, plus appendices

  6. arXiv:2407.19499  [pdf, other]

    quant-ph

    Optimization for expectation value estimation with shallow quantum circuits

    Authors: Bujiao Wu, Yuxuan Yan, Fuchuan Wei, Zhenhuan Liu

    Abstract: Estimating linear properties of quantum states, such as fidelities, molecular energies, and correlation functions, is a fundamental task in quantum information science. The classical shadow has emerged as a prevalent tool due to its efficiency in estimating many independent observables simultaneously. However, it does not utilize the information of the target observable and the constraints of quan…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 14 pages, 4 figures

  7. arXiv:2407.19233  [pdf, ps, other]

    math.PR math-ph math.NT

    Exchangeable arrays and integrable systems for characteristic polynomials of random matrices

    Authors: Theodoros Assiotis, Mustafa Alper Gunes, Jonathan P. Keating, Fei Wei

    Abstract: The joint moments of the derivatives of the characteristic polynomial of a random unitary matrix, and also a variant of the characteristic polynomial that is real on the unit circle, in the large matrix size limit, have been studied intensively in the past twenty-five years, partly in relation to conjectural connections to the Riemann zeta-function and Hardy's function. We completely settle the mo…

    Submitted 20 August, 2024; v1 submitted 27 July, 2024; originally announced July 2024.

    Comments: 64 pages, typos corrected and more details given in Section 4

  8. arXiv:2407.16429  [pdf, other]

    hep-th

    Soft theorems based on differential operators from gravity to Yang-Mills and BAS

    Authors: Fang-Stars Wei

    Abstract: This note studies the soft behavior of Yang-Mills (YM) and bi-adjoint scalar (BAS) amplitudes at tree level, by using transmutation operators proposed by Cheung, Shen and Wen. By acting with such transmutation operators on gravity amplitudes in the soft limit, we reproduce universal soft factors of YM amplitudes at the leading and sub-leading orders, and explain that the analogous universal soft behavior…

    Submitted 23 July, 2024; originally announced July 2024.

  9. arXiv:2407.15024  [pdf, ps, other]

    math.NT

    Algebraic relations of special $v$-adic arithmetic gamma values and their period interpretations

    Authors: Chieh-Yu Chang, Fu-Tsun Wei, Jing Yu

    Abstract: Let $v$ be a finite place of $\mathbb{F}_q(θ)$. We show that algebraic relations of $v$-adic arithmetic gamma values over $\mathbb{F}_q(θ)$ are explained by the standard functional equations together with Thakur's analogue of the Gross-Koblitz formula. A key step is working out a formula expressing $v$-adic crystalline-de Rham periods of Carlitz motives with Complex Multiplication as products of t…

    Submitted 20 July, 2024; originally announced July 2024.

    MSC Class: 11R58; 11J93

  10. arXiv:2407.11473  [pdf, other]

    cs.LG quant-ph

    Quantum Maximum Entropy Inference and Hamiltonian Learning

    Authors: Minbo Gao, Zhengfeng Ji, Fuchao Wei

    Abstract: Maximum entropy inference and learning of graphical models are pivotal tasks in learning theory and optimization. This work extends algorithms for these problems, including generalized iterative scaling (GIS) and gradient descent (GD), to the quantum realm. While the generalization, known as quantum iterative scaling (QIS), is straightforward, the key challenge lies in the non-commutative nature o…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 7 figures

  11. arXiv:2407.10969  [pdf, other]

    cs.CL cs.LG

    Q-Sparse: All Large Language Models can be Fully Sparsely-Activated

    Authors: Hongyu Wang, Shuming Ma, Ruiping Wang, Furu Wei

    Abstract: We introduce Q-Sparse, a simple yet effective approach to training sparsely-activated large language models (LLMs). Q-Sparse enables full sparsity of activations in LLMs which can bring significant efficiency gains in inference. This is achieved by applying top-K sparsification to the activations and the straight-through-estimator to the training. We also introduce Block Q-Sparse for batch traini…

    Submitted 24 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Work in progress

  12. arXiv:2407.08586  [pdf, other]

    nucl-ex

    Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, H. Al-Ta'ani, J. Alexander, A. Angerami, K. Aoki, N. Apadula, Y. Aramaki, H. Asano, E. C. Aschenauer, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, B. Bannier, K. N. Barish, B. Bassalleck, S. Bathe, et al. (377 additional authors not shown)

    Abstract: The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 401 authors from 75 institutions, 20 pages, 15 figures, 2 tables. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  13. arXiv:2407.08551  [pdf, other]

    cs.CL cs.SD eess.AS

    Autoregressive Speech Synthesis without Vector Quantization

    Authors: Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei

    Abstract: We present MELLE, a novel continuous-valued tokens based language modeling approach for text to speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from text condition, bypassing the need for vector quantization, which was originally designed for audio compression and sacrifices fidelity compared to mel-spectrograms. Specifically, (i) instead of cross…

    Submitted 11 July, 2024; originally announced July 2024.

  14. arXiv:2407.06642  [pdf, other]

    cs.CV

    Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

    Authors: Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li

    Abstract: Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based generation models, the visual structure and details of the object are often unexpectedly changed during the diffusion process. One major reason is that these diffusio…

    Submitted 18 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  15. arXiv:2407.06112  [pdf, other]

    cs.CL

    Enhancing Language Model Rationality with Bi-Directional Deliberation Reasoning

    Authors: Yadong Zhang, Shaoguang Mao, Wenshan Wu, Yan Xia, Tao Ge, Man Lan, Furu Wei

    Abstract: This paper introduces BI-Directional DEliberation Reasoning (BIDDER), a novel reasoning approach to enhance the decision rationality of language models. Traditional reasoning methods typically rely on historical information and employ a uni-directional (left-to-right) reasoning strategy. This lack of bi-directional deliberation reasoning results in limited awareness of potential future outcomes and…

    Submitted 8 July, 2024; originally announced July 2024.

  16. arXiv:2407.03088  [pdf, other]

    quant-ph

    The sudden death of quantum advantage in correlation generations

    Authors: Weixiao Sun, Fuchuan Wei, Yuguo Shao, Zhaohui Wei

    Abstract: As quantum error correction still cannot be realized physically, quantum noise is the most profound obstacle to the implementation of large-scale quantum algorithms or quantum schemes. It is well known that if a quantum computer suffers from too strong quantum noise, its running can be easily simulated by a classical computer, making the quantum advantage impossible. Generally speaking, ho…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 21 pages, 1 figure. Comments are welcome

  17. arXiv:2407.01491  [pdf, other]

    cs.CL cs.CV

    Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning

    Authors: Siwei Li, Yifan Yang, Yifei Shen, Fangyun Wei, Zongqing Lu, Lili Qiu, Yuqing Yang

    Abstract: Efficient fine-tuning plays a fundamental role in modern large models, with low-rank adaptation emerging as a particularly promising approach. However, the existing variants of LoRA are hampered by limited expressiveness, a tendency to overfit, and sensitivity to hyperparameter settings. This paper presents LoRA Slow Cascade Learning (LoRASC), an innovative technique designed to enhance LoRA's exp…

    Submitted 1 July, 2024; originally announced July 2024.

  18. arXiv:2406.19776  [pdf, other]

    cs.MM cs.IR

    MDF: A Dynamic Fusion Model for Multi-modal Fake News Detection

    Authors: Hongzhen Lv, Wenzhong Yang, Fuyuan Wei, Jiaren Peng, Haokun Geng

    Abstract: Fake news detection has received increasing attention from researchers in recent years, especially multi-modal fake news detection containing both text and images. However, many previous works have fed two modal features, text and image, into a binary classifier after a simple concatenation or attention mechanism, in which the features contain a large amount of noise inherent in the data, which in…

    Submitted 28 June, 2024; originally announced June 2024.

  19. arXiv:2406.19774  [pdf, other]

    cs.CL

    Direct Preference Knowledge Distillation for Large Language Models

    Authors: Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

    Abstract: In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in distillation of LLMs, including efficiency and insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward…

    Submitted 28 June, 2024; originally announced June 2024.

  20. arXiv:2406.17404  [pdf, other]

    cs.CL cs.LG

    Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

    Authors: Yixuan Wang, Xianzhen Luo, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to the new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning st…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures

  21. arXiv:2406.16866  [pdf, other]

    cs.CV

    Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models

    Authors: Jierun Chen, Fangyun Wei, Jinjing Zhao, Sizhe Song, Bohuai Wu, Zhuoxuan Peng, S. -H. Gary Chan, Hongyang Zhang

    Abstract: Referring expression comprehension (REC) involves localizing a target instance based on a textual description. Recent advancements in REC have been driven by large multimodal models (LMMs) like CogVLM, which achieved 92.44% accuracy on RefCOCO. However, this study questions whether existing benchmarks such as RefCOCO, RefCOCO+, and RefCOCOg, capture LMMs' comprehensive capabilities. We begin with…

    Submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.16858  [pdf, other]

    cs.CL cs.LG

    EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees

    Authors: Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang Zhang

    Abstract: Inference with modern Large Language Models (LLMs) is expensive and time-consuming, and speculative sampling has proven to be an effective solution. Most speculative sampling methods such as EAGLE use a static draft tree, implicitly assuming that the acceptance rate of draft tokens depends only on their position. Interestingly, we found that the acceptance rate of draft tokens is also context-depe…

    Submitted 30 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  23. arXiv:2406.14491  [pdf, other]

    cs.CL

    Instruction Pre-Training: Language Models are Supervised Multitask Learners

    Authors: Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei

    Abstract: Unsupervised multitask pre-training has been the critical method behind the recent success of language models (LMs). However, supervised multitask learning still holds significant promise, as scaling it in the post-training stage trends towards better generalization. In this paper, we explore supervised multitask pre-training by proposing Instruction Pre-Training, a framework that scalably augment…

    Submitted 20 June, 2024; originally announced June 2024.

  24. arXiv:2406.11837  [pdf, other]

    cs.CV

    Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%

    Authors: Lei Zhu, Fangyun Wei, Yanye Lu, Dong Chen

    Abstract: In the realm of image quantization exemplified by VQGAN, the process encodes images into discrete tokens drawn from a codebook with a predefined size. Recent advancements, particularly with LLAMA 3, reveal that enlarging the codebook significantly enhances model performance. However, VQGAN and its derivatives, such as VQGAN-FC (Factorized Codes) and VQGAN-EMA, continue to grapple with challenges r…

    Submitted 17 June, 2024; originally announced June 2024.

  25. arXiv:2406.11698  [pdf, other]

    cs.CL

    Meta Reasoning for Large Language Models

    Authors: Peizhong Gao, Ao Xie, Shaoguang Mao, Wenshan Wu, Yan Xia, Haipeng Mi, Furu Wei

    Abstract: We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs) inspired by human meta-reasoning. Traditional in-context learning-based reasoning techniques, such as Tree-of-Thoughts, show promise but lack consistent state-of-the-art performance across diverse tasks due to their specialized nature. MRP addresses this limitation by guiding…

    Submitted 17 June, 2024; originally announced June 2024.

  26. arXiv:2406.10881  [pdf, other]

    cs.CL

    Teaching Large Language Models to Express Knowledge Boundary from Their Own Signals

    Authors: Lida Chen, Zujie Liang, Xintao Wang, Jiaqing Liang, Yanghua Xiao, Feng Wei, Jinglei Chen, Zhenghong Hao, Bing Han, Wei Wang

    Abstract: Large language models (LLMs) have achieved great success, but their occasional content fabrication, or hallucination, limits their practical application. Hallucination arises because LLMs struggle to admit ignorance due to inadequate training on knowledge boundaries. We call it a limitation of LLMs that they cannot accurately express their knowledge boundary, answering questions they know while a…

    Submitted 16 June, 2024; originally announced June 2024.

  27. arXiv:2406.10505  [pdf, other]

    cs.CL

    CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding

    Authors: Libo Qin, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che

    Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this proble…

    Submitted 15 June, 2024; originally announced June 2024.

  28. arXiv:2406.08301  [pdf, other]

    nucl-ex

    Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

    Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is obs…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  29. arXiv:2406.07855  [pdf, other]

    cs.CL cs.SD eess.AS

    VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

    Authors: Bing Han, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Yanming Qian, Yanqing Liu, Sheng Zhao, Jinyu Li, Furu Wei

    Abstract: With the help of discrete neural audio codecs, large language models (LLMs) have increasingly been recognized as a promising methodology for zero-shot Text-to-Speech (TTS) synthesis. However, sampling based decoding strategies bring astonishing diversity to generation, but also pose robustness issues such as typos, omissions and repetition. In addition, the high sampling rate of audio also brings h…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 5 figures

  30. arXiv:2406.05774  [pdf, other]

    cs.CV

    VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction

    Authors: Hanlin Chen, Fangyin Wei, Chen Li, Tianxin Huang, Yunsong Wang, Gim Hee Lee

    Abstract: Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation. Previous works improve the surface by incorporating geometric priors from the off-the-shelf normal estimator. However, there are two main limitations: 1) Supervising normal rendered from 3D…

    Submitted 9 June, 2024; originally announced June 2024.

  31. arXiv:2406.05370  [pdf, other]

    cs.CL cs.SD eess.AS

    VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

    Authors: Sanyuan Chen, Shujie Liu, Long Zhou, Yanqing Liu, Xu Tan, Jinyu Li, Sheng Zhao, Yao Qian, Furu Wei

    Abstract: This paper introduces VALL-E 2, the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time. Based on its predecessor, VALL-E, the new iteration introduces two significant enhancements: Repetition Aware Sampling refines the original nucleus sampling process by accounting for token repetition in…

    Submitted 17 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: Demo posted

  32. arXiv:2406.04622  [pdf, other]

    hep-th

    On soft factors and transmutation operators

    Authors: Fang-Stars Wei, Kang Zhou

    Abstract: The well-known soft theorems state the specific factorizations of tree level gravitational (GR) amplitudes at leading, sub-leading and sub-sub-leading orders, with universal soft factors. For Yang-Mills (YM) amplitudes, similar factorizations and universal soft factors are found at leading and sub-leading orders. Then it is natural to ask if the similar factorizations and soft factors exist at hig…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 28 pages, 1 figure

  33. arXiv:2405.17890  [pdf, other]

    cs.IR cs.CL cs.LG

    SLMRec: Empowering Small Language Models for Sequential Recommendation

    Authors: Wujiang Xu, Zujie Liang, Jiaojiao Han, Xuying Ning, Wenfang Lin, Linxun Chen, Feng Wei, Yongfeng Zhang

    Abstract: The sequential Recommendation (SR) task involves predicting the next item a user is likely to interact with, given their past interactions. The SR models examine the sequence of a user's actions to discern more complex behavioral patterns and temporal dynamics. Recent research demonstrates the great impact of LLMs on sequential recommendation systems, either viewing sequential recommendation as la…

    Submitted 28 May, 2024; originally announced May 2024.

  34. arXiv:2405.16803  [pdf, other]

    cs.CV

    TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing

    Authors: Xinyu Zhang, Mengxue Kang, Fei Wei, Shuang Xu, Yuhe Liu, Lin Ma

    Abstract: As the field of image generation rapidly advances, traditional diffusion models and those integrated with multimodal large language models (LLMs) still encounter limitations in interpreting complex prompts and preserving image consistency pre- and post-editing. To tackle these challenges, we present an innovative image editing framework that employs the robust Chain-of-Thought (CoT) reasoning and l…

    Submitted 26 May, 2024; originally announced May 2024.

  35. arXiv:2405.13792  [pdf, other]

    cs.CL cs.AI cs.IR

    xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token

    Authors: Xin Cheng, Xun Wang, Xingxing Zhang, Tao Ge, Si-Qing Chen, Furu Wei, Huishuai Zhang, Dongyan Zhao

    Abstract: This paper introduces xRAG, an innovative context compression method tailored for retrieval-augmented generation. xRAG reinterprets document embeddings in dense retrieval--traditionally used solely for retrieval--as features from the retrieval modality. By employing a modality fusion methodology, xRAG seamlessly integrates these embeddings into the language model representation space, effectively…

    Submitted 22 May, 2024; originally announced May 2024.

  36. arXiv:2405.12130  [pdf, other]

    cs.CL cs.LG

    MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

    Authors: Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

    Abstract: Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. In this paper, we analyze the impact of low-rank updating, as implemented in LoRA. Our findings suggest that the low-rank updating mechanism may limit the ability of LLMs to effectively learn and memorize new knowledge. Inspired by this observation, we propose a new method called MoRA, which employs…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Work in Progress

  37. arXiv:2405.08399  [pdf]

    cond-mat.mtrl-sci

    Exploring material compositions for synthesis using oxidation states

    Authors: Maung Thway, Andy Paul Chen, Haiwen Dai, Jose Recatala-Gomez, Siyu Isaac Parker Tian, Ruiming Zhu, Wenhao Zhai, Fengxia Wei, D. V. Maheshwar Repaka, Tonio Buonassisi, Pieremanuele Canepa, Kedar Hippalgaonkar

    Abstract: Recent advances in machine learning techniques have made it possible to use high-throughput screening to identify novel materials with specific properties. However, the large number of potential candidates produced by these techniques can make it difficult to select the most promising ones. In this study, we develop the oxidation state probability (OSP) method which evaluates ternary compounds bas…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 11 pages, 3 figures

  38. arXiv:2405.05254  [pdf, other]

    cs.CL

    You Only Cache Once: Decoder-Decoder Architectures for Language Models

    Authors: Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei

    Abstract: We introduce a decoder-decoder architecture, YOCO, for large language models, which only caches key-value pairs once. It consists of two components, i.e., a cross-decoder stacked upon a self-decoder. The self-decoder efficiently encodes global key-value (KV) caches that are reused by the cross-decoder via cross-attention. The overall model behaves like a decoder-only Transformer, although YOCO onl…

    Submitted 9 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  39. arXiv:2405.01924  [pdf, other]

    cs.CL cs.AI cs.IR

    Semi-Parametric Retrieval via Binary Token Index

    Authors: Jiawei Zhou, Li Dong, Furu Wei, Lei Chen

    Abstract: The landscape of information retrieval has broadened from search services to a critical component in various advanced applications, where indexing efficiency, cost-effectiveness, and freshness are increasingly important yet remain less explored. To address these demands, we introduce Semi-parametric Vocabulary Disentangled Retrieval (SVDR). SVDR is a novel semi-parametric retrieval framework that…

    Submitted 3 May, 2024; originally announced May 2024.

  40. arXiv:2405.00980  [pdf, other]

    cs.CL cs.CV

    A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News

    Authors: Zhe Niu, Ronglai Zuo, Brian Mak, Fangyun Wei

    Abstract: This paper introduces TVB-HKSL-News, a new Hong Kong sign language (HKSL) dataset collected from a TV news program over a period of 7 months. The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT). It consists of 16.07 hours of sign videos of two signers with a vocabulary of 6,515 glosses (for…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by LREC-COLING 2024

  41. arXiv:2404.15100  [pdf, other]

    cs.CV cs.MM

    Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation

    Authors: Xun Wu, Shaohan Huang, Furu Wei

    Abstract: Recent studies have demonstrated the exceptional potentials of leveraging human preference datasets to refine text-to-image generative models, enhancing the alignment between generated images and textual prompts. Despite these advances, current human preference datasets are either prohibitively expensive to construct or suffer from a lack of diversity in preference dimensions, resulting in limited…

    Submitted 23 April, 2024; originally announced April 2024.

  42. arXiv:2404.15045  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-Head Mixture-of-Experts

    Authors: Xun Wu, Shaohan Huang, Wenhui Wang, Furu Wei

    Abstract: Sparse Mixtures of Experts (SMoE) scales model capacity without significant increases in training and inference costs, but exhibits the following two issues: (1) Low expert activation, where only a small subset of experts are activated for optimization. (2) Lacking fine-grained analytical capabilities for multiple semantic concepts within individual tokens. We propose Multi-Head Mixture-of-Experts…

    Submitted 23 April, 2024; originally announced April 2024.

  43. arXiv:2404.13628  [pdf, other]

    cs.CL cs.LG cs.MM

    Mixture of LoRA Experts

    Authors: Xun Wu, Shaohan Huang, Furu Wei

    Abstract: LoRA has gained widespread acceptance in the fine-tuning of large pre-trained models to cater to a diverse array of downstream tasks, showcasing notable effectiveness and efficiency, thereby solidifying its position as one of the most prevalent fine-tuning techniques. Due to the modular nature of LoRA's plug-and-play plugins, researchers have delved into the amalgamation of multiple LoRAs to empow…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 17 pages, 11 figures

  44. arXiv:2404.12096  [pdf, other]

    cs.CL cs.LG

    LongEmbed: Extending Embedding Models for Long Context Retrieval

    Authors: Dawei Zhu, Liang Wang, Nan Yang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li

    Abstract: Embedding models play a pivotal role in modern NLP applications such as IR and RAG. While the context limit of LLMs has been pushed beyond 1 million tokens, embedding models are still confined to a narrow context window not exceeding 8k tokens, which excludes application scenarios requiring long inputs such as legal contracts. This paper explores context window extension of existing embedding models…

    Submitted 24 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Fix results for Nomic

  45. arXiv:2404.03622  [pdf, other]

    cs.CL

    Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models

    Authors: Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei

    Abstract: Large language models (LLMs) have exhibited impressive performance in language comprehension and various reasoning tasks. However, their abilities in spatial reasoning, a crucial aspect of human cognition, remain relatively unexplored. Humans possess a remarkable ability to create mental images of unseen objects and actions through a process known as the Mind's Eye, enabling the imagination of the…

    Submitted 24 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  46. arXiv:2404.01625  [pdf]

    cs.CR

    AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation

    Authors: Fei Wei, Ergute Bao, Xiaokui Xiao, Yin Yang, Bolin Ding

    Abstract: Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems. The main idea is that each individual perturbs their own data locally, and only submits the resulting noisy version to a data aggregator. Although much effort has been devoted to computing various types of aggregates and building machine learning applications under LDP, research on fund…

    Submitted 3 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  47. arXiv:2404.01230  [pdf, other]

    cs.CL

    LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models

    Authors: Yadong Zhang, Shaoguang Mao, Tao Ge, Xun Wang, Adrian de Wynter, Yan Xia, Wenshan Wu, Ting Song, Man Lan, Furu Wei

    Abstract: This paper presents a comprehensive survey of the current status and opportunities for Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning that necessitates understanding and predicting adversary actions in multi-agent settings while adjusting strategies accordingly. Strategic reasoning is distinguished by its focus on the dynamic and uncertain nature of interact…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures

  48. arXiv:2404.00656  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    WavLLM: Towards Robust and Adaptive Speech Large Language Model

    Authors: Shujie Hu, Long Zhou, Shujie Liu, Sanyuan Chen, Lingwei Meng, Hongkun Hao, Jing Pan, Xunying Liu, Jinyu Li, Sunit Sivasankaran, Linquan Liu, Furu Wei

    Abstract: The recent advancements in large language models (LLMs) have revolutionized the field of natural language processing, progressively broadening their scope to multimodal perception and generation. However, effectively integrating listening capabilities into LLMs poses significant challenges, particularly with respect to generalizing across varied contexts and executing complex auditory tasks. In th…

    Submitted 14 August, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  49. arXiv:2403.13213  [pdf, other]

    cs.LG cs.CL cs.CY

    From Representational Harms to Quality-of-Service Harms: A Case Study on Llama 2 Safety Safeguards

    Authors: Khaoula Chehbouni, Megha Roshan, Emmanuel Ma, Futian Andrew Wei, Afaf Taik, Jackie CK Cheung, Golnoosh Farnadi

    Abstract: Recent progress in large language models (LLMs) has led to their widespread adoption in various domains. However, these advancements have also introduced additional safety risks and raised concerns regarding their detrimental impact on already marginalized populations. Despite growing mitigation efforts to develop safety safeguards, such as supervised safety-oriented fine-tuning and leveraging saf…

    Submitted 5 July, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures. Accepted to Findings of the Association for Computational Linguistics: ACL 2024

  50. arXiv:2403.07874  [pdf, other]

    cs.CV

    Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

    Authors: Lei Zhu, Fangyun Wei, Yanye Lu

    Abstract: In this work, we investigate the potential of a large language model (LLM) to directly comprehend visual signals without the necessity of fine-tuning on multi-modal datasets. The foundational concept of our method views an image as a linguistic entity, and translates it to a set of discrete words derived from the LLM's vocabulary. To achieve this, we present the Vision-to-Language Tokenizer, abbre…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024