Showing 1–50 of 287 results for author: Geng, X

  1. arXiv:2408.07966  [pdf, other]

    cs.LG cs.DC

    Addressing Skewed Heterogeneity via Federated Prototype Rectification with Personalization

    Authors: Shunxin Guo, Hongsong Wang, Shuxia Lin, Zhiqiang Kou, Xin Geng

    Abstract: Federated learning is an efficient framework designed to facilitate collaborative model training across multiple distributed devices while preserving user data privacy. A significant challenge of federated learning is data-level heterogeneity, i.e., skewed or long-tailed distribution of private data. Although various methods have been proposed to address this challenge, most of them assume that th…

    Submitted 22 August, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

  2. arXiv:2408.07337  [pdf, other]

    cs.CV

    KIND: Knowledge Integration and Diversion in Diffusion Models

    Authors: Yucheng Xie, Fu Feng, Jing Wang, Xin Geng, Yong Rui

    Abstract: Pre-trained models have become the preferred backbone due to the expansion of model parameters, with techniques like Parameter-Efficient Fine-Tuning (PEFTs) typically fixing the parameters of these models. However, pre-trained models may not always be optimal, especially when there are discrepancies between training tasks and target tasks, potentially resulting in negative transfer. To address thi…

    Submitted 14 August, 2024; originally announced August 2024.

  3. arXiv:2408.02599  [pdf, other]

    cs.CL cs.AI

    Progressively Selective Label Enhancement for Language Model Alignment

    Authors: Biao Liu, Ning Xu, Xin Geng

    Abstract: Large Language Models have demonstrated impressive capabilities in various language tasks but may produce content that misaligns with human expectations, raising ethical and legal concerns. Therefore, it is important to explore the limitations and implement restrictions on the models to ensure safety and compliance, with Reinforcement Learning from Human Feedback (RLHF) being the primary method. D…

    Submitted 5 August, 2024; originally announced August 2024.

  4. arXiv:2408.00804  [pdf, other]

    cs.AR cs.AI cs.LG

    ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model

    Authors: Ning Xu, Zhaoyang Zhang, Lei Qi, Wensuo Wang, Chao Zhang, Zihao Ren, Huaiyuan Zhang, Xin Cheng, Yanqi Zhang, Zhichao Liu, Qingwen Wei, Shiyang Wu, Lanlan Yang, Qianfeng Lu, Yiqun Ma, Mengyao Zhao, Junbo Liu, Yufan Song, Xin Geng, Jun Yang

    Abstract: The field of integrated circuit (IC) design is highly specialized, presenting significant barriers to entry and research and development challenges. Although large language models (LLMs) have achieved remarkable success in various domains, existing LLMs often fail to meet the specific needs of students, engineers, and researchers. Consequently, the potential of LLMs in the IC design domain remains…

    Submitted 26 July, 2024; originally announced August 2024.

  5. arXiv:2408.00277  [pdf, ps, other]

    math.PR

    Stochastic Domination of Exit Times for Random Walks and Brownian Motion with Drift

    Authors: Xi Geng, Greg Markowsky

    Abstract: In this note, by an elementary use of Girsanov's transform we show that the exit time for either a biased random walk or a drifted Brownian motion on a symmetric interval is stochastically monotone with respect to the drift parameter. In the random walk case, this gives an alternative proof of a recent result of E. Peköz and R. Righter in 2024. Our arguments in both discrete and continuous cases a…

    Submitted 1 August, 2024; originally announced August 2024.

    MSC Class: 60J65; 60G50

  6. arXiv:2407.20439  [pdf, other]

    cs.RO cs.HC eess.SY

    Haptic feedback of front car motion can improve driving control

    Authors: Xiaoxiao Cheng, Xianzhe Geng, Yanpei Huang, Etienne Burdet

    Abstract: This study investigates the role of haptic feedback in a car-following scenario, where information about the motion of the front vehicle is provided through a virtual elastic connection with it. Using a robotic interface in a simulated driving environment, we examined the impact of varying levels of such haptic feedback on the driver's ability to follow the road while avoiding obstacles. The resul…

    Submitted 29 July, 2024; originally announced July 2024.

  7. arXiv:2407.13086  [pdf, ps, other]

    math.PR

    Expected Signature on a Riemannian Manifold and Its Geometric Implications

    Authors: Xi Geng, Hao Ni, Chaorui Wang

    Abstract: On a compact Riemannian manifold $M,$ we show that the Riemannian distance function $d(x,y)$ can be explicitly reconstructed from suitable asymptotics of the expected signature of Brownian bridge from $x$ to $y$. In addition, by looking into the asymptotic expansion of the fourth level expected signature of the Brownian loop based at $x\in M$, one can explicitly reconstruct both intrinsic (Ricci c…

    Submitted 17 July, 2024; originally announced July 2024.

  8. arXiv:2407.02098  [pdf, other]

    cs.CV

    DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection

    Authors: Kaixin Xu, Qingtian Feng, Hao Chen, Zhe Wang, Xue Geng, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin

    Abstract: Applying deep neural networks to 3D point cloud processing has attracted increasing attention due to its advanced performance in many areas, such as AR/VR, autonomous driving, and robotics. However, as neural network models and 3D point clouds expand in size, it becomes a crucial challenge to reduce the computational and memory overhead to meet latency and energy constraints in real-world applicat…

    Submitted 2 July, 2024; originally announced July 2024.

  9. arXiv:2407.02068  [pdf, other]

    cs.CV

    LPViT: Low-Power Semi-structured Pruning for Vision Transformers

    Authors: Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin

    Abstract: Vision transformers have emerged as a promising alternative to convolutional neural networks for various image analysis tasks, offering comparable or superior performance. However, one significant drawback of ViTs is their resource-intensive nature, leading to increased memory footprint, computation complexity, and power consumption. To democratize this high-performance technology and make it more…

    Submitted 12 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  10. arXiv:2406.17922  [pdf, ps, other]

    math-ph

    On induced L-infinity action of diffeomorphisms on Cochains

    Authors: Andrey Losev, Dmitrii Sheptunov, Xin Geng

    Abstract: One of the approaches to quantum gravity is to formulate it in terms of De Rham algebra, choose a triangulation of space-time, and replace differential forms by cochains (that form a finite dimensional vector space). The key issue of general relativity is the action of diffeomorphisms of space-time on fields. In this paper, we induce the action of diffeomorphisms on cochains by homotopy transfer…

    Submitted 25 June, 2024; originally announced June 2024.

  11. arXiv:2406.17503  [pdf, other]

    cs.LG

    WAVE: Weight Template for Adaptive Initialization of Variable-sized Models

    Authors: Fu Feng, Yucheng Xie, Jing Wang, Xin Geng

    Abstract: The expansion of model parameters underscores the significance of pre-trained models; however, the constraints encountered during model deployment necessitate models of variable sizes. Consequently, the traditional pre-training and fine-tuning paradigm fails to address the initialization problem when target models are incompatible with pre-trained models. We tackle this issue from a multitasking p…

    Submitted 15 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  12. arXiv:2406.14532  [pdf, other]

    cs.LG cs.CL

    RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

    Authors: Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

    Abstract: Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem…

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.13185  [pdf, other]

    cs.CL

    Learnable In-Context Vector for Visual Question Answering

    Authors: Yingzhe Peng, Chenduo Hao, Xu Yang, Jiawei Peng, Xinting Hu, Xin Geng

    Abstract: As language models continue to scale, Large Language Models (LLMs) have exhibited emerging capabilities in In-Context Learning (ICL), enabling them to solve language tasks by prefixing a few in-context demonstrations (ICDs) as context. Inspired by these advancements, researchers have extended these techniques to develop Large Multimodal Models (LMMs) with ICL capabilities. However, applying ICL us…

    Submitted 18 June, 2024; originally announced June 2024.

  14. arXiv:2406.12199  [pdf, other]

    cs.LG cs.AI

    Time Series Modeling for Heart Rate Prediction: From ARIMA to Transformers

    Authors: Haowei Ni, Shuchen Meng, Xieming Geng, Panfeng Li, Zhuoying Li, Xupeng Chen, Xiaotong Wang, Shiyao Zhang

    Abstract: Cardiovascular disease (CVD) is a leading cause of death globally, necessitating precise forecasting models for monitoring vital signs like heart rate, blood pressure, and ECG. Traditional models, such as ARIMA and Prophet, are limited by their need for manual parameter tuning and challenges in handling noisy, sparse, and highly variable medical data. This study investigates advanced deep learning…

    Submitted 27 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by 2024 6th International Conference on Electronic Engineering and Informatics

  15. arXiv:2406.09397  [pdf, other]

    cs.CV cs.AI

    Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms

    Authors: Miaosen Zhang, Yixuan Wei, Zhen Xing, Yifei Ma, Zuxuan Wu, Ji Li, Zheng Zhang, Qi Dai, Chong Luo, Xin Geng, Baining Guo

    Abstract: Modern vision models are trained on very large noisy datasets. While these models acquire strong capabilities, they may not follow the user's intent to output the desired results in certain aspects, e.g., visual aesthetic, preferred style, and responsibility. In this paper, we target the realm of visual aesthetics and aim to align vision models with human aesthetic standards in a retrieval system.…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 28 pages, 26 figures, under review

  16. arXiv:2406.07871  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Flexible Music-Conditioned Dance Generation with Style Description Prompts

    Authors: Hongsong Wang, Yin Zhu, Xin Geng

    Abstract: Dance plays an important role as an artistic form and expression in human culture, yet the creation of dance remains a challenging task. Most dance generation methods primarily rely solely on music, seldom taking into consideration intrinsic attributes such as music style or genre. In this work, we introduce Flexible Dance Generation with Style Description Prompts (DGSDP), a diffusion-based framew…

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2405.16474  [pdf, other]

    cs.LG

    Inaccurate Label Distribution Learning with Dependency Noise

    Authors: Zhiqiang Kou, Jing Wang, Yuheng Jia, Xin Geng

    Abstract: In this paper, we introduce the Dependent Noise-based Inaccurate Label Distribution Learning (DN-ILDL) framework to tackle the challenges posed by noise in label distribution learning, which arise from dependencies on instances and labels. We start by modeling the inaccurate label distribution matrix as a combination of the true label distribution and a noise matrix influenced by specific instance…

    Submitted 26 May, 2024; originally announced May 2024.

  18. arXiv:2405.13923  [pdf, other]

    cs.CL

    Why Not Transform Chat Large Language Models to Non-English?

    Authors: Xiang Geng, Ming Zhu, Jiahuan Li, Zhejian Lai, Wei Zou, Shuaijie She, Jiaxin Guo, Xiaofeng Zhao, Yinglu Li, Yuang Li, Chang Su, Yanqing Zhao, Xinglin Lyu, Min Zhang, Jiajun Chen, Hao Yang, Shujian Huang

    Abstract: The scarcity of non-English data limits the development of non-English large language models (LLMs). Transforming English-centric LLMs to non-English has been identified as an effective and resource-efficient method. Previous works start from base LLMs and perform knowledge distillation (KD) with data generated by stronger LLMs, e.g. GPT-4. Compared to base LLMs, chat LLMs are further optimized fo…

    Submitted 31 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  19. MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

    Authors: Qi Chen, Xiubo Geng, Corby Rosset, Carolyn Buractaon, Jingwen Lu, Tao Shen, Kun Zhou, Chenyan Xiong, Yeyun Gong, Paul Bennett, Nick Craswell, Xing Xie, Fan Yang, Bryan Tower, Nikhil Rao, Anlei Dong, Wenqi Jiang, Zheng Liu, Mingqin Li, Chuanjie Liu, Zengzhong Li, Rangan Majumder, Jennifer Neville, Andy Oakley, Knut Magne Risvik, et al. (6 additional authors not shown)

    Abstract: Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked query-document labels. This dataset closely mimics real-world web document and query distribution, provides rich information for various kinds of down…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures, for associated dataset, see http://github.com/microsoft/MS-MARCO-Web-Search

  20. arXiv:2405.07303  [pdf, other]

    hep-ex hep-ph physics.ins-det

    Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

    Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (61 additional authors not shown)

    Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China Jinping Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axio…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  21. arXiv:2405.06038  [pdf, other]

    cs.LG cs.AI

    From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

    Authors: Xue Geng, Zhe Wang, Chunyun Chen, Qing Xu, Kaixin Xu, Chao Jin, Manas Gupta, Xulei Yang, Zhenghua Chen, Mohamed M. Sabry Aly, Jie Lin, Min Wu, Xiaoli Li

    Abstract: Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI) tasks. However, deploying them brings significant challenges due to the huge cost of memory, energy, and computation. To address these challenges, researchers have developed various model compression techniques such as model quantization and model pruning. Recently, there has been a surge in research of compress…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: This manuscript is the accepted version for TNNLS(IEEE Transactions on Neural Networks and Learning Systems)

  22. arXiv:2405.02132  [pdf, other]

    cs.SD cs.CL eess.AS

    Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets

    Authors: Xuelong Geng, Tianyi Xu, Kun Wei, Bingshen Mu, Hongfei Xue, He Wang, Yangze Li, Pengcheng Guo, Yuhang Dai, Longhao Li, Mingchen Shao, Lei Xie

    Abstract: Large Language Models (LLMs) have demonstrated unparalleled effectiveness in various NLP tasks, and integrating LLMs with automatic speech recognition (ASR) is becoming a mainstream paradigm. Building upon this momentum, our research delves into an in-depth examination of this paradigm on a large open-source Chinese dataset. Specifically, our research aims to evaluate the impact of various configu…

    Submitted 6 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  23. arXiv:2404.16897  [pdf, other]

    cs.LG cs.AI cs.CV

    Exploring Learngene via Stage-wise Weight Sharing for Initializing Variable-sized Models

    Authors: Shi-Yu Xia, Wenxuan Zhu, Xu Yang, Xin Geng

    Abstract: In practice, we usually need to build variable-sized models adapting for diverse resource constraints in different application scenarios, where weight initialization is an important step prior to training. The Learngene framework, introduced recently, firstly learns one compact part termed as learngene from a large well-trained model, after which learngene is expanded to initialize variable-sized…

    Submitted 25 April, 2024; originally announced April 2024.

  24. arXiv:2404.13565  [pdf, other]

    cs.CV cs.AI cs.CL

    Exploring Diverse Methods in Visual Question Answering

    Authors: Panfeng Li, Qikai Yang, Xieming Geng, Wenjing Zhou, Zhicheng Ding, Yi Nian

    Abstract: This study explores innovative methods for improving Visual Question Answering (VQA) using Generative Adversarial Networks (GANs), autoencoders, and attention mechanisms. Leveraging a balanced VQA dataset, we investigate three distinct strategies. Firstly, GAN-based approaches aim to generate answer embeddings conditioned on image and question inputs, showing potential but struggling with more com…

    Submitted 20 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by 2024 5th International Conference on Electronic Communication and Artificial Intelligence

  25. arXiv:2404.09793  [pdf, other]

    hep-ex hep-ph physics.ins-det

    First Search for Light Fermionic Dark Matter Absorption on Electrons Using Germanium Detector in CDEX-10 Experiment

    Authors: J. X. Liu, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (61 additional authors not shown)

    Abstract: We present the first results of the search for sub-MeV fermionic dark matter absorbed by electron targets of Germanium using the 205.4~kg$\cdot$day data collected by the CDEX-10 experiment, with the analysis threshold of 160~eVee. No significant dark matter (DM) signals over the background are observed. Results are presented as limits on the cross section of DM--electron interaction. We present ne…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 6 pages, 4 figures

  26. arXiv:2404.03915  [pdf, other]

    eess.SY

    Nonlinear Kalman Filtering based on Self-Attention Mechanism and Lattice Trajectory Piecewise Linear Approximation

    Authors: Jiaming Wang, Xinyu Geng, Jun Xu

    Abstract: The traditional Kalman filter (KF) is widely applied in control systems, but it relies heavily on the accuracy of the system model and noise parameters, leading to potential performance degradation when facing inaccuracies. To address this issue, introducing neural networks into the KF framework offers a data-driven solution to compensate for these inaccuracies, improving the filter's performance…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 7 pages, 4 figures

  27. arXiv:2403.20276  [pdf, other]

    hep-ex hep-ph physics.ins-det

    Constraints on the Blazar-Boosted Dark Matter from the CDEX-10 Experiment

    Authors: R. Xu, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, S. M. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (59 additional authors not shown)

    Abstract: We report new constraints on light dark matter (DM) boosted by blazars using the 205.4 kg day data from the CDEX-10 experiment located at the China Jinping Underground Laboratory. Two representative blazars, TXS 0506+56 and BL Lacertae are studied. The results derived from TXS 0506+56 exclude DM-nucleon elastic scattering cross sections from $4.6\times 10^{-33}\ \rm cm^2$ to…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 7 pages, 4 figures

  28. arXiv:2403.20263  [pdf, other]

    hep-ex hep-ph physics.ins-det

    Probing Dark Matter Particles from Evaporating Primordial Black Holes via Electron Scattering in the CDEX-10 Experiment

    Authors: Z. H. Zhang, L. T. Yang, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, S. M. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar, et al. (59 additional authors not shown)

    Abstract: Dark matter (DM) is a major constituent of the Universe. However, no definite evidence of DM particles (denoted as ``$χ$") has been found in DM direct detection (DD) experiments to date. There is a novel concept that detecting $χ$ from evaporating primordial black holes (PBHs). We search for $χ$ emitted from PBHs by investigating their interaction with target electrons. The examined PBH masses ran…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 8 pages, 6 figures

  29. arXiv:2403.16697  [pdf, other]

    cs.CV

    DPStyler: Dynamic PromptStyler for Source-Free Domain Generalization

    Authors: Yunlong Tang, Yuxuan Wan, Lei Qi, Xin Geng

    Abstract: Source-Free Domain Generalization (SFDG) aims to develop a model that works for unseen target domains without relying on any source domain. Research in SFDG primarily builds upon the existing knowledge of large-scale vision-language models and utilizes the pre-trained model's joint vision-language space to simulate style transfer across domains, thus eliminating the dependency on source domain ima…

    Submitted 14 July, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE TMM

  30. arXiv:2403.14118  [pdf, other]

    cs.CL

    From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation

    Authors: Haofei Zhao, Yilun Liu, Shimin Tao, Weibin Meng, Yimeng Chen, Xiang Geng, Chang Su, Min Zhang, Hao Yang

    Abstract: Machine Translation Quality Estimation (MTQE) is the task of estimating the quality of machine-translated text in real time without the need for reference translations, which is of great importance for the development of MT. After two decades of evolution, QE has yielded a wealth of results. This article provides a comprehensive overview of QE datasets, annotation methods, shared tasks, methodolog…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted by IJCNN 2024

  31. arXiv:2403.13351  [pdf, other]

    cs.CV

    OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

    Authors: Xinyu Geng, Jiaming Wang, Jiawei Gong, Yuerong Xue, Jun Xu, Fanglin Chen, Xiaolin Huang

    Abstract: Redundancy is a persistent challenge in Capsule Networks (CapsNet), leading to high computational costs and parameter counts. Although previous works have introduced pruning after the initial capsule layer, dynamic routing's fully connected nature and non-orthogonal weight matrices reintroduce redundancy in deeper layers. Besides, dynamic routing requires iterating to converge, further increasing c…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 8 pages

  32. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  33. arXiv:2402.19145  [pdf, other]

    cs.CV

    A SAM-guided Two-stream Lightweight Model for Anomaly Detection

    Authors: Chenghao Li, Lei Qi, Xin Geng

    Abstract: In industrial anomaly detection, model efficiency and mobile-friendliness become the primary concerns in real-world applications. Simultaneously, the impressive generalization capabilities of Segment Anything (SAM) have garnered broad academic attention, making it an ideal choice for localizing unseen anomalies and diverse real-world patterns. In this paper, considering these two critical factors,…

    Submitted 29 February, 2024; originally announced February 2024.

  34. arXiv:2402.13483  [pdf, other]

    hep-ex hep-ph physics.app-ph physics.ins-det

    A proposed PKU-Muon experiment for muon tomography and dark matter search

    Authors: Xudong Yu, Zijian Wang, Cheng-en Liu, Yiqing Feng, Jinning Li, Xinyue Geng, Yimeng Zhang, Leyun Gao, Ruobing Jiang, Youpeng Wu, Chen Zhou, Qite Li, Siguang Wang, Yong Ban, Yajun Mao, Qiang Li

    Abstract: We propose here a set of new methods to directly detect light mass dark matter through its scattering with abundant atmospheric muons or accelerator beams. Firstly, we plan to use the free cosmic-ray muons interacting with dark matter in a volume surrounded by tracking detectors, to trace possible interaction between dark matter and muons. Secondly, we will interface our device with domestic or in…

    Submitted 23 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Added a few sentences to highlight that our methods can have advantages over exotic dark matters which are either muon-philic or slowed down due to some mechanism

  35. arXiv:2402.08362  [pdf, other]

    nlin.SI nlin.PS

    Multiple higher-order poles solutions in spinor Bose-Einstein condensates

    Authors: Huan Liu, Jing Shen, Xianguo Geng

    Abstract: In this study, we explore multiple higher-order pole solutions in spinor Bose--Einstein condensates. These solutions are associated with different pairs of higher-order poles of the transmission coefficient in the inverse scattering transform, and they represent solutions of the spin-1 Gross--Pitaevskii equation. We introduce a direct scattering map that maps initial data to scattering data, which…

    Submitted 13 February, 2024; originally announced February 2024.

  36. arXiv:2402.08352  [pdf, ps, other]

    nlin.SI nlin.PS

    Riemann--Hilbert method to the Ablowitz--Ladik equation: higher-order case

    Authors: Huan Liu, Jing Shen, Xianguo Geng

    Abstract: We focused on the Ablowitz--Ladik equation on a zero background, specifically considering the scenario of $N$ pairs of multiple poles. Our first goal was to establish a mapping between the initial data and the scattering data. This allowed us to introduce a direct problem by analyzing the discrete spectrum associated with $N$ pairs of higher-order zeros. Next, we constructed another mapping from t…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 26 pages, 2 figures

  37. arXiv:2401.13011  [pdf, other]

    cs.CV

    CCA: Collaborative Competitive Agents for Image Editing

    Authors: Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

    Abstract: This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks. Drawing inspiration from Generative Adversarial Networks (GANs), the CCA system employs two equal-status generator agents and a discriminator agent. The generators independently process user instructio…

    Submitted 23 January, 2024; originally announced January 2024.

  38. arXiv:2401.08139  [pdf, other]

    cs.LG cs.NE

    Transferring Core Knowledge via Learngenes

    Authors: Fu Feng, Jing Wang, Xin Geng

    Abstract: The pre-training paradigm fine-tunes the models trained on large-scale datasets to downstream tasks with enhanced performance. It transfers all knowledge to downstream tasks without discriminating which part is necessary or unnecessary, which may lead to negative transfer. In comparison, knowledge transfer in nature is much more efficient. When passing genetic information to descendants, ancestors…

    Submitted 16 January, 2024; originally announced January 2024.

  39. arXiv:2401.06838  [pdf, other]

    cs.CL

    MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization

    Authors: Shuaijie She, Wei Zou, Shujian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, Jiajun Chen

    Abstract: Though reasoning abilities are considered language-agnostic, existing LLMs exhibit inconsistent reasoning abilities across different languages, e.g., reasoning in the dominant language like English is superior to other languages due to the imbalance of multilingual training data. To enhance reasoning abilities in non-dominant languages, we propose a Multilingual-Alignment-as-Preference Optimizatio…

    Submitted 13 April, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: The project is available at https://github.com/NJUNLP/MAPO

  40. arXiv:2401.06568  [pdf, other]

    cs.CL cs.AI

    Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation

    Authors: Xu Huang, Zhirui Zhang, Xiang Geng, Yichao Du, Jiajun Chen, Shujian Huang

    Abstract: This study investigates how Large Language Models (LLMs) leverage source and reference data in machine translation evaluation task, aiming to better understand the mechanisms behind their remarkable performance in this task. We design the controlled experiments across various input modes and model types, and employ both coarse-grained and fine-grained prompts to discern the utility of source versu…

    Submitted 6 June, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by ACL2024 Findings

  41. arXiv:2312.15156  [pdf, other]

    cs.CL

    Large Language Models as Zero-Shot Keyphrase Extractors: A Preliminary Empirical Study

    Authors: Mingyang Song, Xuelian Geng, Songfang Yao, Shilong Lu, Yi Feng, Liping Jing

    Abstract: Zero-shot keyphrase extraction aims to build a keyphrase extractor without training by human-annotated data, which is challenging due to the limited human intervention involved. Challenging but worthwhile, zero-shot setting efficiently reduces the time and effort that data labeling takes. Recent efforts on pre-trained large language models (e.g., ChatGPT and ChatGLM) show promising performance on…

    Submitted 10 January, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Technical Report, 6 pages

  42. arXiv:2312.09881  [pdf, other

    cs.LG cs.AI

    Dynamic Heterogeneous Federated Learning with Multi-Level Prototypes

    Authors: Shunxin Guo, Hongsong Wang, Xin Geng

    Abstract: Federated learning shows promise as a privacy-preserving collaborative learning technique. Existing heterogeneous federated learning mainly focuses on skewing the label distribution across clients. However, most approaches suffer from catastrophic forgetting and concept drift, mainly when the global distribution of all classes is extremely unbalanced and the data distribution of the client dynamic…

    Submitted 15 December, 2023; originally announced December 2023.

  43. arXiv:2312.06343  [pdf, other

    cs.LG

    RankMatch: A Novel Approach to Semi-Supervised Label Distribution Learning Leveraging Inter-label Correlations

    Authors: Zhiqiang Kou, Yucheng Xie, Jing Wang, Yuheng Jia, Boyu Shi, Xin Geng

    Abstract: This paper introduces RankMatch, an innovative approach for Semi-Supervised Label Distribution Learning (SSLDL). Addressing the challenge of limited labeled data, RankMatch effectively utilizes a small number of labeled examples in conjunction with a larger quantity of unlabeled data, reducing the need for extensive manual labeling in Deep Neural Network (DNN) applications. Specifically, RankMatch…

    Submitted 11 December, 2023; originally announced December 2023.

  44. arXiv:2312.05743  [pdf, other

    cs.LG cs.CV

    Building Variable-sized Models via Learngene Pool

    Authors: Boyu Shi, Shiyu Xia, Xu Yang, Haokun Chen, Zhiqiang Kou, Xin Geng

    Abstract: Recently, Stitchable Neural Networks (SN-Net) was proposed to stitch some pre-trained networks for quickly building numerous networks with different complexity and performance trade-offs. In this way, the burdens of designing or training the variable-sized networks, which can be used in application scenarios with diverse resource constraints, are alleviated. However, SN-Net still faces a few challe…

    Submitted 11 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

  45. arXiv:2312.05614  [pdf, other

    cs.AI cs.LG

    Transformer as Linear Expansion of Learngene

    Authors: Shiyu Xia, Miaosen Zhang, Xu Yang, Ruiming Chen, Haokun Chen, Xin Geng

    Abstract: We propose expanding the shared Transformer module to produce and initialize Transformers of varying depths, enabling adaptation to diverse resource constraints. Drawing an analogy to genetic expansibility, we term such a module a learngene. To identify the expansion mechanism, we delve into the relationship between the layer's position and its corresponding weight value, and find that linear funct…

    Submitted 20 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

  46. arXiv:2312.01598  [pdf, other

    cs.CV

    Good Questions Help Zero-Shot Image Reasoning

    Authors: Kaiwen Yang, Tao Shen, Xinmei Tian, Xiubo Geng, Chongyang Tao, Dacheng Tao, Tianyi Zhou

    Abstract: Aligning the recent large language models (LLMs) with computer vision models leads to large vision-language models (LVLMs), which have paved the way for zero-shot image reasoning tasks. However, LVLMs are usually trained on short high-level captions only referring to sparse focus regions in images. Such a "tunnel vision" limits LVLMs to exploring other relevant contexts in complex scenes. To add…

    Submitted 8 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

  47. arXiv:2312.00785  [pdf, other

    cs.CV

    Sequential Modeling Enables Scalable Learning for Large Vision Models

    Authors: Yutong Bai, Xinyang Geng, Karttikeya Mangalam, Amir Bar, Alan Yuille, Trevor Darrell, Jitendra Malik, Alexei A Efros

    Abstract: We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data. To do this, we define a common format, "visual sentences", in which we can represent raw images and videos as well as annotated data sources such as semantic segmentations and depth reconstructions without needing any meta-knowledge beyond the pixels. Once…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Website: https://yutongbai.com/lvm.html

  48. arXiv:2312.00351  [pdf, other

    cs.CV

    Manipulating the Label Space for In-Context Classification

    Authors: Haokun Chen, Xu Yang, Yuhang Huang, Zihan Wu, Jing Wang, Xin Geng

    Abstract: After pre-training by generating the next word conditional on previous words, the Language Model (LM) acquires the ability of In-Context Learning (ICL) that can learn a new task conditional on the context of the given in-context examples (ICEs). Similarly, visually-conditioned Language Modelling is also used to train Vision-Language Models (VLMs) with ICL ability. However, such VLMs typically exhi…

    Submitted 5 December, 2023; v1 submitted 30 November, 2023; originally announced December 2023.

  49. arXiv:2311.16695  [pdf

    cond-mat.mtrl-sci physics.app-ph

    Spin-Orbital Coupling in All-Inorganic Metal-Halide Perovskites: the Hidden Force that Matters

    Authors: Pradeep Raja Anandan, Muhammad Nadeem, Chun-Ho Lin, Simrjit Singh, Xinwei Guan, Jiyun Kim, Shamim Shahroki, Md Zahidur Rahaman, Xun Geng, Jing-Kai Huang, Hien Nguyen, Hanlin Hu, Pankaj Sharma, Jan Seidel, Xiaolin Wang, Tom Wu

    Abstract: Highlighted with improved long-term thermal and environmental stability, all-inorganic metal halide perovskites exhibit tunable physical properties, cost-effective synthesis, and satisfactory optoelectronic performance, attracting increasing research interest worldwide. However, a less explored feature of these materials is their strong spin-orbit coupling (SOC), which is the hidden force influen…

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 44 pages, 5 figures

  50. arXiv:2311.16556  [pdf, other

    cs.LG

    Scalable Label Distribution Learning for Multi-Label Classification

    Authors: Xingyu Zhao, Yuexuan An, Lei Qi, Xin Geng

    Abstract: Multi-label classification (MLC) refers to the problem of tagging a given instance with a set of relevant labels. Most existing MLC methods are based on the assumption that the correlation of two labels in each label pair is symmetric, which is violated in many real-world scenarios. Moreover, most existing methods design learning processes associated with the number of labels, which makes their co…

    Submitted 28 November, 2023; originally announced November 2023.