Showing 1–50 of 1,468 results for author: Yu, W

  1. arXiv:2408.16500  [pdf, other]

    cs.CV

    CogVLM2: Visual Language Models for Image and Video Understanding

    Authors: Wenyi Hong, Weihan Wang, Ming Ding, Wenmeng Yu, Qingsong Lv, Yan Wang, Yean Cheng, Shiyu Huang, Junhui Ji, Zhao Xue, Lei Zhao, Zhuoyi Yang, Xiaotao Gu, Xiaohan Zhang, Guanyu Feng, Da Yin, Zihan Wang, Ji Qi, Xixuan Song, Peng Zhang, Debing Liu, Bin Xu, Juanzi Li, Yuxiao Dong, Jie Tang

    Abstract: Beginning with VisualGLM and CogVLM, we are continuously exploring VLMs in pursuit of enhanced vision-language fusion, efficient higher-resolution architecture, and broader modalities and applications. Here we propose the CogVLM2 family, a new generation of visual language models for image and video understanding including CogVLM2, CogVLM2-Video and GLM-4V. As an image understanding model, CogVLM2…

    Submitted 29 August, 2024; originally announced August 2024.

  2. arXiv:2408.15583  [pdf, other]

    cs.CE

    PointEMRay: A Novel Efficient SBR Framework on Point Based Geometry

    Authors: Kaiqiao Yang, Che Liu, Wenming Yu, Tie Jun Cui

    Abstract: The rapid computation of electromagnetic (EM) fields across various scenarios has long been a challenge, primarily due to the need for precise geometric models. The emergence of point cloud data offers a potential solution to this issue. However, the lack of electromagnetic simulation algorithms optimized for point-based models remains a significant limitation. In this study, we propose PointEMRay…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 14 pages, 13 figures, and 2 tables

  3. arXiv:2408.13480  [pdf, other]

    cs.DB

    Towards a Converged Relational-Graph Optimization Framework

    Authors: Yunkai Lou, Longbin Lai, Bingqing Lyu, Yufan Yang, Xiaoli Zhou, Wenyuan Yu, Ying Zhang, Jingren Zhou

    Abstract: The recent ISO SQL:2023 standard adopts SQL/PGQ (Property Graph Queries), facilitating graph-like querying within relational databases. This advancement, however, underscores a significant gap in how to effectively optimize SQL/PGQ queries within relational database systems. To address this gap, we extend the foundational SPJ (Select-Project-Join) queries to SPJM queries, which include an additiona…

    Submitted 24 August, 2024; originally announced August 2024.

  4. arXiv:2408.13195  [pdf, other]

    cs.AR cs.LG

    NAS-Cap: Deep-Learning Driven 3-D Capacitance Extraction with Neural Architecture Search and Data Augmentation

    Authors: Haoyuan Li, Dingcheng Yang, Chunyan Pei, Wenjian Yu

    Abstract: More accurate capacitance extraction is demanded for designing integrated circuits under advanced process technology. The pattern matching approach and the field solver for capacitance extraction have the drawbacks of inaccuracy and large computational cost, respectively. Recent work \cite{yang2023cnn} proposes a grid-based data representation and a convolutional neural network (CNN) based capacit…

    Submitted 23 August, 2024; originally announced August 2024.

  5. arXiv:2408.11874  [pdf, other]

    stat.AP

    Investigating Mode Effects in Interviewer Variances Using Two Representative Multi-mode Surveys

    Authors: Wenshan Yu, Michael R. Elliott, Trivellore E. Raghunathan

    Abstract: This study examines whether interviewer variances remain consistent across different modes in mixed-mode studies, using data from two distinct designs. In the first design, when interviewers are responsible for either face-to-face or telephone mode, we examine whether there are mode differences in interviewer variances for 1) sensitive political questions, 2) international items, and 3) item missi…

    Submitted 20 August, 2024; originally announced August 2024.

  6. arXiv:2408.10483  [pdf, other]

    cs.LG

    PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting

    Authors: Yongbo Yu, Weizhong Yu, Feiping Nie, Xuelong Li

    Abstract: The self-attention mechanism in Transformer architecture, invariant to sequence order, necessitates positional embeddings to encode temporal order in time series prediction. We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences, particularly when employing longer lookback windows. To address this, we introduce an innova…

    Submitted 19 August, 2024; originally announced August 2024.

  7. arXiv:2408.08223  [pdf, ps, other]

    cs.IT

    On the Asymptotic Rate of Optimal Codes that Correct Tandem Duplications for Nanopore Sequencing

    Authors: Wenjun Yu, Zuo Ye, Moshe Schwartz

    Abstract: We study codes that can correct backtracking errors during nanopore sequencing. In this channel, a sequence of length $n$ over an alphabet of size $q$ is being read by a sliding window of length $\ell$, where from each window we obtain only its composition. Backtracking errors cause some windows to repeat, hence manifesting as tandem-duplication errors of length $k$ in the $\ell$-read vector of wi…

    Submitted 15 August, 2024; originally announced August 2024.

  8. arXiv:2408.05728  [pdf, other]

    cond-mat.mes-hall

    Frequency modulation on magnons in synthetic dimensions

    Authors: Meng Xu, Yan Chen, Weichao Yu

    Abstract: Magnons are promising candidates for next-generation computing architectures, offering the ability to manipulate their amplitude and phase for information encoding. However, the frequency degree of freedom remains largely unexploited due to the complexity of nonlinear processes. In this work, we introduce the concept of synthetic frequency dimension into magnonics, treating the eigenfrequency of inh…

    Submitted 11 August, 2024; originally announced August 2024.

    Comments: 13 pages, 7 figures, supplemental materials included

  9. arXiv:2408.05464  [pdf, other]

    physics.app-ph cond-mat.dis-nn

    Physical Neural Networks with Self-Learning Capabilities

    Authors: Weichao Yu, Hangwen Guo, Jiang Xiao, Jian Shen

    Abstract: Physical neural networks are artificial neural networks that mimic synapses and neurons using physical systems or materials. These networks harness the distinctive characteristics of physical systems to carry out computations effectively, potentially surpassing the constraints of conventional digital neural networks. A recent advancement known as "physical self-learning" aims to achieve learning…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: 22 pages, 8 figures

    Journal ref: Sci. China Phys. Mech. Astron. 67, 287501 (2024)

  10. arXiv:2408.04568  [pdf, other]

    cs.CL cs.AI

    Learning Fine-Grained Grounded Citations for Attributed Large Language Models

    Authors: Lei Huang, Xiaocheng Feng, Weitao Ma, Yuxuan Gu, Weihong Zhong, Xiachong Feng, Weijiang Yu, Weihua Peng, Duyu Tang, Dandan Tu, Bing Qin

    Abstract: Despite the impressive performance on information-seeking tasks, large language models (LLMs) still struggle with hallucinations. Attributed LLMs, which augment generated text with in-line citations, have shown potential in mitigating hallucinations and improving verifiability. However, current approaches suffer from suboptimal citation quality due to their reliance on in-context learning. Further…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted by ACL 2024 Findings

  11. arXiv:2408.03113  [pdf, ps, other]

    cs.IT

    Codes Correcting Two Bursts of Exactly $b$ Deletions

    Authors: Zuo Ye, Wenjun Yu, Ohad Elishco

    Abstract: In this paper, we explore constructions for codes that correct two bursts of deletions, with each burst having length exactly $b$. Previously, the best known construction, derived using the syndrome compression technique, achieved a redundancy of at most $7\log n+O\left(\log n/\log\log n\right)$ bits. In this work, we develop new ideas to construct $q$-ary codes that achieve redundancy at most…

    Submitted 28 August, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

    Comments: Redundancy is improved to $6\log n+O(\log\log n)$

  12. arXiv:2408.01332  [pdf, other]

    cs.LG

    HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction

    Authors: Xingyu Lou, Yu Yang, Kuiyao Dong, Heyuan Huang, Wenyi Yu, Ping Wang, Xiu Li, Jun Wang

    Abstract: As the recommendation service needs to address increasingly diverse distributions, such as multi-population, multi-scenario, multi-target, and multi-interest, more and more recent works have focused on multi-distribution modeling and achieved great progress. However, most of them only consider modeling in a single multi-distribution manner, ignoring that mixed multi-distributions often coexist and…

    Submitted 2 August, 2024; originally announced August 2024.

  13. arXiv:2408.00765  [pdf, other]

    cs.CV cs.AI cs.CL

    MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

    Authors: Weihao Yu, Zhengyuan Yang, Linfeng Ren, Linjie Li, Jianfeng Wang, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Lijuan Wang, Xinchao Wang

    Abstract: MM-Vet, with open-ended vision-language questions aimed at evaluating integrated capabilities, has become one of the most popular benchmarks for large multimodal model evaluation. MM-Vet assesses six core vision-language (VL) capabilities: recognition, knowledge, spatial awareness, language generation, OCR, and math. However, its question format is restricted to single image-text pairs, lackin…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: Extension of MM-Vet: arXiv:2308.02490

  14. arXiv:2407.19548  [pdf, other]

    cs.CV

    Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle

    Authors: Zhenyu Tang, Junwu Zhang, Xinhua Cheng, Wangbo Yu, Chaoran Feng, Yatian Pang, Bin Lin, Li Yuan

    Abstract: Recent 3D large reconstruction models typically employ a two-stage process: first generating multi-view images with a multi-view diffusion model, and then using a feed-forward model to reconstruct the images into 3D content. However, multi-view diffusion models often produce low-quality and inconsistent images, adversely affecting the quality of the final 3D reconstruction. To address this issue,…

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: Project page: https://pku-yuangroup.github.io/Cycle3D/

  15. arXiv:2407.16924  [pdf]

    cond-mat.mtrl-sci

    Real-space topology-engineering of skyrmionic spin textures in a van der Waals ferromagnet Fe3GaTe2

    Authors: Shuo Mi, Jianfeng Guo, Guojing Hu, Guangcheng Wang, Songyang Li, Zizhao Gong, Shuaizhao Jin, Rui Xu, Fei Pang, Wei Ji, Weiqiang Yu, Xiaolei Wang, Xueyun Wang, Haitao Yang, Zhihai Cheng

    Abstract: Realizing magnetic skyrmions in two-dimensional (2D) van der Waals (vdW) ferromagnets offers unparalleled prospects for future spintronic applications. The room-temperature ferromagnet Fe3GaTe2 provides an ideal platform for tailoring these magnetic solitons. Here, skyrmions of distinct topological charges are artificially introduced and spatially engineered using magnetic force microscopy (MFM).…

    Submitted 23 July, 2024; originally announced July 2024.

  16. arXiv:2407.16674  [pdf, other]

    cs.LG cs.AI

    KAN or MLP: A Fairer Comparison

    Authors: Runpeng Yu, Weihao Yu, Xinchao Wang

    Abstract: This paper does not introduce a novel method. Instead, it offers a fairer and more comprehensive comparison of KAN and MLP models across various tasks, including machine learning, computer vision, audio processing, natural language processing, and symbolic formula representation. Specifically, we control the number of parameters and FLOPs to compare the performance of KAN and MLP. Our main observa…

    Submitted 17 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    Comments: Technical Report

  17. arXiv:2407.16178  [pdf, other]

    physics.chem-ph

    Anomalous Water Penetration in $\text{Al}^{3+}$ Dissolution

    Authors: Minwoo Kim, Seungtae Kim, Changbong Hyeon, Ji Woon Yu, Siyoung Q. Choi, Won Bo Lee

    Abstract: The physicochemical characterization of trivalent ions is limited due to a lack of accurate force fields. By leveraging the latest machine learning force field to model aqueous $\text{AlCl}_{3}$, we discover that upon dissolution of $\text{Al}^{3+}$, water molecules beyond the second hydration shell are involved in the hydration process. A combination of scissoring of coordinating water is followed by…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: 15 pages, 4 figures

  18. arXiv:2407.15331  [pdf, other]

    cond-mat.str-el cond-mat.stat-mech quant-ph

    Entanglement in quenched extended Su-Schrieffer-Heeger model with anomalous dynamical quantum phase transitions

    Authors: Cheuk Yiu Wong, Tsz Hin Hui, P. D. Sacramento, Wing Chi Yu

    Abstract: Research on topological models unveils fascinating physics, especially in the realm of dynamical quantum phase transitions (DQPTs). However, the understanding of entanglement structures and properties near DQPT in models with longer-range hoppings is far from complete. In this work, we study DQPTs in the quenched extended Su-Schrieffer-Heeger (SSH) model. Anomalous DQPTs, where the number of criti…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 18 pages, 19 figures

  19. arXiv:2407.15187  [pdf, other]

    cs.CV cs.GR

    HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions

    Authors: Haiyang Zhou, Xinhua Cheng, Wangbo Yu, Yonghong Tian, Li Yuan

    Abstract: 3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry. Owing to the powerful generative capabilities of text-to-image diffusion models that provide reliable priors, the creation of 3D scenes using only text prompts has become viable, thereby significantly advancing research in text-driven 3D scene generation. In order to obtain mul…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: Homepage: https://zhouhyocean.github.io/holodreamer

  20. arXiv:2407.14653  [pdf, other]

    cs.LG

    OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning

    Authors: Yihang Yao, Zhepeng Cen, Wenhao Ding, Haohong Lin, Shiqi Liu, Tingnan Zhang, Wenhao Yu, Ding Zhao

    Abstract: Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using a pre-collected dataset. Most current methods struggle with the mismatch between imperfect demonstrations and the desired safe and rewarding performance. In this paper, we introduce OASIS (cOnditionAl diStributIon Shaping), a new paradigm in offline safe RL designed to overcome these critical limitatio…

    Submitted 19 July, 2024; originally announced July 2024.

  21. arXiv:2407.14497  [pdf, other]

    quant-ph

    Observable-Driven Speed-ups in Quantum Simulations

    Authors: Wenjun Yu, Jue Xu, Qi Zhao

    Abstract: As quantum technology advances, quantum simulation becomes increasingly promising, with significant implications for quantum many-body physics and quantum chemistry. Despite being one of the most accessible simulation methods, the product formula encounters challenges due to the pessimistic gate count estimation. In this work, we elucidate how observable knowledge can accelerate quantum simulation…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 37 pages, 5 figures

  22. arXiv:2407.10943  [pdf, other]

    cs.RO cs.CV

    GRUtopia: Dream General Robots in a City at Scale

    Authors: Hanqing Wang, Jiahe Chen, Wensi Huang, Qingwei Ben, Tai Wang, Boyu Mi, Tao Huang, Siheng Zhao, Yilun Chen, Sizhe Yang, Peizhou Cao, Wenye Yu, Zichao Ye, Jialun Li, Junfeng Long, Zirui Wang, Huiling Wang, Ying Zhao, Zhongying Tu, Yu Qiao, Dahua Lin, Jiangmiao Pang

    Abstract: Recent works have been exploring the scaling laws in the field of Embodied AI. Given the prohibitive costs of collecting real-world data, we believe the Simulation-to-Real (Sim2Real) paradigm is a crucial step for scaling the learning of embodied models. This paper introduces project GRUtopia, the first simulated interactive 3D society designed for various robots. It features several advancements:…

    Submitted 15 July, 2024; originally announced July 2024.

  23. arXiv:2407.10701  [pdf, other]

    cs.CL

    DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems

    Authors: Anni Zou, Wenhao Yu, Hongming Zhang, Kaixin Ma, Deng Cai, Zhuosheng Zhang, Hai Zhao, Dong Yu

    Abstract: Recently, there has been a growing interest among large language model (LLM) developers in LLM-based document reading systems, which enable users to upload their own documents and pose questions related to the document contents, going beyond simple reading comprehension tasks. Consequently, these systems have been carefully designed to tackle challenges such as file parsing, metadata extraction, m…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Work in progress

  24. Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning

    Authors: Jiakai Tang, Sunhao Dai, Zexu Sun, Xu Chen, Jun Xu, Wenhui Yu, Lantao Hu, Peng Jiang, Han Li

    Abstract: In recent years, graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity. However, most existing GCL models rely on heuristic approaches and usually assume entity independence when constructing contrastive views. We argue that these methods struggle to strike a balance between semantic invariance an…

    Submitted 21 July, 2024; v1 submitted 14 July, 2024; originally announced July 2024.

    Comments: KDD 2024

  25. arXiv:2407.09932  [pdf, other]

    quant-ph

    Quantum Clock Synchronization Network with Silicon-chip Dual-Pumped Entangled Photon Source

    Authors: J. A. Li, H. Han, X. P. Huang, B. Y. Tang, K. Guo, J. Q. Huang, S. Y. Xiong, W. R. Yu, Z. J. Zhang, J. B. Yang, B. Liu, H. Chen, Z. K. Lu

    Abstract: In this paper, we propose a quantum clock synchronization (QCS) network scheme with a silicon-chip dual-pumped entangled photon source. This scheme couples two pump beams into the silicon-based waveguide, where degenerate and non-degenerate spontaneous four-wave mixing (SFWM) occurs, generating entanglement between one signal channel and three idler channels. The entangled photons are distributed to…

    Submitted 13 July, 2024; originally announced July 2024.

  26. arXiv:2407.09324  [pdf, other]

    cs.LG cs.AI cs.IT

    Provable Privacy Advantages of Decentralized Federated Learning via Distributed Optimization

    Authors: Wenrui Yu, Qiongxiu Li, Milan Lopuhaä-Zwakenberg, Mads Græsbøll Christensen, Richard Heusdens

    Abstract: Federated learning (FL) emerged as a paradigm designed to improve data privacy by enabling data to reside at its source, thus embedding privacy as a core consideration in FL architectures, whether centralized or decentralized. Contrasting with recent findings by Pasquini et al., which suggest that decentralized FL does not empirically offer any additional privacy or security benefits over centrali…

    Submitted 12 July, 2024; originally announced July 2024.

  27. arXiv:2407.09013  [pdf, ps, other]

    cs.AI cs.LG

    Procedural Content Generation via Generative Artificial Intelligence

    Authors: Xinyu Mao, Wanli Yu, Kazunori D Yamada, Michael R. Zielewski

    Abstract: Attempts to use machine learning in procedural content generation (PCG) have been made in the past. In this survey paper, we investigate how generative artificial intelligence (AI), which saw a significant increase in interest in the mid-2010s, is being used for PCG. We review applications of generative AI for the creation of various types of content, including terrains, items, and even storylines. While generative AI is e…

    Submitted 12 July, 2024; originally announced July 2024.

  28. PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization

    Authors: Yuyang Ye, Lu-An Tang, Haoyu Wang, Runlong Yu, Wenchao Yu, Erhu He, Haifeng Chen, Hui Xiong

    Abstract: Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods offer promising enhancements for sequential optimization processes and can be used for reducing carbon emission…

    Submitted 11 July, 2024; originally announced July 2024.

  29. arXiv:2407.08421  [pdf, other]

    astro-ph.HE

    X-ray spectral and timing evolution during the 2018 outburst of MAXI J1820+070

    Authors: YaXing Li, Zhen Yan, ChenXu Gao, Wenfei Yu

    Abstract: We made use of high-cadence observations from $Insight$-HXMT and $NICER$ to scrutinize the spectral and timing evolution during the 2018 outburst of the black hole X-ray binary (BHXRB) MAXI J1820+070. Its hardness-intensity diagram (HID) displays a "q"-like track including all the spectral states, along a unique loop in the hard state. The tracks observed in the HID are anticipated in the evolu…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 10 figures, submitted to MNRAS

  30. arXiv:2407.07775  [pdf, other]

    cs.RO cs.AI

    Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs

    Authors: Hao-Tien Lewis Chiang, Zhuo Xu, Zipeng Fu, Mithun George Jacob, Tingnan Zhang, Tsang-Wei Edward Lee, Wenhao Yu, Connor Schenck, David Rendleman, Dhruv Shah, Fei Xia, Jasmine Hsu, Jonathan Hoech, Pete Florence, Sean Kirmani, Sumeet Singh, Vikas Sindhwani, Carolina Parada, Chelsea Finn, Peng Xu, Sergey Levine, Jie Tan

    Abstract: An elusive goal in navigation research is to build an intelligent agent that can understand multimodal instructions including natural language and image, and perform useful navigation. To achieve this, we study a widely useful category of navigation tasks we call Multimodal Instruction Navigation with demonstration Tours (MINT), in which the environment prior is provided through a previously recor…

    Submitted 12 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

  31. arXiv:2407.07304  [pdf, other]

    cs.AI

    Inference Performance Optimization for Large Language Models on CPUs

    Authors: Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

    Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardw…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 5 pages, 6 figures, ICML 2024 on Foundation Models in the Wild

  32. arXiv:2407.06222  [pdf, other]

    math.LO

    Formalization of the Filter Extension Principle (FEP) in Coq

    Authors: Guowei Dou, Wensheng Yu

    Abstract: The Filter Extension Principle (FEP) asserts that every filter can be extended to an ultrafilter, which plays a crucial role in the quest for non-principal ultrafilters. Non-principal ultrafilters find widespread applications in logic, set theory, topology, model theory, and especially non-standard extensions of algebraic structures. Since non-principal ultrafilters are challenging to construct di…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Conference on Intelligent Networked Things, 2024 (CINT2024)

  33. arXiv:2407.05540  [pdf, other]

    cs.CV

    GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

    Authors: Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan

    Abstract: Recent advances in learning multi-modal representations have seen success in biomedical domains. While established techniques enable handling multi-modal information, challenges arise when they are extended to various clinical modalities and practical modality-missing settings due to the inherent modality gaps. To tackle these, we propose an innovative Modality-prompted Heterogeneous Graph fo…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  34. arXiv:2407.05413  [pdf, other]

    cs.AI cs.CL cs.LG

    SBoRA: Low-Rank Adaptation with Regional Weight Updates

    Authors: Lai-Man Po, Yuyang Liu, Haoxuan Wu, Tianqi Zhang, Wing-Yin Yu, Zeyu Jiang, Kun Li

    Abstract: This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models that builds upon the pioneering works of Low-Rank Adaptation (LoRA) and Orthogonal Adaptation. SBoRA further reduces the computational and memory requirements of LoRA while enhancing learning performance. By leveraging orthogonal standard basis vectors to initialize one of…

    Submitted 10 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: 15 pages, 2 figures

  35. arXiv:2407.05236  [pdf, other]

    astro-ph.HE

    A timing view of the additional high-energy spectral component discovered in the black hole candidate Swift J1727.8-1613

    Authors: Zi-Xu Yang, Liang Zhang, Shuang-Nan Zhang, L. Tao, Shu Zhang, Ruican Ma, Qingcui Bu, Yue Huang, He-Xin Liu, Wei Yu, Guang C. Xiao, Peng-Ju Wang, Hua Feng, Li-Ming Song, Xiang Ma, Mingyu Ge, QingChang Zhao, J. L. Qu

    Abstract: We present an energy-dependent analysis for the type-C quasi-periodic oscillations (QPOs) observed in the black hole X-ray binary Swift J1727.8-1613 using Insight-HXMT observations. We find that the QPO fractional rms at energies above 40 keV is significantly higher than that below 20 keV. This is the first report of a high energy (HE)-rms excess in the rms spectrum of a black hole X-ray binary. I…

    Submitted 6 July, 2024; originally announced July 2024.

  36. arXiv:2407.03971  [pdf, other]

    cs.CV

    MineNetCD: A Benchmark for Global Mining Change Detection on Remote Sensing Imagery

    Authors: Weikang Yu, Xiaokang Zhang, Xiao Xiang Zhu, Richard Gloaguen, Pedram Ghamisi

    Abstract: Monitoring changes triggered by mining activities is crucial for industrial controlling, environmental management and regulatory compliance, yet it poses significant challenges due to the vast and often remote locations of mining sites. Remote sensing technologies have increasingly become indispensable to detect and analyze these changes over time. We thus introduce MineNetCD, a comprehensive benc…

    Submitted 4 July, 2024; originally announced July 2024.

  37. arXiv:2407.02190  [pdf, other]

    cs.RO

    I2EKF-LO: A Dual-Iteration Extended Kalman Filter Based LiDAR Odometry

    Authors: Wenlu Yu, Jie Xu, Chengwei Zhao, Lijun Zhao, Thien-Minh Nguyen, Shenghai Yuan, Mingming Bai, Lihua Xie

    Abstract: LiDAR odometry is a pivotal technology in the fields of autonomous driving and autonomous mobile robotics. However, most current works focus on nonlinear optimization methods, and many challenges still exist in using the traditional Iterative Extended Kalman Filter (IEKF) framework to tackle the problem: IEKF only iterates over the observation equation, relying on a rough estimate of the…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by IROS 2024

  38. arXiv:2407.01950  [pdf, other]

    cs.RO cs.AI

    LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance

    Authors: Wenhao Yu, Jie Peng, Huanyu Yang, Junrui Zhang, Yifan Duan, Jianmin Ji, Yanyong Zhang

    Abstract: The conditional diffusion model has been demonstrated as an efficient tool for learning robot policies, owing to its ability to accurately model the conditional distribution of policies. The intricate nature of real-world scenarios, characterized by dynamic obstacles and maze-like structures, underscores the complexity of robot local navigation decision-making as a conditional distribution pro…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages, 6 figures, accepted by IROS 2024

  39. arXiv:2407.01875  [pdf, ps, other]

    cs.AI

    Spatio-Temporal Graphical Counterfactuals: An Overview

    Authors: Mingyu Kang, Duxin Chen, Ziyuan Pu, Jianxi Gao, Wenwu Yu

    Abstract: Counterfactual thinking is a critical yet challenging topic for artificial intelligence to learn knowledge from data and ultimately improve its performance in new scenarios. Many research works, including Potential Outcome Model and Structural Causal Model, have been proposed to realize it. However, their modelings, theoretical foundations and application approaches are usually different. More…

    Submitted 1 July, 2024; originally announced July 2024.

  40. arXiv:2407.01029  [pdf, other]

    cs.CV

    EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting

    Authors: Chenxin Li, Brandon Y. Feng, Yifan Liu, Hengyu Liu, Cheng Wang, Weihao Yu, Yixuan Yuan

    Abstract: 3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  41. arXiv:2407.00946  [pdf]

    cond-mat.mtrl-sci

    Atomic cluster expansion interatomic potential for defects and thermodynamics of Cu-W system

    Authors: Jiahao Pan, Huiqun Cheng, Gaosheng Yan, Lei Zhang, Wenshan Yu, Shengping Shen

    Abstract: The unique properties exhibited in immiscible metals, such as excellent strength, hardness, and radiation-damage tolerance, have stimulated the interest of many researchers. As a typical immiscible metal system, Cu-W nano-multilayers combine the plasticity of copper and the strength of tungsten, making them suitable candidates for applications in aerospace, nuclear fusion engineering, and elect…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 26 pages, 14 figures

  42. arXiv:2407.00029  [pdf, other]

    cs.DC

    Distributed Inference Performance Optimization for LLMs on CPUs

    Authors: Pujiang He, Shan Zhou, Changqing Li, Wenhuan Huang, Weifei Yu, Duyi Wang, Chen Meng, Sheng Gui

    Abstract: Large language models (LLMs) hold tremendous potential for addressing numerous real-world challenges, yet they typically demand significant computational resources and memory. Deploying LLMs onto a resource-limited hardware device with restricted memory capacity presents considerable challenges. Distributed computing emerges as a prevalent strategy to mitigate single-node memory constraints and ex…

    Submitted 16 May, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures, Practical ML for Low Resource Settings Workshop @ ICLR 2024

  43. arXiv:2406.20019  [pdf, other]

    cs.IT

    Capacity Bounds for Broadcast Channels with Bidirectional Conferencing Decoders

    Authors: Reza K. Farsani, Wei Yu

    Abstract: The two-user broadcast channel (BC) with receivers connected by cooperative links of given capacities, known as conferencing decoders, is considered. A novel outer bound on the capacity region is established. This outer bound is derived using multiple applications of the Csiszár-Körner identity. New achievable rate regions are also presented. A first achievable rate region is derived by applying M…

    Submitted 28 June, 2024; originally announced June 2024.

  44. arXiv:2406.19820  [pdf, other]

    cs.CL cs.AI

    BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering

    Authors: Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang, Kun Zhu, Xiyuan Du, Weijiang Yu, Ming Liu, Bing Qin

    Abstract: Large language models (LLMs) have demonstrated strong reasoning capabilities. Nevertheless, they still suffer from factual errors when tackling knowledge-intensive tasks. Retrieval-augmented reasoning represents a promising approach. However, significant challenges still persist, including inaccurate and insufficient retrieval for complex questions, as well as difficulty in integrating multi-sourc…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  45. arXiv:2406.19627  [pdf]

    eess.SY

    Practical Power System Inertia Monitoring Based on Pumped Storage Hydropower Operation Signature

    Authors: Hongyu Li, Chang Chen, Mark Baldwin, Shutang You, Wenpeng Yu, Lin Zhu, Yilu Liu

    Abstract: This paper proposes a practical method to monitor power system inertia using Pumped Storage Hydropower (PSH) switching-off events. This approach offers real-time system-level inertia estimation with minimal expenses, no disruption, and the inclusion of behind-the-meter inertia. First, accurate inertia estimation is achieved through improved RoCoF calculation that accounts for pre-event RoCoF, redu…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 8 pages, 15 figures

  46. arXiv:2406.19483  [pdf, ps, other]

    eess.SP

    Localization in Multipath Environments via Active Sensing with Reconfigurable Intelligent Surfaces

    Authors: Yinghan Li, Wei Yu

    Abstract: This letter investigates an uplink pilot-based wireless indoor localization problem in a multipath environment for a single-input single-output (SISO) narrowband communication system aided by reconfigurable intelligent surface (RIS). The indoor localization problem is challenging because the uplink channel consists of multiple overlapping propagation paths with varying amplitudes and phases, which…

    Submitted 8 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  47. arXiv:2406.18361  [pdf, other]

    cs.CV cs.AI eess.IV

    Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process

    Authors: Tianyu Lin, Zhiguang Chen, Zhonghao Yan, Weijiang Yu, Fudan Zheng

    Abstract: Diffusion models have demonstrated their effectiveness across various generative tasks. However, when applied to medical image segmentation, these models encounter several challenges, including significant resource and time requirements. They also necessitate a multi-step reverse process and multiple samples to produce reliable predictions. To address these challenges, we introduce the first laten…

    Submitted 9 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at MICCAI 2024. For code and citation info, see https://github.com/lin-tianyu/Stable-Diffusion-Seg

  48. arXiv:2406.18008  [pdf, other]

    cs.IT

    Rate-Distortion-Perception Tradeoff for Gaussian Vector Sources

    Authors: Jingjing Qian, Sadaf Salehkalaibar, Jun Chen, Ashish Khisti, Wei Yu, Wuxian Shi, Yiqun Ge, Wen Tong

    Abstract: This paper studies the rate-distortion-perception (RDP) tradeoff for a Gaussian vector source coding problem where the goal is to compress the multi-component source subject to distortion and perception constraints. The purpose of imposing a perception constraint is to ensure visually pleasing reconstructions. This paper studies this RDP setting with either the Kullback-Leibler (KL) divergence or…

    Submitted 25 June, 2024; originally announced June 2024.

  49. arXiv:2406.17269  [pdf, other]

    hep-th

    Elko as an inflaton candidate

    Authors: Xinglong Chen, Cheng-Yang Lee, Yanjiao Ma, Haomin Rao, Wenqi Yu, Siyi Zhou

    Abstract: Elko is a spin-half fermion with a two-fold Wigner degeneracy and Klein-Gordon dynamics. In this paper, we show that in a spatially flat FLRW space-time, slow-roll inflation can be initiated by the homogeneous Elko fields. The inflaton is a composite scalar field obtained by contracting the spinor field with its dual. This is possible because the background evolution as described by the Friedmann…

    Submitted 29 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

  50. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires…

    Submitted 26 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)