
Showing 101–150 of 1,468 results for author: Yu, W

  1. arXiv:2404.14604  [pdf, other]

    cs.CL

    Describe-then-Reason: Improving Multimodal Mathematical Reasoning through Visual Comprehension Training

    Authors: Mengzhao Jia, Zhihan Zhang, Wenhao Yu, Fangkai Jiao, Meng Jiang

    Abstract: Open-source multimodal large language models (MLLMs) excel in various tasks involving textual and visual inputs but still struggle with complex multimodal mathematical reasoning, lagging behind proprietary models like GPT-4V(ision) and Gemini-Pro. Although fine-tuning with intermediate steps (i.e., rationales) elicits some mathematical reasoning skills, the resulting models still fall short in vis…

    Submitted 25 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  2. arXiv:2404.14405  [pdf, other]

    cs.RO

    Learning H-Infinity Locomotion Control

    Authors: Junfeng Long, Wenye Yu, Quanyi Li, Zirui Wang, Dahua Lin, Jiangmiao Pang

    Abstract: Stable locomotion in precipitous environments is an essential task for quadruped robots, requiring the ability to resist various external disturbances. Recent neural policies enhance robustness against disturbances by learning to resist external forces sampled from a fixed distribution in the simulated environment. However, the force generation process doesn't consider the robot's current state, m…

    Submitted 12 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: Project Page: https://junfeng-long.github.io/HINF/

  3. Discovery of a long thermonuclear X-ray burst from the ultra-compact binary 4U 1850$-$087

    Authors: Yongqi Lu, Zhaosheng Li, Wenhui Yu, Yuanyue Pan, Maurizio Falanga

    Abstract: We report the detection of a long X-ray burst triggered on MJD 60171.65 from the ultra-compact binary 4U 1850$-$087 by the Monitor of All-sky X-ray Image and Neutron Star Interior Composition Explorer (NICER). We analyse the NICER data observed in between MJD 60095.19$-$60177.43, including one observation covered part of the long X-ray burst tail, i.e., $0.15-3.8$ hr after the trigger. The persist…

    Submitted 30 July, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures, to match the published version in ApJ

    Journal ref: ApJ 969, 15 (2024)

  4. arXiv:2404.13848  [pdf, other]

    cs.CV

    DSDRNet: Disentangling Representation and Reconstruct Network for Domain Generalization

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li

    Abstract: Domain generalization faces challenges due to the distribution shift between training and testing sets, and the presence of unseen target domains. Common solutions include domain alignment, meta-learning, data augmentation, or ensemble learning, all of which rely on domain labels or domain adversarial techniques. In this paper, we propose a Dual-Stream Separation and Reconstruction Network, dubbed…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to IJCNN 2024

  5. arXiv:2404.13640  [pdf, other]

    cs.MM cs.CV eess.IV

    Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer

    Authors: Kepeng Xu, Li Xu, Gang He, Wenxin Yu, Yunsong Li

    Abstract: Multiple complex degradations are coupled in low-quality video faces in the real world. Therefore, blind video face restoration is a highly challenging ill-posed problem, requiring not only hallucinating high-fidelity details but also enhancing temporal coherence across diverse pose variations. Restoring each frame independently in a naive manner inevitably introduces temporal incoherence and arti…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 9 pages

  6. arXiv:2404.13392  [pdf, other]

    cs.IT eess.SP math.OC

    Beamforming Design for Integrated Sensing and Communications Using Uplink-Downlink Duality

    Authors: Kareem M. Attiah, Wei Yu

    Abstract: This paper presents a novel optimization framework for beamforming design in integrated sensing and communication systems where a base station seeks to minimize the Bayesian Cramér-Rao bound of a sensing problem while satisfying quality of service constraints for the communication users. Prior approaches formulate the design problem as a semidefinite program for which acquiring a beamforming solut…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 6 pages, 2 figures, accepted at ISIT2024

  7. arXiv:2404.12879  [pdf, other]

    cs.CL

    Unlocking Multi-View Insights in Knowledge-Dense Retrieval-Augmented Generation

    Authors: Guanhua Chen, Wenhan Yu, Lei Sha

    Abstract: While Retrieval-Augmented Generation (RAG) plays a crucial role in the application of Large Language Models (LLMs), existing retrieval methods in knowledge-dense domains like law and medicine still suffer from a lack of multi-perspective views, which are essential for improving interpretability and reliability. Previous research on multi-view retrieval often focused solely on different semantic fo…

    Submitted 19 April, 2024; originally announced April 2024.

  8. arXiv:2404.12588  [pdf, other]

    cs.CV cs.LG

    Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Weiping Zhu, Wei Yu, Shijun Li

    Abstract: Adapter-based parameter-efficient transfer learning has achieved exciting results in vision-language models. Traditional adapter methods often require training or fine-tuning, facing challenges such as insufficient samples or resource limitations. While some methods overcome the need for training by leveraging image modality cache and retrieval, they overlook the text modality's importance and cro…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to ICME 2024

  9. MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification

    Authors: Weikang Yu, Xiaokang Zhang, Samiran Das, Xiao Xiang Zhu, Pedram Ghamisi

    Abstract: Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature. It is typically regarded as a pixel-wise labeling task that aims to classify each pixel as changed or unchanged. Although per-pixel classification networks in encoder-decoder structures have shown dominance, they still suffer from imprecise boundaries and incomplete object deli…

    Submitted 18 April, 2024; originally announced April 2024.

  10. arXiv:2404.11979  [pdf, other]

    cs.CV

    MTGA: Multi-view Temporal Granularity aligned Aggregation for Event-based Lip-reading

    Authors: Wenhao Zhang, Jun Wang, Yong Luo, Lei Yu, Wei Yu, Zheng He

    Abstract: Lip-reading is to utilize the visual information of the speaker's lip movements to recognize words and sentences. Existing event-based lip-reading solutions integrate different frame rate branches to learn spatio-temporal features of varying granularities. However, aggregating events into event frames inevitably leads to the loss of fine-grained temporal information within frames. To remedy this d…

    Submitted 18 April, 2024; originally announced April 2024.

  11. arXiv:2404.11249  [pdf, other]

    cs.CV

    A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene

    Authors: Wenbo Zhang, Yifan Zhang, Jianfeng Lin, Binqiang Huang, Jinlu Zhang, Wenhao Yu

    Abstract: Pre-trained vision-language (V-L) models such as CLIP have shown excellent performance in many downstream cross-modal tasks. However, most of them are only applicable to the English context. Subsequent research has focused on this problem and proposed improved models, such as CN-CLIP and AltCLIP, to facilitate their applicability to Chinese and even other languages. Nevertheless, these models suff…

    Submitted 17 April, 2024; originally announced April 2024.

  12. arXiv:2404.10948  [pdf, other]

    hep-ex

    First double-differential cross section measurement of neutral-current $π^0$ production in neutrino-argon scattering in the MicroBooNE detector

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book, et al. (166 additional authors not shown)

    Abstract: We report the first double-differential cross section measurement of neutral-current neutral pion (NC$π^0$) production in neutrino-argon scattering, as well as single-differential measurements of the same channel in terms of final states with and without protons. The kinematic variables of interest for these measurements are the $π^0$ momentum and the $π^0$ scattering angle with respect to the neu…

    Submitted 16 April, 2024; originally announced April 2024.

    Report number: FERMILAB-PUB-24-0125

  13. arXiv:2404.09949  [pdf, other]

    hep-ex physics.ins-det

    Measurement of the differential cross section for neutral pion production in charged-current muon neutrino interactions on argon with the MicroBooNE detector

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, G. Barr, D. Barrow, J. Barrow, V. Basque, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book, M. B. Brunetti, L. Camilleri, et al. (163 additional authors not shown)

    Abstract: We present a measurement of neutral pion production in charged-current interactions using data recorded with the MicroBooNE detector exposed to Fermilab's booster neutrino beam. The signal comprises one muon, one neutral pion, any number of nucleons, and no charged pions. Studying neutral pion production in the MicroBooNE detector provides an opportunity to better understand neutrino-argon interac…

    Submitted 6 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Report number: FERMILAB-PUB-24-0142-CSAID-PPD

  14. arXiv:2404.09276  [pdf, other]

    cs.MS math.NA

    Algorithm xxx: Faster Randomized SVD with Dynamic Shifts

    Authors: Xu Feng, Wenjian Yu, Yuyang Xie, Jie Tang

    Abstract: Aiming to provide a faster and convenient truncated SVD algorithm for large sparse matrices from real applications (i.e. for computing a few of largest singular values and the corresponding singular vectors), a dynamically shifted power iteration technique is applied to improve the accuracy of the randomized SVD method. This results in a dynamic shifts based randomized SVD (dashSVD) algorithm, whi…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 26 pages, accepted by ACM Transactions on Mathematical Software

  15. arXiv:2404.08675  [pdf, other]

    cs.IR cs.AI cs.CL

    RecGPT: Generative Personalized Prompts for Sequential Recommendation via ChatGPT Training Paradigm

    Authors: Yabin Zhang, Wenhui Yu, Erhan Zhang, Xu Chen, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as words, which has similar underlying pattern with ChatGPT, we design a new chat framework in item index level for the recommendation task. Our novelty mainly contains three parts: model, training and inference. For the model p…

    Submitted 6 April, 2024; originally announced April 2024.

  16. arXiv:2404.07545  [pdf, other]

    cs.CV

    Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion

    Authors: Ang Li, Anning Hu, Wei Xi, Wenxian Yu, Danping Zou

    Abstract: Accurate and dense depth estimation with stereo cameras and LiDAR is an important task for automatic driving and robotic perception. While sparse hints from LiDAR points have improved cost aggregation in stereo matching, their effectiveness is limited by the low density and non-uniform distribution. To address this issue, we propose a novel stereo-LiDAR depth estimation network with Semi-Dense hin…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted in ICRA 2024. 8 pages, 6 figures

  17. arXiv:2404.07490  [pdf, other]

    cond-mat.str-el

    Low-energy spin dynamics in a Kitaev material Na3Ni2BiO6 investigated by NMR

    Authors: Xinyu Shi, Yi Cui, Yanyan Shangguan, Xiaoyu Xu, Zhanlong Wu, Ze Hu, Shuo Li, Kefan Du, Ying Chen, Long Ma, Zhengxin Liu, Jinsheng Wen, Jinshan Zhang, Weiqiang Yu

    Abstract: We performed 23Na NMR and magnetization measurements on an S = 1, quasi-2D honeycomb lattice antiferromagnet Na3Ni2BiO6. A large positive Curie-Weiss constant of 22.9 K is observed. The NMR spectra at low fields are consistent with a "zigzag" magnetic order, indicating a large easy-axis anisotropy. With field applied along the c* axis, the NMR spectra confirm the existence of a 1/3-magnetization p…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 7 pages, 7 figures

  18. arXiv:2404.06037  [pdf, other]

    cs.DC

    A Survey of Distributed Graph Algorithms on Massive Graphs

    Authors: Lingkai Meng, Yu Shao, Long Yuan, Longbin Lai, Peng Cheng, Xue Li, Wenyuan Yu, Wenjie Zhang, Xuemin Lin, Jingren Zhou

    Abstract: Distributed processing of large-scale graph data has many practical applications and has been widely studied. In recent years, a lot of distributed graph processing frameworks and algorithms have been proposed. While many efforts have been devoted to analyzing these, with most analyzing them based on programming models, less research focuses on understanding their challenges in distributed environ…

    Submitted 9 April, 2024; originally announced April 2024.

  19. arXiv:2404.05260  [pdf, other]

    cs.AR

    SRAM-PG: Power Delivery Network Benchmarks from SRAM Circuits

    Authors: Shan Shen, Zhiqiang Liu, Wenjian Yu

    Abstract: Designing the power delivery network (PDN) in very large-scale integrated (VLSI) circuits is increasingly important, especially for nowadays low-power integrated circuit (IC) design. In order to ensure that the designed PDN enables a low level of voltage drop and noise which is required for the success of IC design, accurate analysis of PDN is largely demanded and brings a challenge of computation…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Oral presentation at ISQED'24

  20. arXiv:2404.04538  [pdf, other]

    cs.AI cs.CL

    Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning

    Authors: Juncheng Yang, Zuchao Li, Shuai Xie, Wei Yu, Shijun Li, Bo Du

    Abstract: The chain-of-thought technique has been received well in multi-modal tasks. It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts. However, human thought processes are predominantly non-linear, as they encompass multiple aspects simultaneously and employ dynamic adjustment and updating mechanisms. Therefore, we propose a…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: This paper is accepted to LREC-COLING 2024

  21. arXiv:2404.03411  [pdf, ps, other]

    cs.LG cs.CL cs.CR

    Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?

    Authors: Shuo Chen, Zhen Han, Bailan He, Zifeng Ding, Wenqian Yu, Philip Torr, Volker Tresp, Jindong Gu

    Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the textual modality and extend the jailbreak attack to Multimodal Large Language Models (MLLMs) by perturbing the visual input. However, the absence of a universal evaluation benchmark complicates the performance reproductio…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: technical report

  22. arXiv:2404.00505  [pdf, other]

    cs.LG cs.AI cs.NI stat.ML

    Transfer Learning with Reconstruction Loss

    Authors: Wei Cui, Wei Yu

    Abstract: In most applications of utilizing neural networks for mathematical optimization, a dedicated model is trained for each specific optimization objective. However, in many scenarios, several distinct yet correlated objectives or tasks often need to be optimized on the same set of problem inputs. Instead of independently training a different neural network for each problem separately, it would be more…

    Submitted 11 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: 16 pages, 5 figures. To appear in IEEE Transactions on Machine Learning in Communications and Networking (TMLCN)

  23. arXiv:2403.19574  [pdf, other]

    hep-ex

    Measurement of double-differential cross sections for mesonless charged-current muon neutrino interactions on argon with final-state protons using the MicroBooNE detector

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, G. Barr, D. Barrow, J. Barrow, V. Basque, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book, M. B. Brunetti, L. Camilleri, et al. (163 additional authors not shown)

    Abstract: Charged-current neutrino interactions with final states containing zero mesons and at least one proton are of high interest for current and future accelerator-based neutrino oscillation experiments. Using the Booster Neutrino Beam and the MicroBooNE detector at Fermi National Accelerator Laboratory, we have obtained the first double-differential cross section measurements of this channel for muon…

    Submitted 16 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: 83 pages, 67 figures (including supplemental material). For v2, added oversized files in extended data release

    Report number: FERMILAB-PUB-24-0120-AD-CSAID-LBNF-PPD-TD

  24. arXiv:2403.19128  [pdf, other]

    cs.CV

    OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

    Authors: Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

    Abstract: Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions. Various methods have been proposed to address the challenging problem of VsTP. However, due to the diversified targets and heterogeneous…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  25. arXiv:2403.18272  [pdf, other]

    astro-ph.HE

    Recovery of High-energy Low-frequency Quasi-periodic Oscillations from Black Hole X-ray Binary MAXI J1535-571 with a Hilbert-Huang Transform Method

    Authors: Qingcang Shui, Shu Zhang, Shuangnan Zhang, Yupeng Chen, Lingda Kong, Jingqiang Peng, Long Ji, Pengju Wang, Zhi Chang, Zhuoli Yu, Hongxing Yin, Jinlu Qu, Lian Tao, Mingyu Ge, Xiang Ma, Liang Zhang, Wei Yu, Jian Li

    Abstract: We propose a method based on the Hilbert-Huang transform (HHT) to recover the high-energy waveform of low-frequency quasi-periodic oscillations (LFQPOs). Based on the method, we successfully obtain the modulation of the phase-folded light curve above 170 keV using the QPO phase reconstructed at lower energies in MAXI J1535-571 with Insight-HXMT observations. A comprehensive simulation study is con…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 21 pages, 15 figures, accepted for publication in ApJL

  26. arXiv:2403.18197  [pdf, other]

    cs.RO

    LocoMan: Advancing Versatile Quadrupedal Dexterity with Lightweight Loco-Manipulators

    Authors: Changyi Lin, Xingyu Liu, Yuxiang Yang, Yaru Niu, Wenhao Yu, Tingnan Zhang, Jie Tan, Byron Boots, Ding Zhao

    Abstract: Quadrupedal robots have emerged as versatile agents capable of locomoting and manipulating in complex environments. Traditional designs typically rely on the robot's inherent body parts or incorporate top-mounted arms for manipulation tasks. However, these configurations may limit the robot's operational dexterity, efficiency and adaptability, particularly in cluttered or constrained spaces. In th…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Project page: https://linchangyi1.github.io/LocoMan

  27. arXiv:2403.15637  [pdf, other]

    cs.RO

    CoNVOI: Context-aware Navigation using Vision Language Models in Outdoor and Indoor Environments

    Authors: Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Mohamed Elnoor, Anuj Zore, Brian Ichter, Fei Xia, Jie Tan, Wenhao Yu, Dinesh Manocha

    Abstract: We present ConVOI, a novel method for autonomous robot navigation in real-world indoor and outdoor environments using Vision Language Models (VLMs). We employ VLMs in two ways: first, we leverage their zero-shot image classification capability to identify the context or scenario (e.g., indoor corridor, outdoor terrain, crosswalk, etc) of the robot's surroundings, and formulate context-based naviga…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures

  28. arXiv:2403.14240  [pdf, other]

    cs.CV

    Weak Supervision with Arbitrary Single Frame for Micro- and Macro-expression Spotting

    Authors: Wang-Wang Yu, Xian-Shi Zhang, Fu-Ya Luo, Yijun Cao, Kai-Fu Yang, Hong-Mei Yan, Yong-Jie Li

    Abstract: Frame-level micro- and macro-expression spotting methods require time-consuming frame-by-frame observation during annotation. Meanwhile, video-level spotting lacks sufficient information about the location and number of expressions during training, resulting in significantly inferior performance compared with fully-supervised spotting. To bridge this gap, we propose a point-level weakly-supervised…

    Submitted 21 March, 2024; originally announced March 2024.

  29. arXiv:2403.14168  [pdf, other]

    cs.CL

    M$^3$AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset

    Authors: Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang

    Abstract: Publishing open-source academic video recordings is an emergent and prevalent approach to sharing knowledge online. Such videos carry rich multimodal information including speech, the facial and body movements of the speakers, as well as the texts and pictures in the slides and possibly even the papers. Although multiple academic video datasets have been constructed and released, few of them suppo…

    Submitted 4 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: ACL 2024 Main Conference. Project website: https://jack-zc8.github.io/M3AV-dataset-page

  30. arXiv:2403.13127  [pdf, other]

    astro-ph.HE

    Timing analysis of the newly discovered black hole candidate Swift J1727.8-1613 with Insight-HXMT

    Authors: Wei Yu, Qing-Cui Bu, Shuang-Nan Zhang, He-Xin Liu, Liang Zhang, Lorenzo Ducci, Lian Tao, Andrea Santangelo, Victor Doroshenko, Yue Huang, Zi-Xu Yang, Jin-Lu Qu

    Abstract: We present the results obtained from an X-ray timing study of the new black hole candidate (BHC) Swift J1727.8-1613. The work is based on Hard X-ray Modulation Telescope (Insight-HXMT) observations carried out during the 2023 outburst. Prominent type-C low-frequency Quasi-periodic Oscillations (LFQPOs) are detected throughout the observations. With the substantial effective area of the Insight-HXM…

    Submitted 19 March, 2024; originally announced March 2024.

  31. arXiv:2403.12471  [pdf, other]

    cs.RO

    Theoretical Modeling and Bio-inspired Trajectory Optimization of A Multiple-locomotion Origami Robot

    Authors: Keqi Zhu, Haotian Guo, Wei Yu, Hassen Nigatu, Tong Li, Huixu Dong

    Abstract: Recent research on mobile robots has focused on increasing their adaptability to unpredictable and unstructured environments using soft materials and structures. However, the determination of key design parameters and control over these compliant robots are predominantly iterated through experiments, lacking a solid theoretical foundation. To improve their efficiency, this paper aims to provide ma…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages

  32. arXiv:2403.10340  [pdf, other]

    cs.CV cs.RO

    Thermal-NeRF: Neural Radiance Fields from an Infrared Camera

    Authors: Tianxiang Ye, Qi Wu, Junyuan Deng, Guoqing Liu, Liu Liu, Songpengcheng Xia, Liang Pang, Wenxian Yu, Ling Pei

    Abstract: In recent years, Neural Radiance Fields (NeRFs) have demonstrated significant potential in encoding highly-detailed 3D geometry and environmental appearance, positioning themselves as a promising alternative to traditional explicit representation for 3D scene reconstruction. However, the predominant reliance on RGB imaging presupposes ideal lighting conditions: a premise frequently unmet in roboti…

    Submitted 15 March, 2024; originally announced March 2024.

  33. arXiv:2403.09004  [pdf, ps, other]

    cs.IT eess.SP

    Meta-Learning-Based Fronthaul Compression for Cloud Radio Access Networks

    Authors: Ruihua Qiao, Tao Jiang, Wei Yu

    Abstract: This paper investigates the fronthaul compression problem in a user-centric cloud radio access network, in which single-antenna users are served by a central processor (CP) cooperatively via a cluster of remote radio heads (RRHs). To satisfy the fronthaul capacity constraint, this paper proposes a transform-compress-forward scheme, which consists of well-designed transformation matrices and unifor…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 15 Pages, 13 Figures; accepted in IEEE Transactions on Wireless Communications

  34. arXiv:2403.08651  [pdf, other]

    cs.CV

    HAIFIT: Human-to-AI Fashion Image Translation

    Authors: Jianan Jiang, Xinglin Li, Weiren Yu, Di Wu

    Abstract: In the realm of fashion design, sketches serve as the canvas for expressing an artist's distinctive drawing style and creative vision, capturing intricate details like stroke variations and texture nuances. The advent of sketch-to-image cross-modal translation technology has notably aided designers. However, existing methods often compromise these sketch details during image generation, resulting…

    Submitted 13 August, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 10 pages, 8 figures

  35. arXiv:2403.07698  [pdf, ps, other]

    math.DG

    The Kazdan-Warner problem on compact Kähler surfaces

    Authors: Weike Yu

    Abstract: In this paper, we investigate a Kazdan-Warner problem on compact Kähler surfaces with negative Gauduchon degree, which corresponds to prescribing sign-changing Chern scalar curvatures. By the method of our recent paper [J. Funct. Anal. 285 (2023): 109948], we establish a Chen-Li type existence theorem on compact Kähler surfaces when the candidate curvature function is of negative average. Moreover,…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 12 pages

    MSC Class: 32Q15; 35J60

  36. arXiv:2403.05262  [pdf, other]

    cs.CV

    Debiasing Multimodal Large Language Models

    Authors: Yi-Fan Zhang, Weichen Yu, Qingsong Wen, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan

    Abstract: In the realms of computer vision and natural language processing, Large Vision-Language Models (LVLMs) have become indispensable tools, proficient in generating textual descriptions based on visual inputs. Despite their advancements, our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior ra…

    Submitted 27 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: 38 pages, 17 figures

  37. arXiv:2403.01714  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Molecular intercalation in the van der Waals antiferromagnets FePS3 and NiPS3

    Authors: Cong Li, Ze Hu, Xiaofei Hou, Sheng Xu, Zhanlong Wu, Kefan Du, Shuo Li, Xiaoyu Xu, Ying Chen, Zeyu Wang, Tiancheng Mu, Tian-Long Xia, Yanfeng Guo, B. Normand, Weiqiang Yu, Yi Cui

    Abstract: We have performed electrochemical treatment of the van der Waals antiferromagnetic materials FePS$_3$ and NiPS$_3$ with the ionic liquid EMIM-BF$_4$, achieving significant molecular intercalation. Mass analysis of the intercalated compounds, EMIM$_x$-FePS$_3$ and EMIM$_x$-NiPS$_3$, indicated respective intercalation levels, $x$, of approximately 27\% and 37\%, and X-ray diffraction measurements de…

    Submitted 3 March, 2024; originally announced March 2024.

    Journal ref: Physical Review B 109, 184407 (2024)

  38. arXiv:2403.01457  [pdf, other]

    cs.IR cs.CL

    Logic Rules as Explanations for Legal Case Retrieval

    Authors: Zhongxiang Sun, Kepu Zhang, Weijie Yu, Haoyu Wang, Jun Xu

    Abstract: In this paper, we address the issue of using logic rules to explain the results from legal case retrieval. The task is critical to legal case retrieval because the users (e.g., lawyers or judges) are highly specialized and require the system to provide logical, faithful, and interpretable explanations before making legal decisions. Recently, research efforts have been made to learn explainable leg…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024

  39. arXiv:2403.00134  [pdf, other]

    cs.IT eess.SP

    Active Sensing for Reciprocal MIMO Channels

    Authors: Tao Jiang, Wei Yu

    Abstract: This paper addresses the design of transmit precoder and receive combiner matrices to support $N_{\rm s}$ independent data streams over a time-division duplex (TDD) point-to-point massive multiple-input multiple-output (MIMO) channel with either a fully digital or a hybrid structure. The optimal precoder and combiner design amounts to finding the top-$N_{\rm s}$ singular vectors of the channel mat…

    Submitted 6 June, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: This paper is accepted in IEEE Transactions on Signal Processing

  40. arXiv:2402.19385  [pdf, other]

    cs.RO cs.CV

    Towards Safe and Reliable Autonomous Driving: Dynamic Occupancy Set Prediction

    Authors: Wenbo Shao, Jiahui Xu, Wenhao Yu, Jun Li, Hong Wang

    Abstract: In the rapidly evolving field of autonomous driving, reliable prediction is pivotal for vehicular safety. However, trajectory predictions often deviate from actual paths, particularly in complex and challenging environments, leading to significant errors. To address this issue, our study introduces a novel method for Dynamic Occupancy Set (DOS) prediction, it effectively combines advanced trajecto…

    Submitted 2 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE IV 2024

  41. First simultaneous measurement of differential muon-neutrino charged-current cross sections on argon for final states with and without protons using MicroBooNE data

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, G. Barr, D. Barrow, J. Barrow, V. Basque, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book, M. B. Brunetti, L. Camilleri, et al. (163 additional authors not shown)

    Abstract: We report the first double-differential neutrino-argon cross section measurement made simultaneously for final states with and without protons for the inclusive muon neutrino charged-current interaction channel. The proton kinematics of this channel are further explored with a differential cross section measurement as a function of the leading proton's kinetic energy that extends across the detect…

    Submitted 27 July, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Report number: FERMILAB-PUB-24-0045

    Journal ref: Phys. Rev. Lett. 133, 041801 (2024)

  42. Inclusive cross section measurements in final states with and without protons for charged-current $\nu_\mu$-Ar scattering in MicroBooNE

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, G. Barr, D. Barrow, J. Barrow, V. Basque, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book, M. B. Brunetti, L. Camilleri, et al. (164 additional authors not shown)

    Abstract: A detailed understanding of inclusive muon neutrino charged-current interactions on argon is crucial to the study of neutrino oscillations in current and future experiments using liquid argon time projection chambers. To that end, we report a comprehensive set of differential cross section measurements for this channel that simultaneously probe the leptonic and hadronic systems by dividing the cha…

    Submitted 27 July, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Report number: FERMILAB-PUB-24-0044

    Journal ref: Phys. Rev. D 110, 013006 (2024)

  43. arXiv:2402.19173  [pdf, other]

    cs.SE cs.AI

    StarCoder 2 and The Stack v2: The Next Generation

    Authors: Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, et al. (41 additional authors not shown)

    Abstract: The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data…

    Submitted 29 February, 2024; originally announced February 2024.

  44. Dual-Context Aggregation for Universal Image Matting

    Authors: Qinglin Liu, Xiaoqian Lv, Wei Yu, Changyong Guo, Shengping Zhang

    Abstract: Natural image matting aims to estimate the alpha matte of the foreground from a given image. Various approaches have been explored to address this problem, such as interactive matting methods that use guidance such as click or trimap, and automatic matting methods tailored to specific objects. However, existing matting methods are designed for specific objects or guidance, neglecting the common re…

    Submitted 28 February, 2024; originally announced February 2024.

    Journal ref: Multimed Tools Appl (2023)

  45. Semiclassical approach to spin dynamics of a ferromagnetic S=1 chain

    Authors: Chengchen Li, Yi Cui, Weiqiang Yu, Rong Yu

    Abstract: Motivated by recent experimental progress in the quasi-one-dimensional quantum magnet NiNb$_2$O$_6$, we study the spin dynamics of an S=1 ferromagnetic Heisenberg chain with single-ion anisotropy by using a semiclassical molecular dynamics approach. This system undergoes a quantum phase transition from a ferromagnetic to a paramagnetic state under a transverse magnetic field, and the magnetic resp…

    Submitted 31 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Journal ref: Chinese Phys. B 33 (2024) 067501

  46. arXiv:2402.15365  [pdf, other]

    stat.ML cs.LG

    Efficient semi-supervised inference for logistic regression under case-control studies

    Authors: Zhuojun Quan, Yuanyuan Lin, Kani Chen, Wen Yu

    Abstract: Semi-supervised learning has received increasing attention in statistics and machine learning. In semi-supervised learning settings, a labeled data set with both outcomes and covariates and an unlabeled data set with covariates only are collected. We consider an inference problem in semi-supervised settings where the outcome in the labeled data is binary and the labeled data is collected by case…

    Submitted 23 February, 2024; originally announced February 2024.

  47. arXiv:2402.14308  [pdf, other]

    cs.RO

    Ground-Fusion: A Low-cost Ground SLAM System Robust to Corner Cases

    Authors: Jie Yin, Ang Li, Wei Xi, Wenxian Yu, Danping Zou

    Abstract: We introduce Ground-Fusion, a low-cost sensor fusion simultaneous localization and mapping (SLAM) system for ground vehicles. Our system features efficient initialization, effective sensor anomaly detection and handling, real-time dense color mapping, and robust localization in diverse environments. We tightly integrate RGB-D images, inertial measurements, wheel odometer and GNSS signals within a…

    Submitted 22 February, 2024; originally announced February 2024.

  48. arXiv:2402.12084  [pdf, other]

    astro-ph.HE

    Simultaneous multi-wavelength observations of the repeating fast radio burst FRB 20190520B with Swift and FAST

    Authors: Zhen Yan, Wenfei Yu, Kim L. Page, Jie Lin, Di Li, Chenhui Niu, Casey Law, Bing Zhang, Shami Chatterjee, Xian Zhang, Reshma Anna-Thomas

    Abstract: Fast radio bursts (FRBs) are bright, millisecond-duration radio bursts of cosmic origin. There have been several dozen FRBs found to repeat. Among them, those precisely localized provide the best opportunity to probe their multi-wavelength counterparts, local environment, and host galaxy that would reveal their origins. Here we report our X-ray, ultraviolet (UV) and optical observations with the…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 12 pages, 7 figures, submitted to ApJ

  49. Hybrid Online-Offline Learning for Task Offloading in Mobile Edge Computing Systems

    Authors: Muhammad Sohaib, Sang-Woon Jeon, Wei Yu

    Abstract: We consider a multi-user multi-server mobile edge computing (MEC) system, in which users arrive on a network randomly over time and generate computation tasks, which will be computed either locally on their own computing devices or be offloaded to one of the MEC servers. Under such a dynamic network environment, we propose a novel task offloading policy based on hybrid online-offline learning, whi…

    Submitted 27 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Wireless Communications

    Journal ref: IEEE Transactions on Wireless Communications (2023)

  50. arXiv:2402.11450  [pdf, other]

    cs.RO

    Learning to Learn Faster from Human Feedback with Language Model Predictive Control

    Authors: Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, et al. (25 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for o…

    Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.