Showing 101–150 of 1,330 results for author: He, H

  1. arXiv:2404.03384  [pdf, other]

    cs.CV

    LongVLM: Efficient Long Video Understanding via Large Language Models

    Authors: Yuetian Weng, Mingfei Han, Haoyu He, Xiaojun Chang, Bohan Zhuang

    Abstract: Empowered by Large Language Models (LLMs), recent advancements in Video-based LLMs (VideoLLMs) have driven progress in various video understanding tasks. These models encode video representations through pooling or query aggregation over a vast number of visual tokens, making computational and memory costs affordable. Despite successfully providing an overall comprehension of video content, existi…

    Submitted 20 July, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted by ECCV 2024

  2. arXiv:2404.03176  [pdf, other]

    cs.LG cs.IT

    Information-Theoretic Generalization Bounds for Deep Neural Networks

    Authors: Haiyun He, Christina Lee Yu, Ziv Goldfeld

    Abstract: Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications. This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds. We first derive two hierarchical bounds on the generalization error in terms of the Kullback-Leibler (KL) divergence or the 1-Wasserstein distance between the tra…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 25 pages, 5 figures

  3. arXiv:2404.02101  [pdf, other]

    cs.CV

    CameraCtrl: Enabling Camera Control for Text-to-Video Generation

    Authors: Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang

    Abstract: Controllability plays a crucial role in video generation since it allows users to create desired content. However, existing models have largely overlooked the precise control of camera pose that serves as a cinematic language to express deeper narrative nuances. To alleviate this issue, we introduce CameraCtrl, enabling accurate camera pose control for text-to-video (T2V) models. After precisely paramet…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Project page: https://hehao13.github.io/projects-CameraCtrl/ Code: https://github.com/hehao13/CameraCtrl

  4. arXiv:2404.01664  [pdf, other]

    physics.soc-ph nlin.AO nlin.PS physics.bio-ph

    Nonreciprocal interactions in crowd dynamics: investigating the impact of moving threats on pedestrian speed preferences

    Authors: Shaocong Xie, Rui Ye, Xiaolian Li, Zhongyi Huang, Shuchao Cao, Wei Lv, Hong He, Ping Zhang, Zhiming Fang, Jun Zhang, Weiguo Song

    Abstract: Nonreciprocal interaction crowd systems, such as human-human, human-vehicle, and human-robot systems, often have serious impacts on pedestrian safety and social order. A more comprehensive understanding of these systems is needed to optimize system stability and efficiency. Despite the importance of these interactions, empirical research in this area remains limited. Thus, in our study we explore…

    Submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2404.01453  [pdf, other]

    cs.CL cs.AI

    Unveiling Divergent Inductive Biases of LLMs on Temporal Data

    Authors: Sindhu Kishore, Hangfeng He

    Abstract: Unraveling the intricate details of events in natural language necessitates a subtle understanding of temporal dynamics. Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific…

    Submitted 1 April, 2024; originally announced April 2024.

  6. arXiv:2404.00309  [pdf, other]

    cs.IT eess.SP

    Model-Driven Deep Learning for Distributed Detection with Binary Quantization

    Authors: Wei Guo, Meng He, Chuan Huang, Hengtao He, Shenghui Song, Jun Zhang, Khaled B. Letaief

    Abstract: Within the realm of rapidly advancing wireless sensor networks (WSNs), distributed detection assumes a significant role in various practical applications. However, a critical challenge lies in maintaining robust detection performance while operating within the constraints of limited bandwidth and energy resources. This paper introduces a novel approach that combines model-driven deep learning (DL) w…

    Submitted 30 March, 2024; originally announced April 2024.

  7. arXiv:2404.00246  [pdf, other]

    cs.CL cs.AI cs.HC

    Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World

    Authors: Guande Wu, Chen Zhao, Claudio Silva, He He

    Abstract: Language agents that interact with the world on their own have great potential for automating digital tasks. While large language model (LLM) agents have made progress in understanding and executing tasks such as textual games and webpage control, many real-world tasks also require collaboration with humans or other LLMs in equal roles, which involves intent understanding, task coordination, and c…

    Submitted 30 March, 2024; originally announced April 2024.

  8. arXiv:2404.00057  [pdf, other]

    cs.HC cs.AI cs.CR cs.OS

    PerOS: Personalized Self-Adapting Operating Systems in the Cloud

    Authors: Hongyu Hè

    Abstract: Operating systems (OSes) are foundational to computer systems, managing hardware resources and ensuring secure environments for diverse applications. However, despite their enduring importance, the fundamental design objectives of OSes have seen minimal evolution over decades. Traditionally prioritizing aspects like speed, memory efficiency, security, and scalability, these objectives often overlo…

    Submitted 26 March, 2024; originally announced April 2024.

    Comments: 29 pages, 3 figures

  9. arXiv:2403.15172  [pdf, other]

    astro-ph.CO astro-ph.GA astro-ph.HE

    Magnetically arrested disks in FR I radio galaxies

    Authors: Han He, Bei You, Ning Jiang, Xinwu Cao, Jingfu Hu, Zhenfeng Sheng, Su Yao, Bozena Czerny

    Abstract: A sample of 17 FR I radio galaxies constructed from the 3CR catalog, which is characterized by edge-darkened radio structures, is studied. The optical core luminosities derived from Hubble Space Telescope observations are used to estimate the Eddington ratios, which are found to be below $10^{-3.4}$ for this sample. This is supported by the Baldwin-Phillips-Terlevich optical diagnostic diagrams deri…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 10 pages, 10 figures, 3 tables, Accepted for publication in MNRAS

  10. arXiv:2403.14961  [pdf, ps, other]

    math.NA

    Anderson Acceleration with Truncated Gram-Schmidt

    Authors: Ziyuan Tang, Tianshi Xu, Huan He, Yousef Saad, Yuanzhe Xi

    Abstract: Anderson Acceleration (AA) is a popular algorithm designed to enhance the convergence of fixed-point iterations. In this paper, we introduce a variant of AA based on a Truncated Gram-Schmidt process (AATGS) which has a few advantages over the classical AA. In particular, an attractive feature of AATGS is that its iterates obey a three-term recurrence in the situation when it is applied to solving…

    Submitted 16 July, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    MSC Class: 65F10; 68W25; 65B99; 65N22

  11. arXiv:2403.13250  [pdf, other]

    cs.CL

    Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models

    Authors: Huachuan Qiu, Shuai Zhang, Hongliang He, Anqi Li, Zhenzhong Lan

    Abstract: Pornographic content occurring in human-machine interaction dialogues can cause severe side effects for users in open-domain dialogue systems. However, detecting pornographic language within human-machine interaction dialogues is an important subject that is rarely studied. To advance in this direction, we introduce CensorChat, a dialogue monitoring dataset aimed at detecting whether t…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted to CSCWD 2024 (27th International Conference on Computer Supported Cooperative Work in Design). arXiv admin note: text overlap with arXiv:2309.09749

  12. arXiv:2403.11832  [pdf, other]

    astro-ph.HE hep-ph

    Precise measurement of the cosmic-ray spectrum and $\left \langle \ln A \right \rangle$ by LHAASO -- connecting the Galactic to the extragalactic components

    Authors: Xing-Jian Lv, Xiao-Jun Bi, Kun Fang, Yi-Qing Guo, Hui-Hai He, Ling-Ling Ma, Peng-Fei Yin, Qiang Yuan, Meng-Jie Zhao

    Abstract: Recently, the LHAASO Collaboration reported precise measurements of the cosmic-ray (CR) all-particle energy spectrum and mean logarithmic mass $\left \langle \ln A \right \rangle$ from 0.3 PeV to 30 PeV. Combining the CR measurements by AMS-02 and DAMPE in space and those by LHAASO and Auger on the ground, we construct a model to recover all these measurements from tens of GeV to tens of EeV. We find the LHAAS…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 11 pages, 2 figures, 4 tables

  13. Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

    Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, et al. (256 additional authors not shown)

    Abstract: We present measurements of the all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, based on a nearly composition-independent energy reconstruction method that achieves unprecedented accuracy. Our analysis reveals the position of the knee at…

    Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures

    Journal ref: Physical Review Letters 132, 131002 (2024)

  14. arXiv:2403.09611  [pdf, other]

    cs.CV cs.CL cs.LG

    MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

    Authors: Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, et al. (7 additional authors not shown)

    Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la…

    Submitted 18 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  15. Field test of mode-pairing quantum key distribution

    Authors: Hao-Tao Zhu, Yizhi Huang, Wen-Xin Pan, Chao-Wu Zhou, Jianjun Tang, Hong He, Ming Cheng, Xiandu Jin, Mi Zou, Shibiao Tang, Xiongfeng Ma, Teng-Yun Chen, Jian-Wei Pan

    Abstract: Quantum key distribution is a cornerstone of quantum technology, offering information-theoretical secure keys for remote parties. With many quantum communication networks established globally, the mode-pairing protocol stands out for its efficacy over inter-city distances using simple setups, emerging as a promising solution. In this study, we employ the mode-pairing scheme into existing inter-cit…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 15 pages, 5 figures, 6 tables

    Journal ref: Optica 11, 883-888 (2024)

  16. MCFEND: A Multi-source Benchmark Dataset for Chinese Fake News Detection

    Authors: Yupeng Li, Haorui He, Jin Bai, Dacheng Wen

    Abstract: The prevalence of fake news across various online sources has had a significant influence on the public. Existing Chinese fake news detection datasets are limited to news sourced solely from Weibo. However, fake news originating from multiple sources exhibits diversity in various aspects, including its content and social context. Methods trained on a single news source can hardly be appli…

    Submitted 24 July, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by the ACM Web Conference 2024 (WWW 2024) oral, dataset available: https://github.com/TrustworthyComp

  17. arXiv:2403.07837  [pdf, other]

    physics.optics

    Topological Protection of Optical Skyrmions through Complex Media

    Authors: An Aloysius Wang, Zimo Zhao, Yifei Ma, Yuxi Cai, Runchen Zhang, Xiaoyi Shang, Yunqi Zhang, Ji Qin, Zhi Kai Pong, Tade Marozsak, Binguo Chen, Honghui He, Lin Luo, Martin J Booth, Steve J Elston, Stephen M Morris, Chao He

    Abstract: Optical Skyrmions have many important properties that make them ideal units for high-density data applications, including the ability to carry digital information through a discrete topological number and the independence of spatially varying polarization to other dimensions. More importantly, the topological nature of the optical Skyrmion heuristically suggests a strong degree of robustness to pe…

    Submitted 6 August, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  18. arXiv:2403.07581  [pdf, other]

    cs.CL

    LLMvsSmall Model? Large Language Model Based Text Augmentation Enhanced Personality Detection Model

    Authors: Linmei Hu, Hongyu He, Duokang Wang, Ziwang Zhao, Yingxia Shao, Liqiang Nie

    Abstract: Personality detection aims to detect one's personality traits underlying social media posts. One challenge of this task is the scarcity of ground-truth personality traits, which are collected from self-report questionnaires. Most existing methods learn post features directly by fine-tuning the pre-trained language models under the supervision of limited personality labels. This leads to inferior…

    Submitted 12 March, 2024; originally announced March 2024.

  19. arXiv:2403.03128  [pdf, other]

    hep-ph astro-ph.CO hep-ex

    Probing Light Inelastic Dark Matter from Direct Detection

    Authors: Hong-Jian He, Yu-Chen Wang, Jiaming Zheng

    Abstract: Different dark matter (DM) candidates could have different types of DM-lepton and/or DM-quark interactions. For direct detection experiments, this leads to diversity in the recoil spectra, where both DM-electron and DM-nucleus scatterings may contribute. Furthermore, kinematic effects such as those of the inelastic scattering can also play an important role in shaping the recoil spectra. In this w…

    Submitted 11 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 30 pages, refined version, references added

  20. arXiv:2402.17555  [pdf, other]

    cs.CV

    Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label

    Authors: Xinliang Zhang, Lei Zhu, Hangzhou He, Lujia Jin, Yanye Lu

    Abstract: Scribble-based weakly-supervised semantic segmentation using sparse scribble supervision is gaining traction as it reduces annotation costs when compared to fully annotated alternatives. Existing methods primarily generate pseudo-labels by diffusing labeled pixels to unlabeled ones with local cues for supervision. However, this diffusion process fails to exploit global semantics and class-specific…

    Submitted 27 February, 2024; originally announced February 2024.

  21. arXiv:2402.17352  [pdf, other]

    astro-ph.HE astro-ph.GA

    Search for neutrino emission from the Cygnus Bubble based on LHAASO $γ$-ray observations

    Authors: Wenlian Li, Tian-Qi Huang, Donglian Xu, Huihai He

    Abstract: The Cygnus region, which contains massive molecular and atomic clouds and young stars, is a promising Galactic neutrino source candidate. Cosmic-ray transport in the region can produce neutrinos and $γ$-rays. Recently, the Large High Altitude Air Shower Observatory (LHAASO) detected an ultrahigh-energy $γ$-ray bubble (Cygnus Bubble) in this region. Using publicly available track events detected b…

    Submitted 27 February, 2024; originally announced February 2024.

  22. arXiv:2402.16901  [pdf, other]

    q-bio.GN cs.AI cs.LG

    FGBERT: Function-Driven Pre-trained Gene Language Model for Metagenomics

    Authors: ChenRui Duan, Zelin Zang, Yongjie Xu, Hang He, Zihan Liu, Zijia Song, Ju-Sheng Zheng, Stan Z. Li

    Abstract: Metagenomic data, comprising mixed multi-species genomes, are prevalent in diverse environments like oceans and soils, significantly impacting human health and ecological functions. However, current research relies on K-mer representations, limiting the capture of structurally relevant gene contexts. To address these limitations and further our understanding of complex relationships between metage…

    Submitted 24 February, 2024; originally announced February 2024.

  23. arXiv:2402.16557  [pdf, other]

    math.NA

    A randomized algorithm for simultaneously diagonalizing symmetric matrices by congruence

    Authors: Haoze He, Daniel Kressner

    Abstract: A family of symmetric matrices $A_1,\ldots, A_d$ is SDC (simultaneous diagonalization by congruence, also called non-orthogonal joint diagonalization) if there is an invertible matrix $X$ such that every $X^T A_k X$ is diagonal. In this work, a novel randomized SDC (RSDC) algorithm is proposed that reduces SDC to a generalized eigenvalue problem by considering two (random) linear combinations of t…

    Submitted 15 August, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    MSC Class: 65F15; 65F30; 68W20; 15A22; 15A27

  24. arXiv:2402.15764  [pdf, other]

    cs.CL cs.AI

    Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models

    Authors: Haoran Liao, Jidong Tian, Shaohua Hu, Hao He, Yaohui Jin

    Abstract: Large language models (LLMs) still grapple with complex tasks like mathematical reasoning. Despite significant efforts invested in improving prefix prompts or the reasoning process, the crucial role of problem context might have been neglected. Accurate recognition of inputs is fundamental for solving mathematical tasks, as ill-formed problems could potentially mislead an LLM's reasoning. In this study,…

    Submitted 26 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  25. arXiv:2402.15152  [pdf, other]

    cs.LG cs.AI cs.CR math.OC

    On the Duality Between Sharpness-Aware Minimization and Adversarial Training

    Authors: Yihao Zhang, Hangzhou He, Jingyu Zhu, Huanran Chen, Yifei Wang, Zeming Wei

    Abstract: Adversarial Training (AT), which adversarially perturbs the input samples during training, has been acknowledged as one of the most effective defenses against adversarial attacks, yet suffers from inevitably decreased clean accuracy. Instead of perturbing the samples, Sharpness-Aware Minimization (SAM) perturbs the model weights during training to find a flatter loss landscape and improve general…

    Submitted 5 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  26. arXiv:2402.15134  [pdf, other]

    cs.LG cs.AI

    Deep Coupling Network For Multivariate Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Hui He, Kaize Shi, Liang Hu, Ning An, Zhendong Niu

    Abstract: Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between…

    Submitted 23 February, 2024; originally announced February 2024.

  27. arXiv:2402.14407  [pdf, other]

    cs.LG cs.CV cs.RO

    Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning

    Authors: Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li

    Abstract: Learning a generalist embodied agent capable of completing multiple tasks poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets. In contrast, a vast amount of human videos exist, capturing intricate tasks and interactions with the physical world. Promising prospects arise for utilizing actionless human videos for pre-training and transferring the knowledge to fa…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 21 pages

  28. arXiv:2402.14036  [pdf, other]

    quant-ph math.CO

    Quantum Annealing and Graph Neural Networks for Solving TSP with QUBO

    Authors: Haoqi He

    Abstract: This paper explores the application of Quadratic Unconstrained Binary Optimization (QUBO) models in solving the Travelling Salesman Problem (TSP) through Quantum Annealing algorithms and Graph Neural Networks. Quantum Annealing (QA), a quantum-inspired optimization method that exploits quantum tunneling to escape local minima, is used to solve QUBO formulations of TSP instances on Coherent Ising M…

    Submitted 21 February, 2024; originally announced February 2024.

  29. Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing

    Authors: Run Yang, Hui He, Weizhe Zhang

    Abstract: Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud can reduce computing latency and backhaul load simultaneously. However, new challenges incurred by user mobility and limited coverage of MEC server service arise. Services should be dynamically migrated…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 13 pages, 9 figures

    Journal ref: Wireless Communications and Mobile Computing, 2021

  30. arXiv:2402.13471  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Thermal transport in a 2D amorphous material

    Authors: Yuxi Wang, Xingxing Zhang, Wujuan Yan, Nianjie Liang, Haiyu He, Xinwei Tao, Ang Li, Fuwei Yang, Buxuan Li, Te-Huan Liu, Jia Zhu, Wu Zhou, Wei Wang, Lin Zhou, Bai Song

    Abstract: Two-dimensional (2D) crystals proved revolutionary soon after graphene was discovered in 2004. However, 2D amorphous materials only became accessible in 2020 and remain largely unexplored. In particular, the thermophysical properties of amorphous materials are of great interest upon transition from 3D to 2D. Here, we probe thermal transport in 2D amorphous carbon. A cross-plane thermal conductivit…

    Submitted 22 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  31. arXiv:2402.12749  [pdf]

    cs.CL cs.AI

    Me LLaMA: Foundation Large Language Models for Medical Applications

    Authors: Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

    Abstract: Recent advancements in large language models (LLMs) such as ChatGPT and LLaMA have hinted at their potential to revolutionize medical applications, yet their application in clinical settings often reveals limitations due to a lack of specialized training on medical-specific data. In response to this challenge, this study introduces Me-LLaMA, a novel medical LLM family that includes foundation mode…

    Submitted 11 April, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 21 pages, 3 figures, 8 tables

  32. arXiv:2402.12530  [pdf, other]

    cs.CL cs.AI cs.LG

    Parallel Structures in Pre-training Data Yield In-Context Learning

    Authors: Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He

    Abstract: Pre-trained language models (LMs) are capable of in-context learning (ICL): they can adapt to a task with only a few examples given in the prompt without any parameter update. However, it is unclear where this capability comes from as there is a stark distribution shift between pre-training text and ICL prompts. In this work, we study what patterns of the pre-training data contribute to ICL. We fi…

    Submitted 19 February, 2024; originally announced February 2024.

  33. arXiv:2402.10071  [pdf, other]

    eess.SP cs.IT

    Approximate Message Passing-Enhanced Graph Neural Network for OTFS Data Detection

    Authors: Wenhao Zhuang, Yuyi Mao, Hengtao He, Lei Xie, Shenghui Song, Yao Ge, Zhi Ding

    Abstract: Orthogonal time frequency space (OTFS) modulation has emerged as a promising solution to support high-mobility wireless communications, for which cost-effective data detectors are critical. Although graph neural network (GNN)-based data detectors can achieve decent detection accuracy at reasonable computational cost, they fail to best harness prior information of transmitted data. To further mini…

    Submitted 14 April, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 8 pages, 7 figures, and 3 tables. Part of this article was submitted to IEEE for possible publication

  34. arXiv:2402.08994  [pdf, other]

    cs.CV cs.AI

    CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding

    Authors: Qiongyi Zhou, Changde Du, Shengpei Wang, Huiguang He

    Abstract: The study of decoding visual neural information faces challenges in generalizing single-subject decoding models to multiple subjects, due to individual differences. Moreover, the limited availability of data from a single subject has a constraining impact on model performance. Although prior multi-subject decoding methods have made significant progress, they still suffer from several limitations,…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR2024

  35. arXiv:2402.01795  [pdf, other]

    eess.SY cs.LG cs.RO cs.SE

    Few-Shot Scenario Testing for Autonomous Vehicles Based on Neighborhood Coverage and Similarity

    Authors: Shu Li, Jingxuan Yang, Honglin He, Yi Zhang, Jianming Hu, Shuo Feng

    Abstract: Testing and evaluating the safety performance of autonomous vehicles (AVs) is essential before large-scale deployment. In practice, the number of testing scenarios permissible for a specific AV is severely limited by tight constraints on testing budgets and time. Given the strictly limited number of tests, existing testing methods often lead to significant uncertaint…

    Submitted 22 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  36. arXiv:2401.16534  [pdf, other]

    cs.GR cs.CV

    Democratizing the Creation of Animatable Facial Avatars

    Authors: Yilin Zhu, Dalton Omens, Haodi He, Ron Fedkiw

    Abstract: In high-end visual effects pipelines, a customized (and expensive) light stage system is (typically) used to scan an actor in order to acquire both geometry and texture for various expressions. Aiming towards democratization, we propose a novel pipeline for obtaining geometry and texture as well as enough expression information to build a customized person-specific animation rig without using a li…

    Submitted 29 January, 2024; originally announced January 2024.

  37. arXiv:2401.16476  [pdf, other]

    astro-ph.GA

    Unraveling the Mystery of the Low CO-to-H$_2$ Conversion Factor in Starburst Galaxies: RADEX Modeling of the Antennae

    Authors: Hao He, Christine D. Wilson, Jiayi Sun, Yu-Hsuan Teng, Erik Rosolowsky, Ashley R. Bemis

    Abstract: CO emission has been widely used as a tracer of molecular gas mass. However, it is a long-standing issue to accurately constrain the CO-to-H$_2$ conversion factor ($α_{\mathrm{CO}}$) that converts CO luminosity to molecular gas mass, especially in starburst galaxies. We present the first resolved $α_{\mathrm{CO}}$ modeling results with multiple ALMA CO and $^{13}$CO transition observations at both…

    Submitted 9 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 22 pages and 16 figures in the main text; accepted to ApJ

  38. arXiv:2401.14984  [pdf, ps, other]

    math.RT

    Projection of Elliptic Orbits and Branching Laws

    Authors: Hongyu He

    Abstract: Let $G$ be a Lie group, and $H\subset G$ a closed subgroup. Let $π$ be an irreducible unitary representation of $G$. In this paper, we briefly discuss the orbit method and its application to the branching problem $π|_{H}$. We use the Gan-Gross-Prasad branching law for $(G, H)= ( U(p,q), U(p, q-1) )$ as an example to illustrate the relation between $\pro_{\f u(p, q-1)}^{\f u(p,q)} \mc O(λ)$ and the…

    Submitted 26 January, 2024; originally announced January 2024.

  39. arXiv:2401.14453  [pdf, other]

    astro-ph.GA

    Hidden Gems on a Ring: Infant Massive Clusters and Their Formation Timeline Unveiled by ALMA, HST, and JWST in NGC 3351

    Authors: Jiayi Sun, Hao He, Kyle Batschkun, Rebecca C. Levy, Kimberly Emig, M. Jimena Rodriguez, Hamid Hassani, Adam K. Leroy, Eva Schinnerer, Eve C. Ostriker, Christine D. Wilson, Alberto D. Bolatto, Elisabeth A. C. Mills, Erik Rosolowsky, Janice C. Lee, Daniel A. Dale, Kirsten L. Larson, David A. Thilker, Leonardo Ubeda, Bradley C. Whitmore, Thomas G. Williams, Ashley. T. Barnes, Frank Bigiel, Melanie Chevance, Simon C. O. Glover, et al. (16 additional authors not shown)

    Abstract: We study young massive clusters (YMCs) in their embedded "infant" phase with $\sim0.\!^{\prime\prime}1$ ALMA, HST, and JWST observations targeting the central starburst ring in NGC 3351, a nearby Milky Way analog galaxy. Our new ALMA data reveal 18 bright and compact (sub-)millimeter continuum sources, of which 8 have counterparts in JWST images and only 6 have counterparts in HST images. Based on…

    Submitted 10 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: 27 pages, 12 figures; ApJ accepted

  40. arXiv:2401.14039  [pdf, other]

    cond-mat.mtrl-sci

    Threshold displacement energy map of Frenkel pair generation in $\rm Ga_2O_3$ from machine-learning-driven molecular dynamics simulations

    Authors: Huan He, Junlei Zhao, Jesper Byggmästar, Ru He, Kai Nordlund, Chaohui He, Flyura Djurabekova

    Abstract: $β$ phase gallium oxide ($β$-$\rm Ga_2O_3…

    Submitted 28 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  41. arXiv:2401.13986  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

    Authors: Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao

    Abstract: Large language models (LLMs) often generate convincing, fluent explanations. However, different from humans, they often generate inconsistent explanations on different inputs. For example, an LLM may generate the explanation "all birds can fly" when answering the question "Can sparrows fly?" but meanwhile answer "no" to the related question "Can penguins fly?". Explanations should be consistent ac…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2307.08678

  42. arXiv:2401.13919  [pdf, other]

    cs.CL cs.AI

    WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

    Authors: Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu

    Abstract: The rapid advancement of large language models (LLMs) has led to a new era marked by the development of autonomous applications in real-world scenarios, which drives innovation in creating advanced web agents. Existing web agents typically only handle one input modality and are evaluated only in simplified web simulators or static web snapshots, greatly limiting their applicability in real-world s…

    Submitted 6 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted to ACL 2024 (main). Code and data are released at https://github.com/MinorJerry/WebVoyager

  43. arXiv:2401.11566  [pdf, other]

    astro-ph.HE

    A detectable ultra-high-energy cosmic ray outburst from GRB 221009A

    Authors: Hao-Ning He, B. Theodore Zhang, Yi-Zhong Fan

    Abstract: Gamma-ray bursts (GRBs) have been proposed as one of the promising sources of ultra-high-energy cosmic rays (UHECRs), but observational evidence is still lacking. The nearby B.O.A.T. (brightest of all time) GRB 221009A, a once-in-1000-year event, is able to accelerate protons to $\sim 10^{3}$ EeV. Protons arriving at the Milky Way are dominated by neutron-decay-induced protons. The inter-galactic mag…

    Submitted 21 January, 2024; originally announced January 2024.

  44. arXiv:2401.11155  [pdf, other]

    cs.IT

    Deep Learning-Based Adaptive Joint Source-Channel Coding using Hypernetworks

    Authors: Songjie Xie, Hengtao He, Hongru Li, Shenghui Song, Jun Zhang, Ying-Jun Angela Zhang, Khaled B. Letaief

    Abstract: Deep learning-based joint source-channel coding (DJSCC) is expected to be a key technique for the next-generation wireless networks. However, the existing DJSCC schemes still face the challenge of channel adaptability as they are typically trained under specific channel conditions. In this paper, we propose a generic framework for channel-adaptive DJSCC by utilizing hypernetworks. To tailor the…

    Submitted 20 January, 2024; originally announced January 2024.

  45. arXiv:2401.07080  [pdf, other]

    cs.CV

    GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching

    Authors: Haibin He, Maoyuan Ye, Jing Zhang, Juhua Liu, Dacheng Tao

    Abstract: Beyond the text detection and recognition tasks in image text spotting, video text spotting presents an augmented challenge with the inclusion of tracking. While advanced end-to-end trainable methods have shown commendable performance, the pursuit of multi-task optimization may pose the risk of producing sub-optimal outcomes for individual tasks. In this paper, we highlight a main bottleneck in th…

    Submitted 13 January, 2024; originally announced January 2024.

  46. arXiv:2401.06340  [pdf, other]

    cs.HC cs.AI

    A Temporal-Spectral Fusion Transformer with Subject-Specific Adapter for Enhancing RSVP-BCI Decoding

    Authors: Xujin Li, Wei Wei, Shuang Qiu, Huiguang He

    Abstract: The Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interface (BCI) is an efficient technology for target retrieval using electroencephalography (EEG) signals. The performance improvement of traditional decoding methods relies on a substantial amount of training data from new test subjects, which increases preparation time for BCI systems. Several studies introduce data from existing…

    Submitted 11 July, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 19 pages, 10 figures

    MSC Class: 68T07; ACM Class: I.5.4

  47. arXiv:2401.04741  [pdf, other]

    cs.LG

    Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k

    Authors: Yuanchi Ma, Hui He, Zhongxiang Lei, Zhendong Niu

    Abstract: Graph clustering algorithms with autoencoder structures have recently gained popularity due to their efficient performance and low training cost. However, existing graph autoencoder clustering algorithms based on GCN or GAT not only lack good generalization ability, but the number of clusters produced by such autoencoder models is also difficult to determine automatically. To solve t…

    Submitted 9 January, 2024; originally announced January 2024.

  48. arXiv:2401.04502  [pdf]

    cond-mat.mes-hall

    Observation of Higher Order Nodal Line Semimetal in Phononic Crystals

    Authors: Qiyun Ma, Zhenhang Pu, Liping Ye, Jiuyang Lu, Xueqin Huang, Manzhu Ke, Hailong He, Weiyin Deng, Zhengyou Liu

    Abstract: Higher-order topological insulators and semimetals, which generalize the conventional bulk-boundary correspondence, have attracted extensive research interest. Among them, higher-order Weyl semimetals feature two-fold linear crossing points in three-dimensional (3D) momentum space, 2D Fermi-arc surface states, and 1D hinge states. Higher-order nodal-point semimetals possessing Weyl points or Dirac…

    Submitted 12 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: accepted for publication in PRL

  49. arXiv:2401.02038  [pdf, other]

    cs.CL

    Understanding LLMs: A Comprehensive Overview from Training to Inference

    Authors: Yiheng Liu, Hao He, Tianle Han, Xu Zhang, Mengyuan Liu, Jiaming Tian, Yutong Zhang, Jiaqi Wang, Xiaohui Gao, Tianyang Zhong, Yi Pan, Shaochen Xu, Zihao Wu, Zhengliang Liu, Xin Zhang, Shu Zhang, Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge

    Abstract: The introduction of ChatGPT has led to a significant increase in the utilization of Large Language Models (LLMs) for addressing downstream tasks. There's an increasing focus on cost-efficient training and deployment within this context. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of large language model training techniques and i…

    Submitted 5 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: 30 pages, 6 figures

  50. arXiv:2401.01113  [pdf, other]

    eess.SP

    CRB Minimization for RIS-aided mmWave Integrated Sensing and Communications

    Authors: Wanting Lyu, Songjie Yang, Yue Xiu, Ya Li, Hongjun He, Chau Yuen, Zhongpei Zhang

    Abstract: In this paper, a reconfigurable intelligent surface (RIS) is employed in a millimeter wave (mmWave) integrated sensing and communications (ISAC) system. To alleviate the multi-hop attenuation, the semi-self sensing RIS approach is adopted, wherein sensors are configured at the RIS to receive the radar echo signal. Focusing on the estimation accuracy, the Cramér-Rao bound (CRB) for estimating the dir…

    Submitted 2 January, 2024; originally announced January 2024.