Showing 51–100 of 3,570 results for author: Mao, X

  1. arXiv:2407.19976  [pdf, other]

    cs.HC cs.MM

    MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion

    Authors: Chencan Fu, Yabiao Wang, Jiangning Zhang, Zhengkai Jiang, Xiaofeng Mao, Jiafu Wu, Weijian Cao, Chengjie Wang, Yanhao Ge, Yong Liu

    Abstract: Co-speech gesture generation is crucial for producing synchronized and realistic human gestures that accompany speech, enhancing the animation of lifelike avatars in virtual environments. While diffusion models have shown impressive capabilities, current approaches often overlook a wide range of modalities and their interactions, resulting in less dynamic and contextually varied gestures. To addre… ▽ More

    Submitted 28 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted by ACM MM 2024

  2. arXiv:2407.19533  [pdf, other]

    cs.GR

    FreeShell: A Context-Free 4D Printing Technique for Fabricating Complex 3D Triangle Mesh Shells

    Authors: Chao Yuan, Nan Cao, Xuejiao Ma, Shengqi Dang

    Abstract: Freeform thin-shell surfaces are critical in various fields, but their fabrication is complex and costly. Traditional methods are wasteful and require custom molds, while 3D printing needs extensive support structures and post-processing. Thermoshrinkage actuated 4D printing is an effective method through flat structures fabricating 3D shell. However, existing research faces issues related to prec… ▽ More

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: This paper includes 12 pages and 19 figures

  3. arXiv:2407.19521  [pdf, other]

    astro-ph.CO gr-qc hep-ph

    Ultra-light dark matter with non-canonical kinetics reopening the mass window

    Authors: Shiyun Lu, Amara Ilyas, Xiao-Han Ma, Bo Wang, Dongdong Zhang, Yi-Fu Cai

    Abstract: Fuzzy dark matter (FDM) with mass around $10^{-22}$ eV is viewed as a promising paradigm in understanding the structure formation of the local universe at small scales. Recent observations, however, begin to challenge FDM in return. We focus on the arguments between the solution to CDM small-scale curiosities and recent observations on matter power spectrum, and find its implication on an earlier… ▽ More

    Submitted 28 July, 2024; originally announced July 2024.

    Comments: 25 pages, 10 figures

  4. arXiv:2407.18937  [pdf]

    cs.IR cs.LG

    Advancements in Recommender Systems: A Comprehensive Analysis Based on Data, Algorithms, and Evaluation

    Authors: Xin Ma, Mingyue Li, Xuguang Liu

    Abstract: Using 286 research papers collected from Web of Science, ScienceDirect, SpringerLink, arXiv, and Google Scholar databases, a systematic review methodology was adopted to review and summarize the current challenges and potential future developments in data, algorithms, and evaluation aspects of RSs. It was found that RSs involve five major research topics, namely algorithmic improvement, domain app… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 24 pages, 10 figures, 3 tables

  5. arXiv:2407.18210  [pdf, other]

    cond-mat.soft cond-mat.dis-nn cond-mat.mtrl-sci cond-mat.stat-mech math-ph

    Statistical mechanics of frustrated assemblies and incompatible graphs

    Authors: José M. Ortiz-Tavárez, Zhen Yang, Nicholas Kotov, Xiaoming Mao

    Abstract: Geometrically frustrated assemblies where building blocks misfit have been shown to generate intriguing phenomena from self-limited growth, fiber formation, to structural complexity. We introduce a graph theory formulation of geometrically frustrated assemblies, capturing frustrated interactions through the concept of incompatible flows, providing a direct link between structural connectivity and… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 11 pages, 9 figures

  6. arXiv:2407.17736  [pdf, ps, other]

    math.AP

    $σ_k$-Yamabe measure

    Authors: Xi-Nan Ma, Wangzhe Wu

    Abstract: We found a special divergence structure for the $σ_k$-Yamabe operator and use it to get a monotonicity formula. We also get an interior $L^{\infty}$ estimate via its $L^1$ norm for the $σ_k$-Yamabe operator when $1\le k \le \frac{n}{2}$. Combining these two tools, we prove the weak continuity of the $σ_k$-Yamabe measure with respect to convergence in measure.

    Submitted 24 July, 2024; originally announced July 2024.

    MSC Class: 58C35; 28A33; 35J60

  7. arXiv:2407.17386  [pdf, other]

    astro-ph.SR astro-ph.GA

    Data-driven stellar intrinsic colors and dust reddenings for spectro-photometric data: From the blue-edge method to a machine-learning approach

    Authors: He Zhao, Shu Wang, Biwei Jiang, Jun Li, Dongwei Fan, Yi Ren, Xiaoxiao Ma

    Abstract: Intrinsic colors (ICs) of stars are essential for the studies on both stellar physics and dust reddening. In this work, we developed an XGBoost model to predict the ICs with the atmospheric parameters $T_{\rm eff}$, ${\rm log}\,g$, and $\rm [M/H]$. The model was trained and tested for three colors at Gaia and 2MASS bands with 1,040,446 low-reddening sources. The atmospheric parameters were determi… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 23 pages, 1 table, 11 figures, 2 appendices, accepted for publication in ApJ

  8. arXiv:2407.17184  [pdf, other]

    hep-ex

    Search for $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: Using $(2.712\pm0.014)\times10^{9}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII, we find an evidence of the $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay with a statistical significance of 3.1$σ$. Its decay branching fraction is measured to be $(12.24\pm4.60(\mathrm{stat.})\pm2.37(\mathrm{syst.})\pm4.68(\mathrm{extr.}))\times 10^{-4}$, where the first uncertainty is stati… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

  9. arXiv:2407.16993  [pdf, other]

    cs.CV

    LoFormer: Local Frequency Transformer for Image Deblurring

    Authors: Xintian Mao, Jiansheng Wang, Xingran Xie, Qingli Li, Yan Wang

    Abstract: Due to the computational complexity of self-attention (SA), prevalent techniques for image deblurring often resort to either adopting localized SA or employing coarse-grained global SA methods, both of which exhibit drawbacks such as compromising global modeling or lacking fine-grained correlation. In order to address this issue by effectively modeling long-range dependencies without sacrificing f… ▽ More

    Submitted 24 July, 2024; originally announced July 2024.

  10. arXiv:2407.16216  [pdf, other]

    cs.CL

    A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More

    Authors: Zhichao Wang, Bin Bi, Shiva Kumar Pentyala, Kiran Ramnath, Sougata Chaudhuri, Shubham Mehrotra, Zixu Zhu, Xiang-Bo Mao, Sitaram Asur, Na Cheng

    Abstract: With advancements in self-supervised learning, the availability of trillions tokens in a pre-training corpus, instruction fine-tuning, and the development of large Transformers with billions of parameters, large language models (LLMs) are now capable of generating factual and coherent responses to human queries. However, the mixed quality of training data can lead to the generation of undesired re… ▽ More

    Submitted 23 July, 2024; originally announced July 2024.

  11. arXiv:2407.15768  [pdf, other]

    astro-ph.SR astro-ph.HE hep-ph

    Early-Time Observations of SN 2023wrk: A Luminous Type Ia Supernova with Significant Unburned Carbon in the Outer Ejecta

    Authors: Jialian Liu, Xiaofeng Wang, Cristina Andrade, Pierre-Alexandre Duverne, Jujia Zhang, Liping Li, Zhenyu Wang, Felipe Navarete, Andrea Reguitti, Stefan Schuldt, Yongzhi Cai, Alexei V. Filippenko, Yi Yang, Thomas G. Brink, WeiKang Zheng, Ali Esamdin, Abdusamatjan Iskandar, Chunhai Bai, Jinzhong Liu, Xin Li, Maokai Hu, Gaici Li, Wenxiong Li, Xiaoran Ma, Shengyu Yan, et al. (22 additional authors not shown)

    Abstract: We present extensive photometric and spectroscopic observations of the nearby Type Ia supernova (SN) 2023wrk at a distance of about 40 Mpc. The earliest detection of this SN can be traced back to a few hours after the explosion. Within the first few days the light curve shows a bump feature, while the B - V color is blue and remains nearly constant. The overall spectral evolution is similar to tha… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in the Astrophysical Journal (27 pages, 14 figures, 7 tables)

  12. arXiv:2407.15642  [pdf, other]

    cs.CV

    Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models

    Authors: Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao

    Abstract: Diffusion models have achieved great progress in image animation due to powerful generative capabilities. However, maintaining spatio-temporal consistency with detailed information from the input static image over time (e.g., style, background, and object of the input static image) and ensuring smoothness in animated video narratives guided by textual prompts still remains challenging. In this pap… ▽ More

    Submitted 22 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: Project webpage: https://maxin-cn.github.io/cinemo_project/

  13. arXiv:2407.15582  [pdf, other]

    quant-ph

    Optimizing Circuit Reusing and its Application in Randomized Benchmarking

    Authors: Zhuo Chen, Guoding Liu, Xiongfeng Ma

    Abstract: Quantum learning tasks often leverage randomly sampled quantum circuits to characterize unknown systems. An efficient approach known as "circuit reusing," where each circuit is executed multiple times, reduces the cost compared to implementing new circuits. This work investigates the optimal reusing parameter that minimizes the variance of measurement outcomes for a given experimental cost. We est… ▽ More

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 19 pages, 12 figures. Comments are welcomed!

  14. arXiv:2407.15234  [pdf, other]

    cs.NI

    Exploring the Design of Collaborative Applications via the Lens of NDN Workspace

    Authors: Tianyuan Yu, Xinyu Ma, Varun Patil, Yekta Kocaogullar, Lixia Zhang

    Abstract: Metaverse applications desire to communicate with semantically identified objects among a diverse set of cyberspace entities, such as cameras for collecting images from, sensors for sensing environment, and users collaborating with each other, all could be nearby or far away, in a timely and secure way. However, supporting the above function faces networking challenges. Today's metaverse implement… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

  15. arXiv:2407.15221  [pdf, other]

    cs.NI cs.DC

    Secure Web Objects: Building Blocks for Metaverse Interoperability and Decentralization

    Authors: Tianyuan Yu, Xinyu Ma, Varun Patil, Yekta Kocaogullar, Yulong Zhang, Jeff Burke, Dirk Kutscher, Lixia Zhang

    Abstract: This position paper explores how to support the Web's evolution through an underlying data-centric approach that better matches the data-orientedness of modern and emerging applications. We revisit the original vision of the Web as a hypermedia system that supports document composability and application interoperability via name-based data access. We propose the use of secure web objects (SWO), a… ▽ More

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 9 pages

    ACM Class: H.3.5

  16. arXiv:2407.14805  [pdf, ps, other]

    math.RA

    Homological properties of homologically smooth connected cochain DGAs

    Authors: X.-F. Mao

    Abstract: Assume that $\mathscr{A}$ is a connected cochain DG algebra. We show that $\mathscr{A}$ is homologically smooth and Gorenstein if and only if its $\mathrm{Ext}$-algebra $H(R\Hom_{\mathscr{A}}(\mathbbm{k},\mathbbm{k}))$ is a Frobenius graded algebra. Moreover, $\mathscr{A}$ is Calabi-Yau if and only if the $\mathrm{Ext}$-algebra $H(R\Hom_{\mathscr{A}}(\mathbbm{k},\mathbbm{k}))$ is a symmetric Frobe… ▽ More

    Submitted 7 August, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    MSC Class: 16E10; 16E45; 16W50; 16E65

  17. arXiv:2407.14803  [pdf, ps, other]

    math.RA

    Homologically smooth connected cochain DGAs

    Authors: X.-F. Mao

    Abstract: Let $\mathscr{A}$ be a connected cochain DG algebra such that $H(\mathscr{A})$ is a Noetherian graded algebra. We give some criteria for $\mathscr{A}$ to be homologically smooth in terms of the singularity category, the cone length of the canonical module $k$ and the global dimension of $\mathscr{A}$. For any cohomologically finite DG $\mathscr{A}$-module $M$, we show that it is compact when… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:1301.4382

    MSC Class: 16E10; 16E45; 16W50; 16E65

  18. arXiv:2407.14769  [pdf, other]

    cs.HC

    A Two-Phase Visualization System for Continuous Human-AI Collaboration in Sequelae Analysis and Modeling

    Authors: Yang Ouyang, Chenyang Zhang, He Wang, Tianle Ma, Chang Jiang, Yuheng Yan, Zuoqin Yan, Xiaojuan Ma, Chuhan Shi, Quan Li

    Abstract: In healthcare, AI techniques are widely used for tasks like risk assessment and anomaly detection. Despite AI's potential as a valuable assistant, its role in complex medical data analysis often oversimplifies human-AI collaboration dynamics. To address this, we collaborated with a local hospital, engaging six physicians and one data scientist in a formative study. From this collaboration, we prop… ▽ More

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: To appear at the IEEE VIS Conference 2024

  19. arXiv:2407.14211  [pdf, other]

    cs.LG

    Enhanced Mortality Prediction in ICU Stroke Patients via Deep Learning

    Authors: Armin Abdollahi, Xinghong Ma, Jiahao Zhang, Daijia Wu, Tongshou Wu, Zizheng Ye, Maryam Pishgar

    Abstract: Background: Stroke is second-leading cause of disability and death among adults. Approximately 17 million people suffer from a stroke annually, with about 85% being ischemic strokes. Predicting mortality of ischemic stroke patients in intensive care unit (ICU) is crucial for optimizing treatment strategies, allocating resources, and improving survival rates. Methods: We acquired data on ICU ischem… ▽ More

    Submitted 19 July, 2024; originally announced July 2024.

  20. arXiv:2407.13596  [pdf, other]

    cs.CV

    EarthMarker: Visual Prompt Learning for Region-level and Point-level Remote Sensing Imagery Comprehension

    Authors: Wei Zhang, Miaoxin Cai, Tong Zhang, Jun Li, Yin Zhuang, Xuerui Mao

    Abstract: Recent advances in visual prompting in the natural image area have allowed users to interact with artificial intelligence (AI) tools through various visual marks such as box, point, and free-form shapes. However, due to the significant difference between the natural and remote sensing (RS) images, existing visual prompting models face challenges in RS scenarios. Moreover, RS MLLMs mainly focus on… ▽ More

    Submitted 20 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  21. arXiv:2407.13268  [pdf, other]

    cs.AI cs.LG

    Mixture of Experts based Multi-task Supervise Learning from Crowds

    Authors: Tao Han, Huaixuan Shi, Xinyi Ding, Xiao Ma, Huamao Gu, Yili Fang

    Abstract: Existing truth inference methods in crowdsourcing aim to map redundant labels and items to the ground truth. They treat the ground truth as hidden variables and use statistical or deep learning-based worker behavior models to infer the ground truth. However, worker behavior models that rely on ground truth hidden variables overlook workers' behavior at the item feature level, leading to imprecise… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

  22. arXiv:2407.12270  [pdf, other]

    hep-ex

    Observation of $Λ_c^+ \to Λa_0(980)^+$ and Evidence for $Σ(1380)^+$ in $Λ_c^+ \to Λπ^+ η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Based on $6.1~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at center-of-mass energies from 4.600~GeV to 4.843~GeV with the BESIII detector at the BEPCII collider, a partial wave analysis of $Λ_c^+\toΛπ^+η$ is performed, and branching fractions and decay asymmetry parameters of intermediate processes are determined. The process $Λ_c^+\toΛa_0(980)^+$ is observed for the first time, and… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 16 pages, 8 figures

  23. arXiv:2407.12267  [pdf, other]

    cs.CV cs.GR

    Generating 3D House Wireframes with Semantics

    Authors: Xueqi Ma, Yilin Liu, Wenjun Zhou, Ruowei Wang, Hui Huang

    Abstract: We present a new approach for generating 3D house wireframes with semantic enrichment using an autoregressive model. Unlike conventional generative models that independently process vertices, edges, and faces, our approach employs a unified wire-based representation for improved coherence in learning 3D wireframe structures. By re-ordering wire sequences based on semantic meanings, we facilitate s… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: European Conference on Computer Vision (Proceedings of ECCV 2024); Project page: https://vcc.tech/research/2024/3DWire; GitHub repository: https://github.com/3d-house-wireframe/3d-house-wireframe-dataset

  24. arXiv:2407.12019  [pdf, other]

    cs.CL cs.AI

    DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model

    Authors: Shezheng Song, Shasha Li, Jie Yu, Shan Zhao, Xiaopeng Li, Jun Ma, Xiaodong Liu, Zhuo Li, Xiaoguang Mao

    Abstract: Our study delves into Multimodal Entity Linking, aligning the mention in multimodal information with entities in knowledge base. Existing methods are still facing challenges like ambiguous entity representations and limited image information utilization. Thus, we propose dynamic entity extraction using ChatGPT, which dynamically extracts entities and enhances datasets. We also propose a method: Dy… ▽ More

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Published on PRCV24

  25. arXiv:2407.11727  [pdf, ps, other]

    hep-ex hep-ph

    Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

    Abstract: Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(0.547\pm0.026_{\rm stat}\pm0.016_{\rm syst})\%$ a… ▽ More

    Submitted 18 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 13 figures

  26. arXiv:2407.11653  [pdf, other]

    hep-ph

    Particle Conversions Beyond the WKB Approximation and Solar-Induced Gravitational Waves from Dark Photon Dark Matter

    Authors: Tengyu Ai, Yuxuan He, Jia Liu, Xiaolin Ma, Xiao-Ping Wang

    Abstract: We investigate the conversion of kinetic mixing dark photon dark matter into gravitational waves within the magnetic field of the Sun. Our study reveals that the WKB approximation is invalid in this scenario. We derive an analytic solution for the conversion probability with unitary evolution feature. This solution aligns in form with previous studies on photon-gravitational wave conversion. Inter… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 28 pages, 5 figures

  27. arXiv:2407.11626  [pdf]

    cs.LG cs.NE

    Dynamic Dimension Wrapping (DDW) Algorithm: A Novel Approach for Efficient Cross-Dimensional Search in Dynamic Multidimensional Spaces

    Authors: Dongnan Jin, Yali Liu, Qiuzhi Song, Xunju Ma, Yue Liu, Dehao Wu

    Abstract: In the real world, as the complexity of optimization problems continues to increase, there is an urgent need to research more efficient optimization methods. Current optimization algorithms excel in solving problems with a fixed number of dimensions. However, their efficiency in searching dynamic multi-dimensional spaces is unsatisfactory. In response to the challenge of cross-dimensional search i… ▽ More

    Submitted 18 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

  28. arXiv:2407.10468  [pdf, other]

    cs.SD cs.AI eess.AS

    LiteFocus: Accelerated Diffusion Inference for Long Audio Synthesis

    Authors: Zhenxiong Tan, Xinyin Ma, Gongfan Fang, Xinchao Wang

    Abstract: Latent diffusion models have shown promising results in audio generation, making notable advancements over traditional methods. However, their performance, while impressive with short audio clips, faces challenges when extended to longer audio sequences. These challenges are due to model's self-attention mechanism and training predominantly on 10-second clips, which complicates the extension to lo… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Interspeech 2024; Code: https://github.com/Yuanshi9815/LiteFocus

  29. arXiv:2407.10439  [pdf, other]

    cs.CV

    PolyRoom: Room-aware Transformer for Floorplan Reconstruction

    Authors: Yuzhou Liu, Lingjie Zhu, Xiaodong Ma, Hanqiao Ye, Xiang Gao, Xianwei Zheng, Shuhan Shen

    Abstract: Reconstructing geometry and topology structures from raw unstructured data has always been an important research topic in indoor mapping research. In this paper, we aim to reconstruct the floorplan with a vectorized representation from point clouds. Despite significant advancements achieved in recent years, current methods still encounter several challenges, such as missing corners or edges, inacc… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  30. arXiv:2407.10366  [pdf, other]

    cs.CV cs.AI cs.LG

    Accessing Vision Foundation Models at ImageNet-level Costs

    Authors: Yitian Zhang, Xu Ma, Yue Bai, Huan Wang, Yun Fu

    Abstract: Vision foundation models are renowned for their generalization ability due to massive training data. Nevertheless, they demand tremendous training resources, and the training data is often inaccessible, e.g., CLIP, DINOv2, posing great challenges to developing derivatives that could advance research in this field. In this work, we offer a very simple and general solution, named Proteus, to distill… ▽ More

    Submitted 14 July, 2024; originally announced July 2024.

  31. arXiv:2407.09247  [pdf, other]

    cs.AI

    Constrained Intrinsic Motivation for Reinforcement Learning

    Authors: Xiang Zheng, Xingjun Ma, Chao Shen, Cong Wang

    Abstract: This paper investigates two fundamental problems that arise when utilizing Intrinsic Motivation (IM) for reinforcement learning in Reward-Free Pre-Training (RFPT) tasks and Exploration with Intrinsic Motivation (EIM) tasks: 1) how to design an effective intrinsic objective in RFPT tasks, and 2) how to reduce the bias introduced by the intrinsic objective in EIM tasks. Existing IM methods suffer fr… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI 2024

  32. arXiv:2407.09096  [pdf, other]

    cs.LG cs.AI

    STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM

    Authors: YiHeng Huang, Xiaowei Mao, Shengnan Guo, Yubin Chen, Junfeng Shen, Tiankuo Li, Youfang Lin, Huaiyu Wan

    Abstract: Spatial-temporal forecasting and imputation are important for real-world intelligent systems. Most existing methods are tailored for individual forecasting or imputation tasks but are not designed for both. Additionally, they are less effective for zero-shot and few-shot learning. While pre-trained language model (PLM) have exhibited strong pattern recognition and reasoning abilities across variou… ▽ More

    Submitted 27 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  33. arXiv:2407.09013  [pdf, ps, other]

    cs.AI cs.LG

    Procedural Content Generation via Generative Artificial Intelligence

    Authors: Xinyu Mao, Wanli Yu, Kazunori D Yamada, Michael R. Zielewski

    Abstract: The attempt to utilize machine learning in PCG has been made in the past. In this survey paper, we investigate how generative artificial intelligence (AI), which saw a significant increase in interest in the mid-2010s, is being used for PCG. We review applications of generative AI for the creation of various types of content, including terrains, items, and even storylines. While generative AI is e… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

  34. arXiv:2407.08978  [pdf, other]

    cs.CL cs.LG

    Towards Chapter-to-Chapter Context-Aware Literary Translation via Large Language Models

    Authors: Linghao Jin, Li An, Xuezhe Ma

    Abstract: Discourse phenomena in existing document-level translation datasets are sparse, which has been a fundamental obstacle in the development of context-aware machine translation models. Moreover, most existing document-level corpora and context-aware machine translation methods rely on an unrealistic assumption on sentence-level alignments. To mitigate these issues, we first curate a novel dataset of… ▽ More

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Preprint

  35. arXiv:2407.08528  [pdf, other]

    eess.IV cs.CV cs.MM

    Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction

    Authors: Chang Sun, Hui Yuan, Xiaolong Mao, Xin Lu, Raouf Hamzaoui

    Abstract: In point cloud geometry compression, most octree-based context models use the cross-entropy between the one-hot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classificati… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 2 figures and 2 tables

    Journal ref: IEEE Signal Processing Letters, 2024

  36. arXiv:2407.07868  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation

    Authors: Eugene Teoh, Sumit Patidar, Xiao Ma, Stephen James

    Abstract: Generalising vision-based manipulation policies to novel environments remains a challenging area with limited exploration. Current practices involve collecting data in one location, training imitation learning or reinforcement learning policies with this data, and deploying the policy in the same location. However, this approach lacks scalability as it necessitates data collection in multiple loca… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Project website: https://greenaug.github.io/

  37. arXiv:2407.07791  [pdf, other]

    cs.CL

    Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities

    Authors: Tianjie Ju, Yiting Wang, Xinbei Ma, Pengzhou Cheng, Haodong Zhao, Yulong Wang, Lifeng Liu, Jian Xie, Zhuosheng Zhang, Gongshen Liu

    Abstract: The rapid adoption of large language models (LLMs) in multi-agent systems has highlighted their impressive capabilities in various applications, such as collaborative problem-solving and autonomous negotiation. However, the security implications of these LLM-based multi-agent systems have not been thoroughly investigated, particularly concerning the spread of manipulated knowledge. In this paper,… ▽ More

    Submitted 22 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: 18 Pages, working in progress

  38. arXiv:2407.07788  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark

    Authors: Nikita Chernyadev, Nicholas Backshall, Xiao Ma, Yunfan Lu, Younggyo Seo, Stephen James

    Abstract: We introduce BiGym, a new benchmark and learning environment for mobile bi-manual demo-driven robotic manipulation. BiGym features 40 diverse tasks set in home environments, ranging from simple target reaching to complex kitchen cleaning. To capture the real-world performance accurately, we provide human-collected demonstrations for each task, reflecting the diverse modalities found in real-world… ▽ More

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Project webpage: https://chernyadev.github.io/bigym/

  39. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  40. arXiv:2407.07325  [pdf, other]

    cs.CV cs.CL cs.MM eess.IV

    HiLight: Technical Report on the Motern AI Video Language Model

    Authors: Zhiting Wang, Qiangong Zhou, Kangjie Yang, Zongyang Liu, Xin Mao

    Abstract: This technical report presents the implementation of a state-of-the-art video encoder for video-text modal alignment and a video conversation framework called HiLight, which features dual visual towers. The work is divided into two main parts: 1.alignment of video and text modalities; 2.convenient and efficient way to interact with users. Our goal is to address the task of video comprehension in t… ▽ More

    Submitted 11 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  41. arXiv:2407.06841  [pdf, other]

    cs.CV

    HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model

    Authors: Dunbin Shen, Xuanbing Zhu, Jiacheng Tian, Jianjun Liu, Zhenrong Du, Hongyu Wang, Xiaorui Ma

    Abstract: Hyperspectral target detection (HTD) identifies objects of interest from complex backgrounds at the pixel level, playing a vital role in Earth observation. However, HTD faces challenges due to limited prior knowledge and spectral variation, leading to underfitting models and unreliable performance. To address these challenges, this paper proposes an efficient self-supervised HTD method with a pyra…

    Submitted 17 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: 13 pages, 6 figures, 5 tables

  42. arXiv:2407.05681  [pdf]

    cond-mat.supr-con cond-mat.str-el

    Bulk high-temperature superconductivity in the high-pressure tetragonal phase of bilayer La2PrNi2O7

    Authors: Ningning Wang, Gang Wang, Xiaoling Shen, Jun Hou, Jun Luo, Xiaoping Ma, Huaixin Yang, Lifen Shi, Jie Dou, Jie Feng, Jie Yang, Yunqing Shi, Zhian Ren, Hanming Ma, Pengtao Yang, Ziyi Liu, Yue Liu, Hua Zhang, Xiaoli Dong, Yuxin Wang, Kun Jiang, Jiangping Hu, Stuart Calder, Jiaqiang Yan, Jianping Sun, et al. (4 additional authors not shown)

    Abstract: The Ruddlesden-Popper (R-P) bilayer nickelate, La3Ni2O7, was recently found to show signatures of high-temperature superconductivity (HTSC) at pressures above 14 GPa. Subsequent investigations achieved zero resistance in single- and poly-crystalline samples under hydrostatic pressure conditions. Yet, obvious diamagnetic signals, the other hallmark of superconductors, are still lacking owing to the…

    Submitted 8 July, 2024; originally announced July 2024.

  43. arXiv:2407.05677  [pdf, other]

    eess.IV

    PCAC-GAN: A Sparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression

    Authors: Xiaolong Mao, Hui Yuan, Xin Lu, Raouf Hamzaoui, Wei Gao

    Abstract: Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers.…

    Submitted 19 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 14 pages, 5 figures, Accepted by Computational Visual Media

    MSC Class: 94J20; ACM Class: I.4.2

    Journal ref: Computational Visual Media, 2024

  44. arXiv:2407.05282  [pdf, other]

    cs.CV

    UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

    Authors: Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang

    Abstract: This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct a…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 32 pages, 14 figures

  45. arXiv:2407.05251  [pdf, ps, other]

    math.OA

    Almost elementary groupoid models for $C^*$-algebras

    Authors: Xin Ma, Jianchao Wu

    Abstract: The notion of almost elementariness for a locally compact Hausdorff étale groupoid $\mathcal{G}$ with a compact unit space was introduced by the authors as a sufficient condition ensuring that the reduced groupoid $C^*$-algebra $C^*_r(\mathcal{G})$ is (tracially) $\mathcal{Z}$-stable and thus classifiable under an additional natural assumption. In this paper, we explore the converse direction and show tha…

    Submitted 6 July, 2024; originally announced July 2024.

  46. arXiv:2407.05236  [pdf, other]

    astro-ph.HE

    A timing view of the additional high-energy spectral component discovered in the black hole candidate Swift J1727.8-1613

    Authors: Zi-Xu Yang, Liang Zhang, Shuang-Nan Zhang, L. Tao, Shu Zhang, Ruican Ma, Qingcui Bu, Yue Huang, He-Xin Liu, Wei Yu, Guang C. Xiao, Peng-Ju Wang, Hua Feng, Li-Ming Song, Xiang Ma, Mingyu Ge, QingChang Zhao, J. L. Qu

    Abstract: We present an energy-dependent analysis for the type-C quasi-periodic oscillations (QPOs) observed in the black hole X-ray binary Swift J1727.8-1613 using Insight-HXMT observations. We find that the QPO fractional rms at energies above 40 keV is significantly higher than that below 20 keV. This is the first report of a high energy (HE)-rms excess in the rms spectrum of a black hole X-ray binary. I…

    Submitted 6 July, 2024; originally announced July 2024.

  47. arXiv:2407.05010  [pdf, other]

    cs.CV

    PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference

    Authors: Ye Li, Chen Tang, Yuan Meng, Jiajun Fan, Zenghao Chai, Xinzhu Ma, Zhi Wang, Wenwu Zhu

    Abstract: We introduce PRANCE, a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens, based on the characteristics of inputs. Specifically, PRANCE leverages adaptive token optimization strategies for a certain computational budget, aiming to accelerate ViTs' inference from a unified data and architectural perspective. However, the joint framework poses…

    Submitted 6 July, 2024; originally announced July 2024.

  48. arXiv:2407.04984  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Prolonged Phase Segregation of Mixed-Halide Perovskite Nanocrystals in the Dark

    Authors: Xueying Ma, Yuhui Ye, Yang Xiao, Shengnan Feng, Chunfeng Zhang, Keyu Xia, Fengrui Hu, Min Xiao, Xiaoyong Wang

    Abstract: A critical issue hindering the potential applications of semiconductor mixed-halide perovskites is the phase segregation effect, wherein localized regions enriched with one type of halide anions would be formed upon continuous photogeneration of the excited-state charge carriers. These unexpected phases are capable of remixing again in the dark under the entropic driving force, the process of whic…

    Submitted 6 July, 2024; originally announced July 2024.

  49. arXiv:2407.04922  [pdf, other]

    cond-mat.mtrl-sci

    Revolutionizing Alloy Microstructure Segmentation through SAM and Domain Knowledge without Extra Training

    Authors: Xudong Ma, Yuqi Zhang, Chenchong Wang, Wei Xu

    Abstract: Fundamental models, trained on large-scale datasets and adapted to new data using innovative learning methods, have revolutionized various fields. In materials science, microstructure image segmentation plays a pivotal role in understanding alloy properties. However, conventional supervised modelling algorithms often necessitate extensive annotations and intricate optimization procedures. The segm…

    Submitted 5 July, 2024; originally announced July 2024.

  50. arXiv:2407.04616  [pdf, other]

    cs.CV cs.AI cs.LG

    Isomorphic Pruning for Vision Models

    Authors: Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang

    Abstract: Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures. However, assessing the relative importance of different sub-structures remains a significant challenge, particularly in advanced vision models featuring novel mechanisms and architectures like self-attention, depth-wise convolutions, or residual connections. These heterogeneous subst…

    Submitted 5 July, 2024; originally announced July 2024.