
Showing 1–50 of 1,087 results for author: Ren, Y

  1. arXiv:2408.17009  [pdf, other]

    cs.SD eess.AS

    Utilizing Speaker Profiles for Impersonation Audio Detection

    Authors: Hao Gu, JiangYan Yi, Chenglong Wang, Yong Ren, Jianhua Tao, Xinrui Yan, Yujie Chen, Xiaohui Zhang

    Abstract: Fake audio detection is an emerging active topic. A growing body of literature has aimed to detect fake utterances, which are mostly generated by text-to-speech (TTS) or voice conversion (VC). However, countermeasures against impersonation remain an underexplored area. Impersonation is a fake type that involves an imitator replicating specific traits and speech style of a target speaker. Unlike…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM MM2024

  2. arXiv:2408.16809  [pdf, other]

    cs.CV cs.CL cs.MM

    See or Guess: Counterfactually Regularized Image Captioning

    Authors: Qian Cao, Xu Chen, Ruihua Song, Xiting Wang, Xinting Huang, Yuchen Ren

    Abstract: Image captioning, which generates natural language descriptions of the visual information in an image, is a crucial task in vision-language research. Previous models have typically addressed this task by aligning the generative capabilities of machines with human intelligence through statistical fitting of existing datasets. While effective for normal images, they may struggle to accurately descri…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: Accepted by ACM MM 2024

  3. arXiv:2408.16300  [pdf, other]

    cs.NE math.OC

    A Distance Similarity-based Genetic Optimization Algorithm for Satellite Ground Network Planning Considering Feeding Mode

    Authors: Yingying Ren, Qiuli Li, Yangyang Guo, Witold Pedrycz, Lining Xing, Anfeng Liu, Yanjie Song

    Abstract: With the rapid development of the satellite industry, the information transmission network based on communication satellites has gradually become a major and important part of the future satellite ground integration network. However, the low transmission efficiency of the satellite data relay back mission has become a problem that is currently constraining the construction of the system and needs…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 25 pages

  4. arXiv:2408.14035  [pdf, other]

    cs.RO cs.CV

    FAST-LIVO2: Fast, Direct LiDAR-Inertial-Visual Odometry

    Authors: Chunran Zheng, Wei Xu, Zuhao Zou, Tong Hua, Chongjian Yuan, Dongjiao He, Bingyang Zhou, Zheng Liu, Jiarong Lin, Fangcheng Zhu, Yunfan Ren, Rong Wang, Fanle Meng, Fu Zhang

    Abstract: This paper proposes FAST-LIVO2: a fast, direct LiDAR-inertial-visual odometry framework to achieve accurate and robust state estimation in SLAM tasks and provide great potential in real-time, onboard robotic applications. FAST-LIVO2 fuses the IMU, LiDAR and image measurements efficiently through an ESIKF. To address the dimension mismatch between the heterogeneous LiDAR and image measurements, we…

    Submitted 28 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: 30 pages, 31 figures, due to the limitation that 'The abstract field cannot exceed 1,920 characters', the abstract presented here is shorter than the one in the PDF file

  5. arXiv:2408.11986  [pdf, other]

    cond-mat.mtrl-sci

    Magnetic proximity coupling to defects in a two-dimensional semiconductor

    Authors: Muhammad Hassan Shaikh, Matthew Whalen, Dai Q. Ho, Aqiq Ishraq, Collin Maurtua, Kenji Watanabe, Takashi Taniguchi, Yafei Ren, Anderson Janotti, John Xiao, Chitraleema Chakraborty

    Abstract: The ultrathin structure and efficient spin dynamics of two-dimensional (2D) antiferromagnetic (AFM) materials hold unprecedented opportunities for ultrafast memory devices, artificial intelligence circuits, and novel computing technology. For example, chromium thiophosphate (CrPS4) is one of the most promising 2D A-type AFM materials due to its robust stability in diverse environmental conditions…

    Submitted 21 August, 2024; originally announced August 2024.

  6. arXiv:2408.11758  [pdf, other]

    cs.CV

    MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs

    Authors: Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen

    Abstract: We present MambaCSR, a simple but effective framework based on Mamba for the challenging compressed image super-resolution (CSR) task. Particularly, the scanning strategies of Mamba are crucial for effective contextual knowledge modeling in the restoration process despite it relying on selective state space modeling for all tokens. In this work, we propose an efficient dual-interleaved scanning pa…

    Submitted 21 August, 2024; originally announced August 2024.

  7. arXiv:2408.10591  [pdf, ps, other]

    math.DG

    On differential geometry of non-degenerate CR manifolds

    Authors: Yuxin Dong, Yibin Ren

    Abstract: In this paper, we consider a non-degenerate CR manifold (M,H(M),J) with a given pseudo-Hermitian 1-form θ, and endow the CR distribution H(M) with any Hermitian metric h instead of the Levi form L_{θ}. This induces a natural Riemannian metric g_{h,θ} on M compatible with the structure. The synthetic object (M,θ,J,h) will be called a pseudo-Hermitian manifold, which generalizes the usual notion of…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 36 pages, Comments welcome

  8. arXiv:2408.10124  [pdf, other]

    cs.LG cs.AI cs.IR physics.chem-ph q-bio.BM

    Molecular Graph Representation Learning Integrating Large Language Models with Domain-specific Small Models

    Authors: Tianyu Zhang, Yuxiang Ren, Chengbin Hou, Hairong Lv, Xuegong Zhang

    Abstract: Molecular property prediction is a crucial foundation for drug discovery. In recent years, pre-trained deep learning models have been widely applied to this task. Some approaches that incorporate prior biological domain knowledge into the pre-training framework have achieved impressive results. However, these methods heavily rely on biochemical experts, and retrieving and summarizing vast amounts…

    Submitted 19 August, 2024; originally announced August 2024.

  9. arXiv:2408.09738  [pdf]

    cond-mat.mtrl-sci

    Room-Temperature Multiferroic Skyrmions in LiNbO3 with enhancement in electric-optical property

    Authors: Yalong Yu, Bo Xiong, Siqi Wu, Yekai Ren, Nuo Chen, Qingjiao Mi, Zhaojie Zheng, Kangping Lou, Rui Wang, Tao Chu

    Abstract: LiNbO3 (LN) is renowned for its exceptional ferroelectric properties, particularly its notable linear electro-optical (EO) effect, which is highly advantageous for various applications such as high-speed communication, optical computation, and quantum information processing. Compared to its ferroelectric properties, the magnetism of LN is not attractive enough due to its weak ferromagnetic nature.…

    Submitted 19 August, 2024; originally announced August 2024.

  10. arXiv:2408.08202  [pdf, other]

    cs.CV

    Towards Practical Human Motion Prediction with LiDAR Point Clouds

    Authors: Xiao Han, Yiming Ren, Yichen Yao, Yujing Sun, Yuexin Ma

    Abstract: Human motion prediction is crucial for human-centric multimedia understanding and interacting. Current methods typically rely on ground truth human poses as observed input, which is not practical for real-world scenarios where only raw visual sensor data is available. To implement these methods in practice, a pre-phrase of pose estimation is essential. However, such two-stage approaches often lead…

    Submitted 15 August, 2024; originally announced August 2024.

  11. arXiv:2408.06787  [pdf, other]

    cs.CL

    Unlock the Power of Frozen LLMs in Knowledge Graph Completion

    Authors: Bo Xue, Yi Xu, Yunchong Song, Yiming Pang, Yuyang Ren, Jiaxin Ding, Luoyi Fu, Xinbing Wang

    Abstract: Classical knowledge graph completion (KGC) methods rely solely on structural information, struggling with the inherent sparsity of knowledge graphs (KGs). Large Language Models (LLMs) learn extensive knowledge from large corpora with powerful context modeling, which is ideal for mitigating the limitations of previous methods. Directly fine-tuning LLMs offers great capability but comes at the cost…

    Submitted 13 August, 2024; originally announced August 2024.

  12. arXiv:2408.04967  [pdf, other]

    eess.AS cs.SD

    ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild

    Authors: Jiangyan Yi, Chu Yuan Zhang, Jianhua Tao, Chenglong Wang, Xinrui Yan, Yong Ren, Hao Gu, Junzuo Zhou

    Abstract: The growing prominence of the field of audio deepfake detection is driven by its wide range of applications, notably in protecting the public from potential fraud and other malicious activities, prompting the need for greater attention and research in this area. The ADD 2023 challenge goes beyond binary real/fake classification by emulating real-world scenarios, such as the identification of manip…

    Submitted 9 August, 2024; originally announced August 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  13. arXiv:2408.04708  [pdf, other]

    cs.SD cs.AI eess.AS

    MulliVC: Multi-lingual Voice Conversion With Cycle Consistency

    Authors: Jiawei Huang, Chen Zhang, Yi Ren, Ziyue Jiang, Zhenhui Ye, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao

    Abstract: Voice conversion aims to modify the source speaker's voice to resemble the target speaker while preserving the original speech content. Despite notable advancements in voice conversion these days, multi-lingual voice conversion (including both monolingual and cross-lingual scenarios) has yet to be extensively studied. It faces two main challenges: 1) the considerable variability in prosody and art…

    Submitted 8 August, 2024; originally announced August 2024.

  14. arXiv:2408.04334  [pdf, other]

    cs.AR

    A Node-Based Polar List Decoder with Frame Interleaving and Ensemble Decoding Support

    Authors: Yuqing Ren, Leyu Zhang, Ludovic Damien Blanc, Yifei Shen, Xinwei Li, Alexios Balatsoukas-Stimming, Chuan Zhang, Andreas Burg

    Abstract: Node-based successive cancellation list (SCL) decoding has received considerable attention in wireless communications for its significant reduction in decoding latency, particularly with 5G New Radio (NR) polar codes. However, the existing node-based SCL decoders are constrained by sequential processing, leading to complicated and data-dependent computational units that introduce unavoidable stall…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: 13 pages, 16 figures, accepted by IEEE Transactions on Circuits and Systems I: Regular Papers

  15. arXiv:2408.02431  [pdf, other]

    physics.optics cond-mat.mes-hall

    Moire exciton polaritons in twisted photonic lattices at room temperature

    Authors: Chunzi Xing, Yu Wang, Tobias Schneider, Xiaokun Zhai, Xinzheng Zhang, Zhenyu Xiong, Hao Wu, Yuan Ren, Haitao Dai, Xiao Wang, Anlian Pan, Stefan Schumacher, Xuekai Ma, Tingge Gao

    Abstract: Moire lattices attract intensive attention in the double graphene/TMD layers and photonic crystals due to the interesting exotic physics within these structures. However, precise measurement of the moiré ground states, excited states and Bloch bands in the twisted photonic lattices is still elusive. In this work we report the strong coupling between the excitons of CsPbBr3 microplates and the ph…

    Submitted 5 August, 2024; originally announced August 2024.

  16. arXiv:2408.02036  [pdf, other]

    cs.CV

    LEGO: Self-Supervised Representation Learning for Scene Text Images

    Authors: Yujin Ren, Jiaxin Zhang, Lianwen Jin

    Abstract: In recent years, significant progress has been made in scene text recognition by data-driven methods. However, due to the scarcity of annotated real-world data, the training of these methods predominantly relies on synthetic data. The distribution gap between synthetic and real data constrains the further performance improvement of these methods in real-world applications. To tackle this problem,…

    Submitted 4 August, 2024; originally announced August 2024.

  17. arXiv:2408.00788  [pdf, other]

    cs.NE cs.LG

    SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network

    Authors: Kexin Wang, Jiahong Zhang, Yong Ren, Man Yao, Di Shang, Bo Xu, Guoqi Li

    Abstract: Brain-inspired Spiking Neural Network (SNN) has demonstrated its effectiveness and efficiency in vision, natural language, and speech understanding tasks, indicating their capacity to "see", "listen", and "read". In this paper, we design SpikeVoice, which performs high-quality Text-To-Speech (TTS) via SNN, to explore the potential of SNN to "speak". A major obstacle to using SNN for such…

    Submitted 17 July, 2024; originally announced August 2024.

    Comments: 9 pages

  18. arXiv:2408.00661  [pdf, other]

    physics.ins-det quant-ph

    Neuromorphic detection and cooling of microparticle arrays

    Authors: Yugang Ren, Benjamin Siegel, Ronghao Yin, Muddassar Rashid, James Millen

    Abstract: Micro-objects levitated in a vacuum are an exciting platform for precision sensing due to their low dissipation motion and the potential for control at the quantum level. Arrays of such sensors would allow noise cancellation, directionality, increased sensitivity and in the quantum regime the potential to exploit correlation and entanglement. We use neuromorphic detection via a single event-based…

    Submitted 1 August, 2024; originally announced August 2024.

  19. arXiv:2407.21491  [pdf]

    cs.CL cs.SD eess.AS

    Generative Expressive Conversational Speech Synthesis

    Authors: Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li

    Abstract: Conversational Speech Synthesis (CSS) aims to express a target utterance with the proper speaking style in a user-agent conversation setting. Existing CSS methods employ effective multi-modal context modeling techniques to achieve empathy understanding and expression. However, they often need to design complex network architectures and meticulously optimize the modules within them. In addition, du…

    Submitted 31 July, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Comments: 14 pages, 6 figures, 8 tables. Accepted by ACM MM 2024

  20. arXiv:2407.20481  [pdf, ps, other]

    quant-ph physics.optics

    Stronger sum uncertainty relations for non-Hermitian operators

    Authors: Xiao-Feng Song, Yi-Fang Ren, Shuang Liu, Xi-Hao Chen, Yusuf Turek

    Abstract: Unlike the uncertainty relationships of two arbitrary incompatible observables represented by the product of variances in the past, representing them by the sum of variances is better as it guarantees to be nontrivial for two incompatible operators in some special cases. Although the uncertainty relation is formulated as the sum of variances for unitary operators has been confirmed, its general fo…

    Submitted 29 July, 2024; originally announced July 2024.

  21. arXiv:2407.19973  [pdf, other]

    cond-mat.mes-hall

    Spontaneous spin superconductor state in ABCA-stacked tetralayer graphene

    Authors: Shuai Li, Yuan-Hang Ren, Ao-Long Li, Hua Jiang

    Abstract: We theoretically demonstrate a spontaneous spin superconductor (SC) state in ABCA-stacked tetralayer graphene, under sequential effects of electron-electron (e-e) and electron-hole (e-h) interactions. First of all, we examine the ferromagnetic (FM) exchange instability and phase diagram of the system induced by the long-range e-e interaction. At non- or low-doping levels, the interaction trends to…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 15 pages,7 figures

  22. arXiv:2407.19691  [pdf, ps, other]

    quant-ph cond-mat.mes-hall

    Detection of Electron Paramagnetic Resonance of Two Electron Spins Using a Single NV Center in Diamond

    Authors: Yuhang Ren, Susumu Takahashi

    Abstract: An interacting spin system is a great testbed for fundamental quantum physics and applications in quantum sensing and quantum simulation. For these investigations, detailed information of the interactions, e.g. the number of spins and their interaction strengths, is often required. In this study, we present the identification and characterization of a single nitrogen-vacancy (NV) center coupled to…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 15 pages, 5 figures, submitted to APL Quantum

  23. arXiv:2407.18939  [pdf]

    cs.CY cs.AI

    Promoting AI Competencies for Medical Students: A Scoping Review on Frameworks, Programs, and Tools

    Authors: Yingbo Ma, Yukyeong Song, Jeremy A. Balch, Yuanfang Ren, Divya Vellanki, Zhenhong Hu, Meghan Brennan, Suraj Kolla, Ziyuan Guan, Brooke Armfield, Tezcan Ozrazgat-Baslanti, Parisa Rashidi, Tyler J. Loftus, Azra Bihorac, Benjamin Shickel

    Abstract: As more clinical workflows continue to be augmented by artificial intelligence (AI), AI literacy among physicians will become a critical requirement for ensuring safe and ethical AI-enabled patient care. Despite the evolving importance of AI in healthcare, the extent to which it has been adopted into traditional and often-overloaded medical curricula is currently unknown. In a scoping review of 1,…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 25 pages, 2 figures, 3 tables

  24. arXiv:2407.18469  [pdf, ps, other]

    math.OC eess.SY

    On Asymptotic Analysis of Perturbed Sweeping Processes with Application to Optimization

    Authors: Zhaoyue Xia, Jun Du, Chunxiao Jiang, H. Vincent Poor, Yong Ren

    Abstract: Convergence analysis of constrained optimization methods from the dynamical systems viewpoint has attracted considerable attention because it provides a geometric demonstration towards the shadowing trajectory of a numerical scheme. In this work, we establish a tight connection between a continuous-time nonsmooth dynamical system called a perturbed sweeping process (PSP) and a proximal stochastic…

    Submitted 25 July, 2024; originally announced July 2024.

  25. arXiv:2407.17386  [pdf, other]

    astro-ph.SR astro-ph.GA

    Data-driven stellar intrinsic colors and dust reddenings for spectro-photometric data: From the blue-edge method to a machine-learning approach

    Authors: He Zhao, Shu Wang, Biwei Jiang, Jun Li, Dongwei Fan, Yi Ren, Xiaoxiao Ma

    Abstract: Intrinsic colors (ICs) of stars are essential for the studies on both stellar physics and dust reddening. In this work, we developed an XGBoost model to predict the ICs with the atmospheric parameters $T_{\rm eff}$, ${\rm log}\,g$, and $\rm [M/H]$. The model was trained and tested for three colors at Gaia and 2MASS bands with 1,040,446 low-reddening sources. The atmospheric parameters were determi…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 23 pages, 1 table, 11 figures, 2 appendices, accepted for publication in ApJ

  26. arXiv:2407.16004  [pdf, other]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Theory of electric polarization induced by magnon transport in two-dimensional honeycomb antiferromagnets

    Authors: D. Quang To, Federico Garcia-Gaitan, Yafei Ren, Joshua M. O. Zide, John Q. Xiao, Branislav K. Nikolić, Garnett W. Bryant, Matthew F. Doty

    Abstract: We introduce a quantum mechanical formalism for computing the electric polarization arising in two-dimensional (2D) antiferromagnets (AFMs) as a result of both spin and orbital transport effect of magnons. We first show that an applied temperature gradient in a 2D collinear honeycomb AFM gives rise to accumulations of magnons having both orbital moment and spin moment at the edges of the 2D…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 11 pages, 5 figures

  27. arXiv:2407.14560  [pdf, other]

    cs.LG cs.AI cs.AR

    Automated and Holistic Co-design of Neural Networks and ASICs for Enabling In-Pixel Intelligence

    Authors: Shubha R. Kharel, Prashansa Mukim, Piotr Maj, Grzegorz W. Deptuch, Shinjae Yoo, Yihui Ren, Soumyajit Mandal

    Abstract: Extreme edge-AI systems, such as those in readout ASICs for radiation detection, must operate under stringent hardware constraints such as micron-level dimensions, sub-milliwatt power, and nanosecond-scale speed while providing clear accuracy advantages over traditional architectures. Finding ideal solutions means identifying optimal AI and ASIC design choices from a design space that has explosiv…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 18 pages, 17 figures

  28. arXiv:2407.14239  [pdf, other]

    cs.AI

    KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models

    Authors: Kemou Jiang, Xuan Cai, Zhiyong Cui, Aoyong Li, Yilong Ren, Haiyang Yu, Hao Yang, Daocheng Fu, Licheng Wen, Pinlong Cai

    Abstract: Large language models (LLMs) as autonomous agents offer a novel avenue for tackling real-world challenges through a knowledge-driven manner. These LLM-enhanced methodologies excel in generalization and interpretability. However, the complexity of driving tasks often necessitates the collaboration of multiple, heterogeneous agents, underscoring the need for such LLM-driven agents to engage in coope…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 13 pages, 18 figures

  29. arXiv:2407.13985  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Cluster Sliding Ferroelectricity in Trilayer Quasi-Hexagonal C60

    Authors: Xuefei Wang, Yanhan Ren, Shi Qiu, Fan Zhang, Xueao Li, Junfeng Gao, Weiwei Gao, Jijun Zhao

    Abstract: Electric polarization typically originates from non-centrosymmetric charge distributions. Since chemical bonds between atoms of the same elements favor centrosymmetric crystal structures and symmetrically distributed electron charges, elemental ferroelectrics are extremely rare. In comparison to atoms, elemental clusters are less symmetric and typically have various preferred orientations in cryst…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 5 figures

  30. arXiv:2407.13637  [pdf]

    q-bio.QM

    Autonomous self-evolving research on biomedical data: the DREAM paradigm

    Authors: Luojia Deng, Yijie Wu, Yongyong Ren, Hui Lu

    Abstract: In contemporary biomedical research, the efficiency of data-driven approaches is hindered by large data volumes, tool selection complexity, and human resource limitations, necessitating the development of fully autonomous research systems to meet complex analytical needs. Such a system should include the ability to autonomously generate research questions, write analytical code, configure the comp…

    Submitted 10 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: 11 pages, 4 figures, content added, typos in figure corrected, references revised and font changed

  31. arXiv:2407.13108  [pdf, other]

    cs.CV

    UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt

    Authors: Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen

    Abstract: Compressed Image Super-resolution (CSR) aims to simultaneously super-resolve the compressed images and tackle the challenging hybrid distortions caused by compression. However, existing works on CSR usually focus on a single compression codec, i.e., JPEG, ignoring the diverse traditional or learning-based codecs in the practical application, e.g., HEVC, VVC, HIFIC, etc. In this work, we propose…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  32. arXiv:2407.11367  [pdf, other]

    quant-ph

    Enhancement of nonclassical properties of two-mode squeezed vacuum state with postselected von Neumann measurement

    Authors: Janarbek Yuanbek, Yi-Fang Ren, Ahmad Abliz, Yusuf Turek

    Abstract: We investigate the effects of weak value amplification on the nonclassical properties of two-mode squeezing vacuum state. To show the advantages of the two-mode squeezing vacuum state based post-selective weak measurements.

    Submitted 16 July, 2024; originally announced July 2024.

  33. arXiv:2407.10833  [pdf, other]

    eess.IV cs.CV

    MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration

    Authors: Yulin Ren, Xin Li, Bingchen Li, Xingrui Wang, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen

    Abstract: We present MoE-DiffIR, an innovative universal compressed image restoration (CIR) method with task-customized diffusion priors. This intends to handle two pivotal challenges in the existing CIR methods: (i) lacking adaptability and universality for different image codecs, e.g., JPEG and WebP; (ii) poor texture generation capability, particularly at low bitrates. Specifically, our MoE-DiffIR develo…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  34. arXiv:2407.10490  [pdf, other]

    cs.LG cs.AI cs.CL

    Learning Dynamics of LLM Finetuning

    Authors: Yi Ren, Danica J. Sutherland

    Abstract: Learning dynamics, which describes how the learning of specific training examples influences the model's prediction of other examples, give us a powerful tool for understanding the behavior of deep learning systems. We study the learning dynamics of large language models during finetuning, by analyzing the step-wise decomposition and accumulated influence among different responses. Our framework a…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 32 pages

  35. arXiv:2407.09833  [pdf, other]

    cs.CV

    LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment

    Authors: Yiming Ren, Xiao Han, Yichen Yao, Xiaoxiao Long, Yujing Sun, Yuexin Ma

    Abstract: LiDAR-based human motion capture has garnered significant interest in recent years for its practicability in large-scale and unconstrained environments. However, most methods rely on cleanly segmented human point clouds as input, the accuracy and smoothness of their motion results are compromised when faced with noisy data, rendering them unsuitable for practical applications. To address these lim…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  36. arXiv:2407.09697  [pdf, other]

    cs.CV

    Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion

    Authors: Shiqi Tan, Hamidreza Fazlali, Yixuan Xu, Yuan Ren, Bingbing Liu

    Abstract: Range-View(RV)-based 3D point cloud segmentation is widely adopted due to its compact data form. However, RV-based methods fall short in providing robust segmentation for the occluded points and suffer from distortion of projected RGB images due to the sparse nature of 3D point clouds. To alleviate these problems, we propose a new LiDAR and Camera Range-view-based 3D point cloud semantic segmentat…

    Submitted 12 July, 2024; originally announced July 2024.

  37. arXiv:2407.09361  [pdf, ps, other]

    cond-mat.mes-hall

    Nonreciprocal phonons in PT-symmetric antiferromagnet

    Authors: Yafei Ren, Daniyar Saparov, Qian Niu

    Abstract: Phonon nonreciprocity, indicating different transport properties along opposite directions, has been observed in experiments under a magnetic field. We show that nonreciprocal acoustic phonons can also exist without a magnetic field nor net magnetization. We focus on PT symmetric antiferromagnets that break both time-reversal T and inversion symmetry P. We identify crucial contributions in phenome…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 5 pages, 2 figures

  38. arXiv:2407.08239  [pdf, other]

    cs.SD cs.LG eess.AS

    An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio

    Authors: Siding Zeng, Jiangyan Yi, Jianhua Tao, Yujie Chen, Shan Liang, Yong Ren, Xiaohui Zhang

    Abstract: When the task of locating manipulation regions in partially-fake audio (PFA) involves cross-domain datasets, the performance of deep learning models drops significantly due to the shift between the source and target domains. To address this issue, existing approaches often employ data augmentation before training. However, they overlook the characteristics in target domain that are absent in sourc…

    Submitted 11 July, 2024; originally announced July 2024.

  39. arXiv:2407.07464  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Video-to-Audio Generation with Hidden Alignment

    Authors: Manjie Xu, Chenxing Li, Yong Ren, Rilin Chen, Yu Gu, Wei Liang, Dong Yu

    Abstract: Generating semantically and temporally aligned audio content in accordance with video input has become a focal point for researchers, particularly following the remarkable breakthrough in text-to-video generation. In this work, we aim to offer insights into the video-to-audio generation paradigm, focusing on three crucial aspects: vision encoders, auxiliary embeddings, and data augmentation techni…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: https://sites.google.com/view/vta-ldm

  40. arXiv:2407.06516  [pdf, other]

    cs.CV

    VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving

    Authors: Yibo Liu, Zheyuan Yang, Guile Wu, Yuan Ren, Kejian Lin, Bingbing Liu, Yang Liu, Jinjun Shan

    Abstract: Generating 3D vehicle assets from in-the-wild observations is crucial to autonomous driving. Existing image-to-3D methods cannot well address this problem because they learn generation merely from image RGB information without a deeper understanding of in-the-wild vehicles (such as car models, manufacturers, etc.). This leads to their poor zero-shot prediction capability to handle real-world obser…

    Submitted 10 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  41. arXiv:2407.05349  [pdf]

    cond-mat.mtrl-sci

    Stable room-temperature multiferroic skyrmions in lithium niobate with enhanced Pockels effect

    Authors: Yalong Yu, Bo Xiong, Siqi Wu, Yekai Ren, Nuo Chen, Qingjiao Mi, Kangping Lou, Rui Wang, Tao Chu

    Abstract: Lithium Niobate (LN) is a ferroelectric material with exceptional electrical characteristics, including high piezoelectricity, high Pockels effect, etc. These properties make it a promising platform for numerous fields such as high-speed communication, optical computation, and quantum information processing. Besides these, the introduction of magnetic structures to LN holds significant potential t…

    Submitted 7 July, 2024; originally announced July 2024.

    Report number: submit/5797581

  42. arXiv:2407.05089  [pdf, other]

    stat.ME stat.AP

    Bayesian network-guided sparse regression with flexible varying effects

    Authors: Yangfan Ren, Christine B. Peterson, Marina Vannucci

    Abstract: In this paper, we propose Varying Effects Regression with Graph Estimation (VERGE), a novel Bayesian method for feature selection in regression. Our model has key aspects that allow it to leverage the complex structure of data sets arising from genomics or imaging studies. We distinguish between the predictors, which are the features utilized in the outcome prediction model, and the subject-level…

    Submitted 6 July, 2024; originally announced July 2024.

  43. arXiv:2407.04575  [pdf, other]

    eess.AS

    FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder

    Authors: Rubing Shen, Yanzhen Ren, Zongkun Sun

    Abstract: Generative adversarial network (GAN) based vocoders have achieved significant attention in speech synthesis with high quality and fast inference speed. However, there still exist many noticeable spectral artifacts, resulting in the quality decline of synthesized speech. In this work, we adopt a novel GAN-based vocoder designed for few artifacts and high fidelity, called FA-GAN. To suppress the ali…

    Submitted 5 July, 2024; originally announced July 2024.

  44. arXiv:2407.04216  [pdf, other]

    cs.RO

    Safe MPC Alignment with Human Directional Feedback

    Authors: Zhixian Xie, Wenlong Zhang, Yi Ren, Zhaoran Wang, George J. Pappas, Wanxin Jin

    Abstract: In safety-critical robot planning or control, manually specifying safety constraints or learning them from demonstrations can be challenging. In this paper, we propose a certifiable alignment method for a robot to learn a safety constraint in its model predictive control (MPC) policy with human online directional feedback. To our knowledge, it is the first method to learn safety constraints from h…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 18 pages, submission to T-RO

  45. arXiv:2407.03000  [pdf, other]

    cs.CL cs.CV

    VIVA: A Benchmark for Vision-Grounded Decision-Making with Human Values

    Authors: Zhe Hu, Yixiao Ren, Jing Li, Yu Yin

    Abstract: This paper introduces VIVA, a benchmark for VIsion-grounded decision-making driven by human VAlues. While most large vision-language models (VLMs) focus on physical-level skills, our work is the first to examine their multimodal capabilities in leveraging human values to make decisions under a vision-depicted situation. VIVA contains 1,062 images depicting diverse real-world situations and the man…

    Submitted 3 July, 2024; originally announced July 2024.

  46. arXiv:2407.02839  [pdf, other]

    cs.IR cs.AI

    CRUISE on Quantum Computing for Feature Selection in Recommender Systems

    Authors: Jiayang Niu, Jie Li, Ke Deng, Yongli Ren

    Abstract: Using Quantum Computers to solve problems in Recommender Systems that classical computers cannot address is a worthwhile research topic. In this paper, we use Quantum Annealers to address the feature selection problem in recommendation algorithms. This feature selection problem is a Quadratic Unconstrained Binary Optimization (QUBO) problem. By incorporating Counterfactual Analysis, we significantl…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accepted by QuantumCLEF 2024

  47. arXiv:2407.02598  [pdf, other]

    cs.CV cs.AI

    AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction

    Authors: Mustafa Khan, Hamidreza Fazlali, Dhruv Sharma, Tongtong Cao, Dongfeng Bai, Yuan Ren, Bingbing Liu

    Abstract: Realistic scene reconstruction and view synthesis are essential for advancing autonomous driving systems by simulating safety-critical scenarios. 3D Gaussian Splatting excels in real-time rendering and static scene reconstructions but struggles with modeling driving scenarios due to complex backgrounds, dynamic objects, and sparse views. We propose AutoSplat, a framework employing Gaussian splatti…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  48. arXiv:2407.01816  [pdf, ps, other]

    math.PR

    Asymptotic behaviors of subcritical branching killed Brownian motion with drift

    Authors: Haojie Hou, Yan-Xia Ren, Renming Song, Yaping Zhu

    Abstract: In this paper, we study asymptotic behaviors of a subcritical branching killed Brownian motion with drift $-ρ$ and offspring distribution $\{p_k:k\ge 0\}$. Let $\widetilde{ζ}^{-ρ}$ be the extinction time of this subcritical branching killed Brownian motion, $\widetilde{M}_t^{-ρ}$ the maximal position of all the particles alive at time $t$ and $\widetilde{M}^{-ρ}:=\max_{t\ge 0}\widetilde{M}_t^{-ρ}$ t…

    Submitted 3 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  49. arXiv:2407.00167  [pdf, other]

    cs.CL cs.AI cs.ET cs.HC cs.SI

    Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach

    Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Wyatt Bellamy, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

    Abstract: In recent years, the United States has witnessed a significant surge in the popularity of vaping or e-cigarette use, leading to a notable rise in cases of e-cigarette and vaping use-associated lung injury (EVALI) that caused hospitalizations and fatalities during the EVALI outbreak in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. Due…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Accepted for the AI Applications in Public Health and Social Services workshop at the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

  50. arXiv:2407.00072  [pdf, other]

    cs.IR cs.CL

    Pistis-RAG: A Scalable Cascading Framework Towards Trustworthy Retrieval-Augmented Generation

    Authors: Yu Bai, Yukai Miao, Li Chen, Dan Li, Yanyu Ren, Hongtao Xie, Ce Yang, Xuhui Cai

    Abstract: In Greek mythology, Pistis symbolized good faith, trust, and reliability. Drawing inspiration from these principles, Pistis-RAG is a scalable multi-stage framework designed to address the challenges of large-scale retrieval-augmented generation (RAG) systems. This framework consists of distinct stages: matching, pre-ranking, ranking, reasoning, and aggregating. Each stage contributes to narrowing…

    Submitted 1 August, 2024; v1 submitted 21 June, 2024; originally announced July 2024.