Showing 1–50 of 330 results for author: Yao, C

  1. arXiv:2408.14805  [pdf, other]

    cs.CV

    Platypus: A Generalized Specialist Model for Reading Text in Various Forms

    Authors: Peng Wang, Zhaohai Li, Jun Tang, Humen Zhong, Fei Huang, Zhibo Yang, Cong Yao

    Abstract: Reading text from images (either natural scenes or documents) has been a long-standing research topic for decades, due to the high technical challenge and wide application range. Previously, individual specialist models were developed to tackle the sub-tasks of text reading (e.g., scene text recognition, handwritten text recognition and mathematical expression recognition). However, such specialist…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: Accepted by ECCV 2024

  2. arXiv:2408.06150  [pdf, other]

    cs.CL physics.chem-ph q-bio.BM

    LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

    Authors: Tianhao Yu, Cai Yao, Zhuorui Sun, Feng Shi, Lin Zhang, Kangjie Lyu, Xuan Bai, Andong Liu, Xicheng Zhang, Jiali Zou, Wenshou Wang, Chris Lai, Kai Wang

    Abstract: In this study, we generate and maintain a database of 10 million virtual lipids through METiS's in-house de novo lipid generation algorithms and lipid virtual screening techniques. These virtual lipids serve as a corpus for pre-training, lipid representation learning, and downstream task knowledge transfer, culminating in state-of-the-art LNP property prediction performance. We propose LipidBERT,…

    Submitted 19 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

  3. arXiv:2408.04184  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Exotic thermoelectric properties of coronene-cyclobutadienoid graphene nanoribbons

    Authors: C. Yao, Chen Kong, H. F. Feng, Y. Dong, L. Huang, X. Zhang, Z. X. Song, Zhi-Xin Guo

    Abstract: Thermoelectric materials traditionally incorporate heavy metals to achieve low lattice thermal conductivity. However, elements such as Te, Bi, and Pb are costly and pose environmental hazards. In this study, we introduce a novel design strategy for thermoelectric materials, focusing on room-temperature, light-element, and high-ZT materials such as coronene-cyclobutadienoid graphene nanoribbons (co…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 6 Figures

  4. arXiv:2407.18172  [pdf]

    physics.optics

    Chip-scale sensor for spectroscopic metrology

    Authors: Chunhui Yao, Wanlu Zhang, Peng Bao, Jie Ma, Wei Zhuo, Minjia Chen, Zhitian Shi, Jingwen Zhou, Yuxiao Ye, Liang Ming, Ting Yan, Richard Penty, Qixiang Cheng

    Abstract: Miniaturized spectrometers hold great promise for in situ, in vitro, and even in vivo sensing applications. However, their size reduction imposes vital performance constraints in meeting the rigorous demands of spectroscopy, including fine resolution, high accuracy, and ultra-wide observation window. The prevailing view in the community holds that miniaturized spectrometers are most suitable for t…

    Submitted 12 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

  5. arXiv:2407.17470  [pdf, other]

    cs.CV

    SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

    Authors: Yiming Xie, Chun-Han Yao, Vikram Voleti, Huaizu Jiang, Varun Jampani

    Abstract: We present Stable Video 4D (SV4D), a latent video diffusion model for multi-frame and multi-view consistent dynamic 3D content generation. Unlike previous methods that rely on separately trained generative models for video generation and novel view synthesis, we design a unified diffusion model to generate novel view videos of dynamic 3D objects. Specifically, given a monocular reference video, SV…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Project page: https://sv4d.github.io/

  6. arXiv:2407.15502  [pdf, other]

    cs.CV

    WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation

    Authors: Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao

    Abstract: In the era of content creation revolution propelled by advancements in generative models, the field of web design remains unexplored despite its critical role in modern digital communication. The web design process is complex and often time-consuming, especially for those with limited expertise. In this paper, we introduce Web Rendering Parameters Generation (WebRPG), a new task that aims at autom…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Accepted at ECCV 2024. The dataset and code can be accessed at https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/WebRPG

  7. arXiv:2407.14526  [pdf, other]

    math.NT

    A Random Matrix Model for a Family of Cusp Forms

    Authors: Owen Barrett, Zoë X. Batterman, Aditya Jambhale, Steven J. Miller, Akash L. Narayanan, Kishan Sharma, Chris Yao

    Abstract: The Katz-Sarnak philosophy states that statistics of zeros of $L$-function families near the central point as the conductors tend to infinity agree with those of eigenvalues of random matrix ensembles as the matrix size tends to infinity. While numerous results support this conjecture, S. J. Miller observed that for finite conductors, very different behavior can occur for zeros near the central po…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 55 pages, 13 figures. arXiv admin note: substantial text overlap with arXiv:2402.06641

    MSC Class: 11M26; 11M50; 15B52; 15B10

  8. arXiv:2407.14138  [pdf, other]

    cs.CV

    Visual Text Generation in the Wild

    Authors: Yuanzhi Zhu, Jiawei Liu, Feiyu Gao, Wenyu Liu, Xinggang Wang, Peng Wang, Fei Huang, Cong Yao, Zhibo Yang

    Abstract: Recently, with the rapid advancements of generative models, the field of visual text generation has witnessed significant progress. However, it is still challenging to render high-quality text images in real-world scenarios, as three critical criteria should be satisfied: (1) Fidelity: the generated text images should be photo-realistic and the contents are expected to be the same as specified in…

    Submitted 19 July, 2024; originally announced July 2024.

  9. arXiv:2407.12358  [pdf, other]

    cs.CV cs.CL

    ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data

    Authors: Yufan Shen, Chuwei Luo, Zhaoqing Zhu, Yang Chen, Qi Zheng, Zhi Yu, Jiajun Bu, Cong Yao

    Abstract: Recently, large language models (LLMs) and multimodal large language models (MLLMs) have demonstrated promising results on document visual question answering (VQA) task, particularly after training on document instruction datasets. An effective evaluation method for document instruction data is crucial in constructing instruction data with high efficacy, which, in turn, facilitates the training of…

    Submitted 17 July, 2024; originally announced July 2024.

  10. arXiv:2407.11950  [pdf, other]

    cs.CV

    Temporally Consistent Stereo Matching

    Authors: Jiaxi Zeng, Chengtang Yao, Yuwei Wu, Yunde Jia

    Abstract: Stereo matching provides depth estimation from binocular images for downstream applications. These applications mostly take video streams as input and require temporally consistent depth maps. However, existing methods mainly focus on the estimation at the single-frame level. This commonly leads to temporally inconsistent results, especially in ill-posed regions. In this paper, we aim to leverage…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  11. arXiv:2406.17255  [pdf, other]

    cs.CL

    MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning

    Authors: Zhenlong Dai, Chang Yao, WenKang Han, Ying Yuan, Zhipeng Gao, Jingyuan Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential for assisting developers in their daily development. However, most research focuses on generating correct code; how to use LLMs to generate personalized code has seldom been investigated. To bridge this gap, we propose MPCoder (Multi-user Personalized Code Generator) to generate personalized code for multiple users. To better learn co…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024, Main Conference

  12. arXiv:2406.11382  [pdf, other]

    hep-ph hep-ex

    Baryon-number-violating nucleon decays in ALP effective field theories

    Authors: Tong Li, Michael A. Schmidt, Chang-Yuan Yao

    Abstract: The search for baryon-number-violating (BNV) nucleon decay is an intriguing probe of new physics beyond the SM in future neutrino experiments with enhanced sensitivity. The dark sector states such as an axion or axion-like particle (ALP) can induce nucleon decays with distinct signature and kinematics from the conventional nucleon decays. In this work, we study the ALP effective field theories (EF…

    Submitted 16 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 31 pages, 4 figures, 7 tables. version accepted for publication in JHEP

    Report number: CPPC-2024-05, DESY-24-082

  13. arXiv:2406.06062  [pdf, other]

    cs.CV cs.AI

    ProcessPainter: Learn Painting Process from Sequence Data

    Authors: Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou

    Abstract: The painting process of artists is inherently stepwise and varies significantly among different painters and styles. Generating detailed, step-by-step painting processes is essential for art education and research, yet remains largely underexplored. Traditional stroke-based rendering methods break down images into sequences of brushstrokes, yet they fall short of replicating the authentic processe…

    Submitted 20 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  14. arXiv:2406.03639  [pdf, ps, other]

    math.DG math-ph math.SG

    Gravitating vortices and Symplectic Reduction by Stages

    Authors: L. Álvarez-Cónsul, M. Garcia-Fernandez, O. García-Prada, V. P. Pingali, C. -J. Yao

    Abstract: We undertake a novel approach to the existence problem for gravitating vortices on a Riemann surface based on symplectic reduction by stages, which seems to be new in the PDE as well as the gauge theory literature. The main technical tool for our study is the reduced $α$-K-energy, for which we establish convexity properties by means of finite-energy pluripotential theory, as recently applied to th…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 48 pages, no figures, comments are welcome

    MSC Class: Primary 53C07; Secondary 53D20; 53C25

  15. arXiv:2406.02430  [pdf, other]

    eess.AS cs.SD

    Seed-TTS: A Family of High-Quality Versatile Speech Generation Models

    Authors: Philip Anastassiou, Jiawei Chen, Jitong Chen, Yuanzhe Chen, Zhuo Chen, Ziyi Chen, Jian Cong, Lelai Deng, Chuang Ding, Lu Gao, Mingqing Gong, Peisong Huang, Qingqing Huang, Zhiying Huang, Yuanyuan Huo, Dongya Jia, Chumin Li, Feiya Li, Hui Li, Jiaxin Li, Xiaoyang Li, Xingxing Li, Lin Liu, Shouda Liu, Sichao Liu, et al. (21 additional authors not shown)

    Abstract: We introduce Seed-TTS, a family of large-scale autoregressive text-to-speech (TTS) models capable of generating speech that is virtually indistinguishable from human speech. Seed-TTS serves as a foundation model for speech generation and excels in speech in-context learning, achieving performance in speaker similarity and naturalness that matches ground truth human speech in both objective and sub…

    Submitted 4 June, 2024; originally announced June 2024.

  16. arXiv:2406.00094  [pdf, other]

    hep-ph

    The flavor invariants of the $ν$SM

    Authors: Christophe Grojean, Jonathan Kley, Damien Leflot, Chang-Yuan Yao

    Abstract: Sixty years after the experimental discovery of CP violation in the quark sector, the existence of a similar CP violation in the lepton sector is still to be established. Actually, the structure of such a violation depends crucially on the origin of the neutrino masses. In an attempt at categorizing the leptonic sources of CP violation, we studied the $ν$SM, the Standard Model extended with three…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 27 pages + appendices, 3 figures

    Report number: CERN-TH-2024-076, DESY-24-021, HU-EP-24/14

  17. arXiv:2405.18458  [pdf]

    cs.LG physics.optics

    Asymmetrical estimator for training encapsulated deep photonic neural networks

    Authors: Yizhi Wang, Minjia Chen, Chunhui Yao, Jie Ma, Ting Yan, Richard Penty, Qixiang Cheng

    Abstract: Scalable isomorphic physical neural networks (PNNs) are emerging NN acceleration paradigms for their high-bandwidth, in-propagation computation. Although backpropagation (BP)-based training is often the industry standard for its robustness and fast gradient convergence, existing BP-PNN training methods need to truncate the propagation of analogue signal at each layer and acquire accurate hidden ne…

    Submitted 15 August, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 21 pages, 6 figures

    MSC Class: 78-05

  18. arXiv:2405.14336  [pdf, other]

    eess.IV

    I$^2$VC: A Unified Framework for Intra- & Inter-frame Video Compression

    Authors: Meiqin Liu, Chenming Xu, Yukai Gu, Chao Yao, Yao Zhao

    Abstract: Video compression aims to reconstruct seamless frames by encoding the motion and residual information from existing frames. Previous neural video compression methods necessitate distinct codecs for three types of frames (I-frame, P-frame and B-frame), which hinders a unified approach and generalization across different video contexts. Intra-codec techniques lack the advanced Motion Estimation and…

    Submitted 1 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 19 pages, 10 figures

  19. arXiv:2405.02313  [pdf, ps, other]

    physics.flu-dyn

    Physics-informed Data-driven Cavitation Model for a Specific MG EOS

    Authors: Minsheng Huang, Chengbao Yao, Pan Wang, Lidong Cheng, Wenjun Ying

    Abstract: We present a novel one-fluid cavitation model of a specific Mie-Grüneisen equation of state (EOS), named polynomial EOS, based on an artificial neural network. Not only the physics-informed equation but also the experimental data are embedded into the proposed model by an optimization problem. The physics-informed data-driven model provides the concerned pressure within the cavitation region, where…

    Submitted 5 April, 2024; originally announced May 2024.

    Comments: 29 pages, 18 figures

  20. arXiv:2405.00277  [pdf, other]

    quant-ph

    The strong-coupling quantum thermodynamics of quantum Brownian motion based on the exact solution of its reduced density matrix

    Authors: Chuan-Zhe Yao, Wei-Min Zhang

    Abstract: We derive the quantum thermodynamics of quantum Brownian motion from the exact solution of its reduced density matrix. We start from the total equilibrium thermal state between the Brownian particle and its reservoir, and solve analytically and exactly the reduced density matrix of the system by taking the partial trace over all the reservoir states. We find that the reduced Hamiltonian and the re…

    Submitted 5 July, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: 21 pages, 5 figures

  21. arXiv:2404.15016  [pdf, ps, other]

    math.DG math.SG

    Convergence of the hypersymplectic flow on $T^4$ with $T^3$-symmetry

    Authors: Joel Fine, Weiyong He, Chengjian Yao

    Abstract: A hypersymplectic structure on a 4-manifold is a triple $ω_1, ω_2, ω_3$ of 2-forms for which every non-trivial linear combination $a^1ω_1 + a^2 ω_2 + a^3 ω_3$ is a symplectic form. Donaldson has conjectured that when the underlying manifold is compact, any such structure is isotopic in its cohomology class to a hyperkähler triple. We prove this conjecture for a hypersymplectic structure on $T^4$ wh…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 25 pages

    MSC Class: 58J35; 53C26; 53D05

  22. arXiv:2404.13600  [pdf, other]

    cs.RO

    Are We Ready for Planetary Exploration Robots? The TAIL-Plus Dataset for SLAM in Granular Environments

    Authors: Zirui Wang, Chen Yao, Yangtao Ge, Guowei Shi, Ningbo Yang, Zheng Zhu, Kewei Dong, Hexiang Wei, Zhenzhong Jia, Jing Wu

    Abstract: So far, planetary surface exploration depends on various mobile robot platforms. The autonomous navigation and decision-making of these mobile robots in complex terrains largely rely on their terrain-aware perception, localization and mapping capabilities. In this paper we release the TAIL-Plus dataset, a new challenging dataset in deformable granular environments for planetary exploration robots,…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

  23. arXiv:2404.09986  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Thermal conversion of ultrathin nickel hydroxide for wide bandgap 2D nickel oxides

    Authors: Lu Ping, Nicholas Russo, Zifan Wang, Ching-Hsiang Yao, Kevin E. Smith, Xi Ling

    Abstract: Wide bandgap (WBG) semiconductors (Eg > 2.0 eV) are integral to the advancement of next generation electronics, optoelectronics, and power industries, owing to their capability for high temperature operation, high breakdown voltage and efficient light emission. Enhanced power efficiency and functional performance can be attained through miniaturization, specifically via the integration of device fa…

    Submitted 15 April, 2024; originally announced April 2024.

  24. arXiv:2404.06853  [pdf, other]

    cond-mat.mtrl-sci

    Revealing mechanism of pore defect formation in laser directed energy deposition of aluminum alloy via in-situ synchrotron X-ray imaging

    Authors: Wei Liu, Yuxiao Li, Chunxia Yao, Dongsheng Zhang, Darui Sun, Sen Chen, Yu Wu, Jun Wang, Lei Lud, Sheng-Nian Luo, Ye Tao, Bingbing Zhang

    Abstract: Laser metal additive manufacturing technology is capable of producing components with complex geometries and compositions that cannot be realized by conventional manufacturing methods. However, a large number of pores generated during the additive manufacturing process greatly affect the mechanical properties of the additively manufactured parts, and the mechanism of such pore generation has not b…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 7 figures

  25. arXiv:2404.05225  [pdf, other]

    cs.CV cs.CL

    LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding

    Authors: Chuwei Luo, Yufan Shen, Zhaoqing Zhu, Qi Zheng, Zhi Yu, Cong Yao

    Abstract: Recently, leveraging large language models (LLMs) or multimodal large language models (MLLMs) for document understanding has been proven very promising. However, previous works that employ LLMs/MLLMs for document understanding have not fully explored and utilized the document layout information, which is vital for precise document understanding. In this paper, we propose LayoutLLM, an LLM/MLLM bas…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  26. arXiv:2403.19128  [pdf, other]

    cs.CV

    OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

    Authors: Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang

    Abstract: Recently, visually-situated text parsing (VsTP) has experienced notable advancements, driven by the increasing demand for automated document understanding and the emergence of Generative Large Language Models (LLMs) capable of processing document-based questions. Various methods have been proposed to address the challenging problem of VsTP. However, due to the diversified targets and heterogeneous…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  27. arXiv:2403.17842  [pdf, other]

    quant-ph cond-mat.str-el

    Experimental Realization of Discrete Time Quasi-Crystals

    Authors: Guanghui He, Bingtian Ye, Ruotian Gong, Changyu Yao, Zhongyuan Liu, Kater W. Murch, Norman Y. Yao, Chong Zu

    Abstract: Floquet (periodically driven) systems can give rise to unique non-equilibrium phases of matter without equilibrium analogs. The most prominent example is the realization of discrete time crystals. An intriguing question emerges: what other novel phases can manifest when the constraint of time periodicity is relaxed? In this study, we explore quantum systems subjected to a quasi-periodic drive. Lev…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 7+5 pages, 4+5 figures

  28. arXiv:2403.16875  [pdf, other]

    cs.RO

    TAIL: A Terrain-Aware Multi-Modal SLAM Dataset for Robot Locomotion in Deformable Granular Environments

    Authors: Chen Yao, Yangtao Ge, Guowei Shi, Zirui Wang, Ningbo Yang, Zheng Zhu, Hexiang Wei, Yuntian Zhao, Jing Wu, Zhenzhong Jia

    Abstract: Terrain-aware perception holds the potential to improve the robustness and accuracy of autonomous robot navigation in the wild, thereby facilitating effective off-road traversals. However, the lack of multi-modal perception across various motion patterns hinders the solutions of Simultaneous Localization And Mapping (SLAM), especially when confronting non-geometric hazards in demanding landscapes…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE Robotics and Automation Letters

  29. arXiv:2403.16662  [pdf, other]

    cs.CL

    RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict

    Authors: Yirong Zeng, Xiao Ding, Yi Zhao, Xiangyu Li, Jie Zhang, Chao Yao, Ting Liu, Bing Qin

    Abstract: Fact-checking is the task of verifying the factuality of a given claim by examining the available evidence. High-quality evidence plays a vital role in enhancing fact-checking systems and facilitating the generation of explanations that are understandable to humans. However, the provision of both sufficient and relevant evidence for explainable fact-checking systems poses a challenge. To tackle th…

    Submitted 26 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures, accepted by LREC-COLING 2024

  30. arXiv:2403.14023  [pdf]

    cs.CR

    A system capable of verifiably and privately screening global DNA synthesis

    Authors: Carsten Baum, Jens Berlips, Walther Chen, Hongrui Cui, Ivan Damgard, Jiangbin Dong, Kevin M. Esvelt, Mingyu Gao, Dana Gretton, Leonard Foner, Martin Kysel, Kaiyi Zhang, Juanru Li, Xiang Li, Omer Paneth, Ronald L. Rivest, Francesca Sage-Ling, Adi Shamir, Yue Shen, Meicen Sun, Vinod Vaikuntanathan, Lynn Van Hauwe, Theia Vogel, Benjamin Weinstein-Raun, Yun Wang, et al. (5 additional authors not shown)

    Abstract: Printing custom DNA sequences is essential to scientific and biomedical research, but the technology can be used to manufacture plagues as well as cures. Just as ink printers recognize and reject attempts to counterfeit money, DNA synthesizers and assemblers should deny unauthorized requests to make viral DNA that could be used to ignite a pandemic. There are three complications. First, we don't n…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Main text 10 pages, 4 figures. 5 supplementary figures. Total 21 pages. Direct correspondence to: Ivan B. Damgard, Andrew C. Yao, Kevin M. Esvelt

  31. arXiv:2403.13761  [pdf, other]

    cs.CV

    HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition

    Authors: Yuyi Zhang, Yuanzhi Zhu, Dezhi Peng, Peirong Zhang, Zhenhua Yang, Zhibo Yang, Cong Yao, Lianwen Jin

    Abstract: Text recognition, especially for complex scripts like Chinese, faces unique challenges due to its intricate character structures and vast vocabulary. Traditional one-hot encoding methods struggle with the representation of hierarchical radicals, recognition of Out-Of-Vocabulary (OOV) characters, and on-device deployment due to their computational intensity. To address these challenges, we propose…

    Submitted 20 March, 2024; originally announced March 2024.

  32. arXiv:2403.13065  [pdf, other]

    hep-ph hep-th

    Aligned Yet Large Dipoles: a SMEFT Study

    Authors: Quentin Bonnefoy, Jonathan Kley, Di Liu, Alejo N. Rossia, Chang-Yuan Yao

    Abstract: We study a non-universal flavor scenario at the level of the Standard Model Effective Field Theory, according to which the matrix of Wilson coefficients $c_{uW}$ of an up-type electroweak quark dipole operator is aligned with the up-type Yukawa coupling. Such an alignment usually follows from the assumption of Minimal Flavor Violation (MFV), away from which we step by allowing the entries of…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 35 pages, 6 figures, 7 tables. Comments are welcomed

    Report number: DESY-24-033, HU-EP-24/09, LAPTH-011/24, COMETA-2024-004

  33. arXiv:2403.12008  [pdf, other]

    cs.CV

    SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

    Authors: Vikram Voleti, Chun-Han Yao, Mark Boss, Adam Letts, David Pankratz, Dmitry Tochilkin, Christian Laforte, Robin Rombach, Varun Jampani

    Abstract: We present Stable Video 3D (SV3D) -- a latent video diffusion model for high-resolution, image-to-multi-view generation of orbital videos around a 3D object. Recent work on 3D generation proposes techniques to adapt 2D generative models for novel view synthesis (NVS) and 3D optimization. However, these methods have several disadvantages due to either limited views or inconsistent NVS, thereby affec…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Project page: https://sv3d.github.io/

  34. arXiv:2403.11221  [pdf, other]

    cs.DC cs.DB

    Lion: Minimizing Distributed Transactions through Adaptive Replica Provision (Extended Version)

    Authors: Qiushi Zheng, Zhanhao Zhao, Wei Lu, Chang Yao, Yuxing Chen, Anqun Pan, Xiaoyong Du

    Abstract: Distributed transaction processing often involves multiple rounds of cross-node communications, and therefore tends to be slow. To improve performance, existing approaches convert distributed transactions into single-node transactions by either migrating co-accessed partitions onto the same nodes or establishing a super node housing replicas of the entire database. However, migration-based methods…

    Submitted 17 March, 2024; originally announced March 2024.

  35. arXiv:2403.10357  [pdf, other]

    cs.CV cs.GR

    ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D image

    Authors: Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos, Robert Maier, Ziyan Wang, Chun-Han Yao, Marco Volino, Edmond Boyer, Adrian Hilton, Tony Tung

    Abstract: Recent progress in human shape learning shows that neural implicit models are effective in generating 3D human surfaces from a limited number of views, and even from a single RGB image. However, existing monocular approaches still struggle to recover fine geometric details such as face, hands or cloth wrinkles. They are also easily prone to depth ambiguities that result in distorted geometries alon…

    Submitted 18 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR24; Project page: https://marcopesavento.github.io/ANIM/

  36. arXiv:2403.00496  [pdf]

    physics.optics physics.app-ph

    Benchmarking reconstructive spectrometer with multi-resonant cavities

    Authors: Chunhui Yao, Kangning Xu, Tianhua Lin, Jie Ma, Chumeng Yao, Peng Bao, Zhitian Shi, Richard Penty, Qixiang Cheng

    Abstract: Recent years have seen the rapid development of miniaturized reconstructive spectrometers (RSs), yet they still confront a range of technical challenges, such as bandwidth/resolution ratio, sensing speed, and/or power efficiency. Reported RS designs often suffer from insufficient decorrelation between sampling channels, which results in limited compressive sampling efficiency, in essence, due to i…

    Submitted 1 March, 2024; originally announced March 2024.

  37. arXiv:2402.17232  [pdf, other]

    math.NA cs.LG physics.comp-ph

    Two-scale Neural Networks for Partial Differential Equations with Small Parameters

    Authors: Qiao Zhuang, Chris Ziyi Yao, Zhongqiang Zhang, George Em Karniadakis

    Abstract: We propose a two-scale neural network method for solving partial differential equations (PDEs) with small parameters using physics-informed neural networks (PINNs). We directly incorporate the small parameters into the architecture of neural networks. The proposed method enables solving PDEs with small parameters in a simple fashion, without adding Fourier features or other computationally taxing…

    Submitted 13 August, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    MSC Class: 65N35; 35B25

    ACM Class: I.2.6

  38. arXiv:2402.09152  [pdf, other]

    cs.LG

    Improved Regret for Bandit Convex Optimization with Delayed Feedback

    Authors: Yuanyu Wan, Chang Yao, Mingli Song, Lijun Zhang

    Abstract: We investigate bandit convex optimization (BCO) with delayed feedback, where only the loss value of the action is revealed under an arbitrary delay. Let $n,T,\bar{d}$ denote the dimensionality, time horizon, and average delay, respectively. Previous studies have achieved an $O(\sqrt{n}T^{3/4}+(n\bar{d})^{1/3}T^{2/3})$ regret bound for this problem, whose delay-independent part matches the regret o…

    Submitted 23 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  39. arXiv:2402.07625  [pdf, other]

    cs.CL cs.AI cs.LG

    Autonomous Data Selection with Language Models for Mathematical Texts

    Authors: Yifan Zhang, Yifan Luo, Yang Yuan, Andrew Chi-Chih Yao

    Abstract: To improve language models' proficiency in mathematical reasoning via continual pretraining, we introduce a novel strategy that leverages base language models for autonomous data selection. Departing from conventional supervised fine-tuning or trained classifiers with human-annotated data, our approach Autonomous Data Selection (AutoDS) utilizes meta-prompted language models as zero-shot verifiers…

    Submitted 2 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  40. arXiv:2402.06641  [pdf, other]

    math.NT

    A Survey of a Random Matrix Model for a Family of Cusp Forms

    Authors: Owen Barrett, Zoë X. Batterman, Aditya Jambhale, Steven J. Miller, Akash L. Narayanan, Kishan Sharma, Chris Yao

    Abstract: The Katz-Sarnak philosophy states that statistics of zeros of $L$-function families near the central point as the conductors tend to infinity agree with those of eigenvalues of random matrix ensembles as the matrix size tends to infinity. While numerous results support this conjecture, S. J. Miller observed that for finite conductors, very different behavior can occur for zeros near the central po…

    Submitted 17 April, 2024; v1 submitted 28 January, 2024; originally announced February 2024.

    Comments: 28 pages, 7 figures

    MSC Class: 11M26; 11M50

  41. arXiv:2401.17372  [pdf, other]

    quant-ph physics.bio-ph

    Optically-Trapped Nanodiamond-Relaxometry Detection of Nanomolar Paramagnetic Spins in Aqueous Environments

    Authors: Shiva Iyer, Changyu Yao, Olivia Lazorik, Pengyun Wang, Gianna Glenn, Michael Mohs, Yinyao Shi, Michael Mansour, Erik Henriksen, Kater Murch, Shankar Mukherji, Chong Zu

    Abstract: Probing electrical and magnetic properties in aqueous environments remains a frontier challenge in nanoscale sensing. Our inability to do so with quantitative accuracy imposes severe limitations, for example, on our understanding of the ionic environments in a diverse array of systems, ranging from novel materials to the living cell. The Nitrogen-Vacancy (NV) center in fluorescent nanodiamonds (FN…

    Submitted 20 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 6+7 pages, 3+8 figures

  42. arXiv:2401.17254  [pdf, other]

    math.NT

    Limiting Behavior in Missing Sums of Sumsets

    Authors: Aditya Jambhale, Rauan Kaldybayev, Steven J. Miller, Chris Yao

    Abstract: We study $|A + A|$ as a random variable, where $A \subseteq \{0, \dots, N\}$ is a random subset such that each $0 \le n \le N$ is included with probability $0 < p < 1$, and where $A + A$ is the set of sums $a + b$ for $a,b$ in $A$. Lazarev, Miller, and O'Bryant studied the distribution of $2N + 1 - |A + A|$, the number of summands not represented in $A + A$ when $p = 1/2$. A recent paper by Chu, K…

    Submitted 1 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 25 pages, 6 figures

    MSC Class: 11P99; 11B13

  43. arXiv:2401.09003  [pdf, other]

    cs.CL cs.AI cs.LG

    Augmenting Math Word Problems via Iterative Question Composing

    Authors: Haoxiong Liu, Yifan Zhang, Yifan Luo, Andrew Chi-Chih Yao

    Abstract: Despite the advancements in large language models (LLMs) for mathematical reasoning, solving competition-level math problems remains a significant challenge, especially for open-source LLMs without external tools. We introduce the MMIQC dataset, comprising a mixture of processed web data and synthetic question-response pairs, aimed at enhancing the mathematical reasoning capabilities of base langu…

    Submitted 10 February, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  44. arXiv:2401.07030  [pdf, ps, other]

    math.AP

    Subsonic Euler flows in a three-dimensional finitely long cylinder with arbitrary cross section

    Authors: Shangkun Weng, Changkui Yao

    Abstract: This paper concerns the well-posedness of subsonic flows in a three-dimensional finitely long cylinder with arbitrary cross section. We establish the existence and uniqueness of subsonic flows in the Sobolev space by prescribing the normal component of the momentum, the vorticity, the entropy, the Bernoulli's quantity at the entrance and the normal component of the momentum at the exit. One of the…

    Submitted 13 January, 2024; originally announced January 2024.

    MSC Class: 35M12; 76G25; 76N10

  45. arXiv:2401.05638  [pdf, other]

    cs.CV

    MatSAM: Efficient Extraction of Microstructures of Materials via Visual Large Model

    Authors: Changtai Li, Xu Han, Chao Yao, Xiaojuan Ban

    Abstract: Efficient and accurate extraction of microstructures in micrographs of materials is essential in process optimization and the exploration of structure-property relationships. Deep learning-based image segmentation techniques that rely on manual annotation are laborious and time-consuming and hardly meet the demand for model transferability and generalization on various source images. Segment Anyth…

    Submitted 2 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 18 pages, 8 figures, and 5 tables. Updated with revision and code repository

  46. arXiv:2401.05412  [pdf, other]

    cs.CV cs.AI eess.SP

    Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics

    Authors: Xueyuan Yang, Chao Yao, Xiaojuan Ban

    Abstract: Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique. Certain methodologies employ sparse Inertial Measurement Units (IMUs) on the human body and harness data-driven strategies to model human poses. However, the reconstruction of motion based solely on sparse IMUs data is inherently fraught with ambiguity, a consequence of numerous identical IMU r…

    Submitted 26 December, 2023; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  47. arXiv:2401.01522  [pdf, other]

    cs.CV

    LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

    Authors: Rujiao Long, Hangdi Xing, Zhibo Yang, Qi Zheng, Zhi Yu, Cong Yao, Fei Huang

    Abstract: Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes or learning to directly generate the corresponding markup sequences from the table images. However, existing approaches either count on additional heuristic rules to recover the table structures, or…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.03730

  48. arXiv:2312.12142  [pdf, other]

    cs.CV cs.AI

    FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

    Authors: Zhenhua Yang, Dezhi Peng, Yuxin Kong, Yuyi Zhang, Cong Yao, Lianwen Jin

    Abstract: Automatic font generation is an imitation task, which aims to create a font library that mimics the style of reference images while preserving the content from source images. Although existing font generation methods have achieved satisfactory performance, they still struggle with complex characters and large style variations. To address these issues, we propose FontDiffuser, a diffusion-based ima…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024; GitHub page: https://github.com/yeungchenwa/FontDiffuser

    Journal ref: 38th AAAI Conference on Artificial Intelligence (AAAI2024), Vancouver, BC, Canada, 2024

  49. arXiv:2312.09613  [pdf, other]

    cs.LG cs.AI stat.ML

    Rethinking Causal Relationships Learning in Graph Neural Networks

    Authors: Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan Jin, Fengge Wu, Changwen Zheng, Huaping Liu

    Abstract: Graph Neural Networks (GNNs) demonstrate their significance by effectively modeling complex interrelationships within graph-structured data. To enhance the credibility and robustness of GNNs, it becomes exceptionally crucial to bolster their ability to capture causal relationships. However, despite recent advancements that have indeed strengthened GNNs from a causal learning perspective, conductin…

    Submitted 15 December, 2023; originally announced December 2023.

  50. arXiv:2312.07823  [pdf, other]

    cs.CV

    Semantic Lens: Instance-Centric Semantic Alignment for Video Super-Resolution

    Authors: Qi Tang, Yao Zhao, Meiqin Liu, Jian Jin, Chao Yao

    Abstract: As a critical clue of video super-resolution (VSR), inter-frame alignment significantly impacts overall performance. However, accurate pixel-level alignment is a challenging task due to the intricate motion interweaving in the video. In response to this issue, we introduce a novel paradigm for VSR named Semantic Lens, predicated on semantic priors drawn from degraded videos. Specifically, video is…

    Submitted 19 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024