
Showing 151–200 of 891 results for author: Dai, Y

  1. arXiv:2311.13127  [pdf, other]

    cs.CV cs.AI cs.CR

    MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

    Authors: Yixin Liu, Chenrui Fan, Yutong Dai, Xun Chen, Pan Zhou, Lichao Sun

    Abstract: Text-to-image diffusion models allow seamless generation of personalized images from scant reference photos. Yet, these tools, in the wrong hands, can fabricate misleading or harmful content, endangering individuals. To address this problem, existing poisoning-based approaches perturb user images in an imperceptible way to render them "unlearnable" from malicious uses. We identify two limitations…

    Submitted 26 April, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Accepted to CVPR 2024 (Oral)

  2. arXiv:2311.11863  [pdf, other]

    cs.CV

    GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding

    Authors: Hao Li, Dingwen Zhang, Yalun Dai, Nian Liu, Lechao Cheng, Jingfeng Li, Jingdong Wang, Junwei Han

    Abstract: Applying NeRF to downstream perception tasks for scene understanding and representation is becoming increasingly popular. Most existing methods treat semantic prediction as an additional rendering task, \textit{i.e.}, the "label rendering" task, to build semantic NeRFs. However, by rendering semantic/instance labels per pixel without considering the contextual information of the rendered image, th…

    Submitted 7 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: CVPR 2024 (Highlight). Project Page: https://lifuguan.github.io/gpnerf-pages/

  3. arXiv:2311.09861  [pdf, other]

    cs.CL cs.AI

    ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology

    Authors: Junlei Zhang, Hongliang He, Nirui Song, Zhanchao Zhou, Shuyuan He, Shuai Zhang, Huachuan Qiu, Anqi Li, Yong Dai, Lizhi Ma, Zhenzhong Lan

    Abstract: The critical field of psychology necessitates a comprehensive benchmark to enhance the evaluation and development of domain-specific Large Language Models (LLMs). Existing MMLU-type benchmarks, such as C-EVAL and CMMLU, include psychology-related subjects, but their limited number of questions and lack of systematic concept sampling strategies mean they cannot cover the concepts required in psycho…

    Submitted 16 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Under Review

  4. arXiv:2311.08045  [pdf, other]

    cs.CL cs.AI cs.LG

    Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game

    Authors: Pengyu Cheng, Yifan Yang, Jian Li, Yong Dai, Tianhao Hu, Peixin Cao, Nan Du, Xiaolong Li

    Abstract: Human preference alignment is essential to improve the interaction quality of large language models (LLMs). Existing alignment methods depend on manually annotated preference data to guide the LLM optimization directions. However, continuously updating LLMs for alignment raises a distribution gap between model-generated samples and human-annotated responses, hindering training effectiveness. To mi…

    Submitted 3 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL2024 findings

  5. arXiv:2311.07869  [pdf]

    quant-ph

    Hybrid GRU-CNN Bilinear Parameters Initialization for Quantum Approximate Optimization Algorithm

    Authors: Zuyu Xu, Pengnian Cai, Kang Sheng, Tao Yang, Yuanming Hu, Yunlai Zhu, Zuheng Wu, Yuehua Dai, Fei Yang

    Abstract: The Quantum Approximate Optimization Algorithm (QAOA), a pivotal paradigm in the realm of variational quantum algorithms (VQAs), offers promising computational advantages for tackling combinatorial optimization problems. Well-defined initial circuit parameters, responsible for preparing a parameterized quantum state encoding the solution, play a key role in optimizing QAOA. However, classical opti…

    Submitted 13 November, 2023; originally announced November 2023.

  6. arXiv:2311.05374  [pdf, other]

    cs.CL cs.AI

    TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs

    Authors: Shuyi Xie, Wenlin Yao, Yong Dai, Shaobo Wang, Donlin Zhou, Lifeng Jin, Xinhua Feng, Pengzhi Wei, Yujie Lin, Zhichao Hu, Dong Yu, Zhengyou Zhang, Jing Nie, Yuhong Liu

    Abstract: Large language models (LLMs) have shown impressive capabilities across various natural language tasks. However, evaluating their alignment with human preferences remains a challenge. To this end, we propose a comprehensive human evaluation framework to assess LLMs' proficiency in following instructions on diverse real-world tasks. We construct a hierarchical task tree encompassing 7 major areas co…

    Submitted 9 November, 2023; originally announced November 2023.

  7. arXiv:2311.03706  [pdf, other]

    math.OC

    Parallelized Conflict Graph Cut Generation

    Authors: Yongzheng Dai, Chen Chen

    Abstract: A conflict graph represents logical relations between binary variables, and effective use of the graph can significantly accelerate branch-and-cut solvers for mixed-integer programming (MIP). In this paper we develop efficient parallel conflict graph management: conflict detection; maximal clique generation; clique extension; and clique merging. We leverage parallel computing in order to intensify…

    Submitted 27 May, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: 19 pages, 2 figures

    MSC Class: 90C10

  8. arXiv:2311.00669  [pdf, ps, other]

    cond-mat.str-el

    The ground-state phase diagram for an alternative anisotropic extension of quantum spin-1 ferromagnetic biquadratic model

    Authors: Yan-Wei Dai, Qian-Qian Shi, Xi-Hao Chen, Huan-Qiang Zhou

    Abstract: The ground-state phase diagram is mapped out for an alternative anisotropic extension of quantum spin-1 ferromagnetic biquadratic model, which accommodates twelve distinct phases: three degenerate fractal phases, six Luttinger liquid phases and three symmetry-protected trivial phases. It is found that distinct types of quantum phase transitions are involved between them. In particular, one type ar…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 11 pages, 8 figures, 2 tables

  9. arXiv:2311.00574  [pdf, ps, other]

    cond-mat.str-el

    An alternative spontaneous symmetry breaking pattern for $\rm{U}(1)$ with no gapless Goldstone mode

    Authors: Huan-Qiang Zhou, Qian-Qian Shi, Yan-Wei Dai

    Abstract: An emergent gapless Goldstone mode originates from continuous spontaneous symmetry breaking, which has become a doctrine since the pioneering work by Goldstone [J. Goldstone, Nuovo Cimento \textbf{19}, 154 (1961)]. However, we argue that it is possible for a continuous symmetry group $\rm{U}(1)$ to make an exceptional case, simply due to the well-known mathematical result that a continuous symmetr…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figures, 9 tables

  10. arXiv:2310.20155  [pdf]

    physics.chem-ph cs.AI

    MLatom 3: Platform for machine learning-enhanced computational chemistry simulations and workflows

    Authors: Pavlo O. Dral, Fuchun Ge, Yi-Fan Hou, Peikun Zheng, Yuxinxin Chen, Mario Barbatti, Olexandr Isayev, Cheng Wang, Bao-Xin Xue, Max Pinheiro Jr, Yuming Su, Yiheng Dai, Yangtao Chen, Lina Zhang, Shuang Zhang, Arif Ullah, Quanhao Zhang, Yanchi Ou

    Abstract: Machine learning (ML) is increasingly becoming a common tool in computational chemistry. At the same time, the rapid development of ML methods requires a flexible software framework for designing custom workflows. MLatom 3 is a program package designed to leverage the power of ML to enhance typical computational chemistry simulations and to create complex workflows. This open-source package provid…

    Submitted 30 October, 2023; originally announced October 2023.

  11. arXiv:2310.18635  [pdf, other]

    cs.HC

    T-PickSeer: Visual Analysis of Taxi Pick-up Point Selection Behavior

    Authors: Shuxian Gu, Yemo Dai, Zezheng Feng, Yong Wang, Haipeng Zeng

    Abstract: Taxi drivers often take much time to navigate the streets to look for passengers, which leads to high vacancy rates and wasted resources. Empty taxi cruising remains a big concern for taxi companies. Analyzing the pick-up point selection behavior can solve this problem effectively, providing suggestions for taxi management and dispatch. Many studies have been devoted to analyzing and recommending…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: 10 pages, 10 figures; The 10th China Visualization and Visual Analytics Conference

  12. The impact of the Russia-Ukraine conflict on the extreme risk spillovers between agricultural futures and spots

    Authors: Wei-Xing Zhou, Yun-Shi Dai, Kiet Tuan Duong, Peng-Fei Dai

    Abstract: The ongoing Russia-Ukraine conflict between two major agricultural powers has posed significant threats and challenges to the global food system and world food security. Focusing on the impact of the conflict on the global agricultural market, we propose a new analytical framework for tail dependence, and combine the Copula-CoVaR method with the ARMA-GARCH-skewed Student-t model to examine the tai…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 35 pages, 2 figures

    Journal ref: Journal of Economic Behavior & Organization 217, 91-111 (2024)

  13. Correlation structure analysis of the global agricultural futures market

    Authors: Yun-Shi Dai, Ngoc Quang Anh Huynh, Qing-Huan Zheng, Wei-Xing Zhou

    Abstract: This paper adopts the random matrix theory (RMT) to analyze the correlation structure of the global agricultural futures market from 2000 to 2020. It is found that the distribution of correlation coefficients is asymmetric and right skewed, and many eigenvalues of the correlation matrix deviate from the RMT prediction. The largest eigenvalue reflects a collective market effect common to all agricu…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 19 pages, 7 figures

    Journal ref: Research in International Business and Finance 61, 101677 (2022)

  14. Janus icosahedral particles: amorphization driven by three-dimensional atomic misfit and edge dislocation compensation

    Authors: Zhen Sun, Yao Zhang, Zezhou Li, Xuanxuan Du, Zhiheng Xie, Yiheng Dai, Colin Ophus, Jihan Zhou

    Abstract: Icosahedral nanoparticles composed of fivefold twinned tetrahedra have broad applications. The strain relief mechanism and angular deficiency in icosahedral multiply twinned particles are poorly understood in three dimensions. Here, we resolved the three-dimensional atomic structures of Janus icosahedral nanoparticles using atomic resolution electron tomography. A geometrically fivefold face consi…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 30 pages, 5 figures

  15. Noema formIng Cluster survEy (NICE): Discovery of a starbursting galaxy group with a radio-luminous core at z=3.95

    Authors: Luwenjia Zhou, Tao Wang, Emanuele Daddi, Rosemary Coogan, Hanwen Sun, Ke Xu, Vinodiran Arumugam, Shuowen Jin, Daizhong Liu, Shiying Lu, Nikolaj Sillassen, Yijun Wang, Yong Shi, Zhi-Yu Zhang, Qinghua Tan, Qiusheng Gu, David Elbaz, Aurelien Le Bail, Benjamin Magnelli, Carlos Gómez-Guijarro, Chiara d'Eugenio, Georgios E. Magdis, Francesco Valentino, Zhiyuan Ji, Raphael Gobat, et al. (12 additional authors not shown)

    Abstract: The study of distant galaxy groups and clusters at the peak epoch of star formation is limited by the lack of a statistically and homogeneously selected and spectroscopically confirmed sample. Recent discoveries of concentrated starburst activities in cluster cores have opened a new window to hunt for these structures based on their integrated IR luminosities. Hereby we carry out the large NOEMA (…

    Submitted 29 April, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 10 pages, 8 figures, published by A&A

    Journal ref: A&A, 684, A196 (2024)

  16. arXiv:2310.09726  [pdf, other]

    cs.GR cs.CV

    FuseSR: Super Resolution for Real-time Rendering through Efficient Multi-resolution Fusion

    Authors: Zhihua Zhong, Jingsen Zhu, Yuxin Dai, Chuankun Zheng, Yuchi Huo, Guanlin Chen, Hujun Bao, Rui Wang

    Abstract: The workload of real-time rendering is steeply increasing as the demand for high resolution, high refresh rates, and high realism rises, overwhelming most graphics cards. To mitigate this problem, one of the most popular solutions is to render images at a low resolution to reduce rendering overhead, and then manage to accurately upsample the low-resolution rendered image to the target resolution,…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted by SIGGRAPH Asia 2023. Project page: https://isaac-paradox.github.io/FuseSR/

  17. arXiv:2310.08956  [pdf, other]

    cs.CV

    LRRU: Long-short Range Recurrent Updating Networks for Depth Completion

    Authors: Yufei Wang, Bo Li, Ge Zhang, Qi Liu, Tao Gao, Yuchao Dai

    Abstract: Existing deep learning-based depth completion methods generally employ massive stacked layers to predict the dense depth map from sparse input data. Although such approaches greatly advance this task, their accompanied huge computational complexity hinders their practical applications. To accomplish depth completion more efficiently, we propose a novel lightweight deep network framework, the Long-…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Published in ICCV 2023

  18. arXiv:2310.08303  [pdf, other]

    cs.CV cs.SD eess.AS

    Multimodal Variational Auto-encoder based Audio-Visual Segmentation

    Authors: Yuxin Mao, Jing Zhang, Mochu Xiang, Yiran Zhong, Yuchao Dai

    Abstract: We propose an Explicit Conditional Multimodal Variational Auto-Encoder (ECMVAE) for audio-visual segmentation (AVS), aiming to segment sound sources in the video sequence. Existing AVS methods focus on implicit feature fusion strategies, where models are trained to fit the discrete samples in the dataset. With a limited and less diverse dataset, the resulting performance is usually unsatisfactory.…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted by ICCV2023. Project page: https://npucvr.github.io/MMVAE-AVS. Code: https://github.com/OpenNLPLab/MMVAE-AVS

  19. arXiv:2310.08233  [pdf, other]

    cs.RO cs.AI

    The Impact of Time Step Frequency on the Realism of Robotic Manipulation Simulation for Objects of Different Scales

    Authors: Minh Q. Ta, Holly Dinkel, Hameed Abdul-Rashid, Yangfei Dai, Jessica Myers, Tan Chen, Junyi Geng, Timothy Bretl

    Abstract: This work evaluates the impact of time step frequency and component scale on robotic manipulation simulation accuracy. Increasing the time step frequency for small-scale objects is shown to improve simulation accuracy. This simulation, demonstrating pre-assembly part picking for two object geometries, serves as a starting point for discussing how to improve Sim2Real transfer in robotic assembly pr…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 3 pages, 3 figures, Best Poster Finalist at the 2023 Robotics and AI in Future Factory Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Video presentation [https://www.youtube.com/watch?v=JOXrBpMmI0A]. Robotics and AI in Future Factory workshop [https://sites.google.com/view/robot-ai-future-factory/]

  20. arXiv:2310.08027  [pdf, other]

    cs.CL cs.CV

    Exploring Large Language Models for Multi-Modal Out-of-Distribution Detection

    Authors: Yi Dai, Hao Lang, Kaisheng Zeng, Fei Huang, Yongbin Li

    Abstract: Out-of-distribution (OOD) detection is essential for reliable and trustworthy machine learning. Recent multi-modal OOD detection leverages textual information from in-distribution (ID) class names for visual OOD detection, yet it currently neglects the rich contextual information of ID classes. Large language models (LLMs) encode a wealth of world knowledge and can be prompted to generate descript…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: EMNLP2023 Findings Long Paper

  21. arXiv:2310.07968  [pdf, other]

    cs.RO cs.CL cs.HC

    Think, Act, and Ask: Open-World Interactive Personalized Robot Navigation

    Authors: Yinpei Dai, Run Peng, Sikai Li, Joyce Chai

    Abstract: Zero-Shot Object Navigation (ZSON) enables agents to navigate towards open-vocabulary objects in unknown environments. The existing works of ZSON mainly focus on following individual instructions to find generic object classes, neglecting the utilization of natural language interaction and the complexities of identifying user-specific objects. To address these limitations, we introduce Zero-shot I…

    Submitted 29 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Video URL: https://www.youtube.com/watch?v=rN5S8QIhhQc

  22. arXiv:2310.05620  [pdf, other]

    cs.CL

    LAiW: A Chinese Legal Large Language Models Benchmark

    Authors: Yongfu Dai, Duanyu Feng, Jimin Huang, Haochen Jia, Qianqian Xie, Yifang Zhang, Weiguang Han, Wei Tian, Hao Wang

    Abstract: General and legal domain LLMs have demonstrated strong performance in various tasks of LegalAI. However, the current evaluations of these LLMs in LegalAI are defined by the experts of computer science, lacking consistency with the logic of legal practice, making it difficult to judge their practical capabilities. To address this challenge, we are the first to build the Chinese legal LLMs benchmark…

    Submitted 18 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

  23. arXiv:2310.03143  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    One-Dimensional Crystallographic Etching of Few-Layer WS$_2$

    Authors: Shisheng Li, Yung-Chang Lin, Yiling Chiew, Yunyun Dai, Zixuan Ning, Hideaki Nakajima, Hong En Lim, Jing Wu, Yasuhisa Naito, Toshiya Okazaki, Zhipei Sun, Kazu Suenaga, Yoshiki Sakuma, Kazuhito Tsukagoshi, Takaaki Taniguchi

    Abstract: Layer number-dependent band structures and symmetry are vital for the electrical and optical characteristics of two-dimensional (2D) transition metal dichalcogenides (TMDCs). Harvesting 2D TMDCs with tunable thickness and properties can be achieved through top-down etching and bottom-up growth strategies. In this study, we report a pioneering technique that utilizes the migration of in-situ genera…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: 37 pages, 16 figures

    Journal ref: Advanced Functional Materials, 2024

  24. arXiv:2310.00919  [pdf, other]

    eess.IV cs.CV cs.LG

    BAAF: A Benchmark Attention Adaptive Framework for Medical Ultrasound Image Segmentation Tasks

    Authors: Gongping Chen, Lei Zhao, Xiaotao Yin, Liang Cui, Jianxun Zhang, Yu Dai

    Abstract: The AI-based assisted diagnosis programs have been widely investigated on medical ultrasound images. Complex scenario of ultrasound image, in which the coupled interference of internal and external factors is severe, brings a unique challenge for localize the object region automatically and precisely in ultrasound images. In this study, we seek to propose a more general and robust Benchmark Attent…

    Submitted 2 October, 2023; originally announced October 2023.

  25. arXiv:2310.00566  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models

    Authors: Duanyu Feng, Yongfu Dai, Jimin Huang, Yifang Zhang, Qianqian Xie, Weiguang Han, Zhengyu Chen, Alejandro Lopez-Lira, Hao Wang

    Abstract: In the financial industry, credit scoring is a fundamental element, shaping access to credit and determining the terms of loans for individuals and businesses alike. Traditional credit scoring methods, however, often grapple with challenges such as narrow knowledge scope and isolated evaluation of credit tasks. Our work posits that Large Language Models (LLMs) have great potential for credit scori…

    Submitted 17 February, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

  26. arXiv:2309.17390  [pdf, other]

    cs.CV

    Forward Flow for Novel View Synthesis of Dynamic Scenes

    Authors: Xiang Guo, Jiadai Sun, Yuchao Dai, Guanying Chen, Xiaoqing Ye, Xiao Tan, Errui Ding, Yumeng Zhang, Jingdong Wang

    Abstract: This paper proposes a neural radiance field (NeRF) approach for novel view synthesis of dynamic scenes using forward warping. Existing methods often adopt a static NeRF to represent the canonical space, and render dynamic images at other time steps by mapping the sampled 3D points back to the canonical space with the learned backward flow field. However, this backward flow field is non-smooth and…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV2023 as oral. Project page: https://npucvr.github.io/ForwardFlowDNeRF

    Journal ref: ICCV2023

  27. arXiv:2309.15082  [pdf, other]

    cs.CV

    RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation

    Authors: Zhexiong Wan, Yuxin Mao, Jing Zhang, Yuchao Dai

    Abstract: Recently, the RGB images and point clouds fusion methods have been proposed to jointly estimate 2D optical flow and 3D scene flow. However, as both conventional RGB cameras and LiDAR sensors adopt a frame-based data acquisition mechanism, their performance is limited by the fixed low sampling rates, especially in highly-dynamic scenes. By contrast, the event camera can asynchronously capture the i…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: ICCV 2023. Project page: https://npucvr.github.io/RPEFlow Code: https://github.com/danqu130/RPEFlow

  28. arXiv:2309.13816  [pdf, ps, other]

    math.OC

    Exact penalty method for D-stationary point of nonlinear optimization

    Authors: Xin-Wei Liu, Yu-Hong Dai

    Abstract: We consider the nonlinear optimization problem with least $\ell_1$-norm measure of constraint violations and introduce the concepts of the D-stationary point, the DL-stationary point and the DZ-stationary point with the help of exact penalty function. If the stationary point is feasible, they correspond to the Fritz-John stationary point, the KKT stationary point and the singular stationary point,…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: 24 pages

    MSC Class: 49M37; 65K05; 90C26; 90C30; 90C55

  29. arXiv:2309.12300  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    See to Touch: Learning Tactile Dexterity through Visual Incentives

    Authors: Irmak Guzey, Yinlong Dai, Ben Evans, Soumith Chintala, Lerrel Pinto

    Abstract: Equipping multi-fingered robots with tactile sensing is crucial for achieving the precise, contact-rich, and dexterous manipulation that humans excel at. However, relying solely on tactile sensing fails to provide adequate cues for reasoning about objects' spatial configurations, limiting the ability to correct errors and adapt to changing situations. In this paper, we present Tactile Adaptation f…

    Submitted 21 September, 2023; originally announced September 2023.

  30. arXiv:2309.11559  [pdf, other]

    astro-ph.GA

    A Close Look at Ly$α$ Emitters with JWST/NIRCam at $z\approx3.1$

    Authors: Yixiao Liu, Y. Sophia Dai, Stijn Wuyts, Jia-Sheng Huang, Linhua Jiang

    Abstract: We study 10 spectroscopically confirmed Ly$α$ emitters (LAEs) at $z\approx3.1$ in the UDS field, covered by JWST/NIRCam in the PRIMER program. All LAEs are detected in all NIRCam bands from F090W to F444W, corresponding to restframe 2200Å--1.2$\mathrm{μm}$. Based on morphological analysis of the F200W images, three out of the 10 targets are resolved into pair-like systems with separations of…

    Submitted 2 April, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: 17 pages, 6 figures, ApJ accepted

  31. arXiv:2309.09426  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG eess.SP

    Joint Demosaicing and Denoising with Double Deep Image Priors

    Authors: Taihui Li, Anish Lahiri, Yutong Dai, Owen Mayer

    Abstract: Demosaicing and denoising of RAW images are crucial steps in the processing pipeline of modern digital cameras. As only a third of the color information required to produce a digital image is captured by the camera sensor, the process of demosaicing is inherently ill-posed. The presence of noise further exacerbates this problem. Performing these two steps sequentially may distort the content of th…

    Submitted 17 September, 2023; originally announced September 2023.

  32. arXiv:2309.08622  [pdf, other]

    cs.IR cs.AI

    Representation Learning in Low-rank Slate-based Recommender Systems

    Authors: Yijia Dai, Wen Sun

    Abstract: Reinforcement learning (RL) in recommendation systems offers the potential to optimize recommendations for long-term user engagement. However, the environment often involves large state and action spaces, which makes it hard to efficiently learn and explore. In this work, we propose a sample-efficient representation learning algorithm, using the standard slate recommendation setup, to treat this a…

    Submitted 18 September, 2023; v1 submitted 10 September, 2023; originally announced September 2023.

    Comments: in MFPL, ICML 2023

  33. arXiv:2309.08348  [pdf, other]

    eess.AS cs.SD

    The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction

    Authors: Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao

    Abstract: Previous Multimodal Information based Speech Processing (MISP) challenges mainly focused on audio-visual speech recognition (AVSR) with commendable success. However, the most advanced back-end recognition systems often hit performance limits due to the complex acoustic environments. This has prompted a shift in focus towards the Audio-Visual Target Speaker Extraction (AVTSE) task for the MISP 2023…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

  34. arXiv:2309.04953  [pdf, ps, other]

    cond-mat.stat-mech

    Extracting the number of type-B Goldstone modes and the dynamical critical exponent for a type of scale-invariant states

    Authors: Huan-Qiang Zhou, Yan-Wei Dai, Qian-Qian Shi, Ian P. McCulloch, Murray T. Batchelor

    Abstract: A generic scheme is proposed to perform a finite-entanglement scaling analysis for scale-invariant states, which appear to be highly degenerate ground states arising from spontaneous symmetry breaking with type-B Goldstone modes. This allows us to extract the number of type-B Goldstone modes and the dynamical critical exponent, in combination with a finite block-size scaling analysis, from numeric…

    Submitted 30 November, 2023; v1 submitted 10 September, 2023; originally announced September 2023.

    Comments: 14 pages, 24 figures, 11 tables. Minor changes

  35. arXiv:2309.03559  [pdf, other]

    cs.CL

    An Anchor Learning Approach for Citation Field Learning

    Authors: Zilin Yuan, Borun Chen, Yimeng Dai, Yinghui Li, Hai-Tao Zheng, Rui Zhang

    Abstract: Citation field learning is to segment a citation string into fields of interest such as author, title, and venue. Extracting such fields from citations is crucial for citation indexing, researcher profile analysis, etc. User-generated resources like academic homepages and Curriculum Vitae, provide rich citation field information. However, extracting fields from these resources is challenging due t…

    Submitted 14 December, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: accepted by ICASSP2024

  36. arXiv:2309.03490  [pdf, other]

    math.PR

    Lipschitz Transport Maps via the Follmer Flow

    Authors: Yin Dai, Yuan Gao, Jian Huang, Yuling Jiao, Lican Kang, Jin Liu

    Abstract: Inspired by the construction of the Föllmer process, we construct a unit-time flow on the Euclidean space, termed the Föllmer flow, whose flow map at time 1 pushes forward a standard Gaussian measure onto a general target measure. We study the well-posedness of the Föllmer flow and establish the Lipschitz property of the flow map at time 1. We apply the Lipschitz mapping to several rich clas…

    Submitted 7 September, 2023; originally announced September 2023.

  37. arXiv:2309.03126  [pdf, other]

    cs.CL

    Everyone Deserves A Reward: Learning Customized Human Preferences

    Authors: Pengyu Cheng, Jiawen Xie, Ke Bai, Yong Dai, Nan Du

    Abstract: Reward models (RMs) are essential for aligning large language models (LLMs) with human preferences to improve interaction quality. However, the real world is pluralistic, which leads to diversified human preferences with respect to different religions, politics, cultures, etc. Moreover, each individual can have their unique preferences on various topics. Neglecting the diversity of human preferenc…

    Submitted 15 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

  38. arXiv:2309.02043  [pdf, other]

    cs.CV

    Decomposed Guided Dynamic Filters for Efficient RGB-Guided Depth Completion

    Authors: Yufei Wang, Yuxin Mao, Qi Liu, Yuchao Dai

    Abstract: RGB-guided depth completion aims at predicting dense depth maps from sparse depth measurements and corresponding RGB images, where how to effectively and efficiently exploit the multi-modal information is a key issue. Guided dynamic filters, which generate spatially-variant depth-wise separable convolutional filters from RGB features to guide depth features, have been proven to be effective in thi…

    Submitted 5 September, 2023; originally announced September 2023.

  39. arXiv:2308.14032  [pdf, ps, other]

    hep-ph

    $ρ$-meson longitudinal leading-twist distribution amplitude revisited and the $D\to ρ$ semileptonic decay

    Authors: Tao Zhong, Ya-Hong Dai, Hai-Bing Fu

    Abstract: Motivated by our previous work [Phys. Rev. D \textbf{104}, no.1, 016021 (2021)] on pionic leading-twist distribution amplitude (DA), we revisit $ρ$-meson leading-twist longitudinal DA $φ_{2;ρ}^\|(x,μ)$ in this paper. A model proposed by Chang based on the Dyson-Schwinger equations (DSEs) is adopted to describe the behavior of $φ_{2;ρ}^\|(x,μ)$. On the other hand, the $ξ$-moments of…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: 9 pages, 3 figures

  40. arXiv:2308.13774  [pdf, other]

    cs.CV cs.IR cs.MM

    Central Similarity Multi-View Hashing for Multimedia Retrieval

    Authors: Jian Zhu, Wen Cheng, Yu Cui, Chang Tang, Yuyang Dai, Yong Li, Lingfang Zeng

    Abstract: Hash representation learning of multi-view heterogeneous data is the key to improving the accuracy of multimedia retrieval. However, existing methods utilize local similarity and fall short of deeply fusing the multi-view features, resulting in poor retrieval accuracy. Current methods only use local similarity to train their model. These methods ignore global similarity. Furthermore, most recent w…

    Submitted 26 August, 2023; originally announced August 2023.

    Comments: Accepted by the Asia Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data (APWeb-WAIM 2023)

  41. arXiv:2308.13191  [pdf, other

    cs.CL cs.AI

    Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers

    Authors: Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du

    Abstract: Although dominant in natural language processing, transformer-based models remain challenged by the task of long-sequence processing, because the computational cost of self-attention operations in transformers swells quadratically with the input sequence length. To alleviate the complexity of long-sequence processing, we propose a simple framework to enable the off-the-shelf pre-trained transformer…

    Submitted 5 July, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: ACL 2024

  42. arXiv:2308.11925  [pdf, other

    math.OC cs.LG math.NA

    Solving Elliptic Optimal Control Problems via Neural Networks and Optimality System

    Authors: Yongcheng Dai, Bangti Jin, Ramesh Sau, Zhi Zhou

    Abstract: In this work, we investigate a neural network based solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. It utilizes a coupled system derived from the first-order optimality system of the optimal control problem, and employs deep neural networks to represent the solutions to the reduced system. We present an error analysis of…

    Submitted 8 May, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: 26 pages

  43. arXiv:2308.10705  [pdf, other

    cs.CV cs.AI

    Unsupervised 3D Pose Estimation with Non-Rigid Structure-from-Motion Modeling

    Authors: Haorui Ji, Hui Deng, Yuchao Dai, Hongdong Li

    Abstract: Most of the previous 3D human pose estimation work relied on the powerful memory capability of the network to obtain suitable 2D-3D mappings from the training data. Few works have studied the modeling of human posture deformation in motion. In this paper, we propose a new modeling method for human pose deformations and design an accompanying diffusion-based motion prior. Inspired by the field of n…

    Submitted 18 August, 2023; originally announced August 2023.

  44. arXiv:2308.09064  [pdf, other

    astro-ph.GA

    The Lyman Continuum Escape Fraction of Star-forming Galaxies at $2.4\lesssim z\lesssim3.7$ from UVCANDELS

    Authors: Xin Wang, Harry I. Teplitz, Brent M. Smith, Rogier A. Windhorst, Marc Rafelski, Vihang Mehta, Anahita Alavi, Gabriel Brammer, James Colbert, Norman Grogin, Nimish P. Hathi, Anton M. Koekemoer, Laura Prichard, Claudia Scarlata, Ben Sunnquist, Pablo Arrabal Haro, Christopher Conselice, Eric Gawiser, Yicheng Guo, Matthew Hayes, Rolf A. Jansen, Zhiyuan Ji, Ray A. Lucas, Robert O'Connell, Brant Robertson, et al. (52 additional authors not shown)

    Abstract: The UltraViolet Imaging of the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey Fields (UVCANDELS) survey is a Hubble Space Telescope (HST) Cycle-26 Treasury Program, allocated in total 164 orbits of primary Wide-Field Camera 3 Ultraviolet and Visible light F275W imaging with coordinated parallel Advanced Camera for Surveys F435W imaging, on four of the five premier extragalactic sur…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: 33 pages, 21 figures, and 5 tables. Resubmitted after addressing the referee report

  45. arXiv:2308.08488  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

    Authors: Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee

    Abstract: In recent research, slight performance improvement is observed from automatic speech recognition systems to audio-visual speech recognition systems in the end-to-end framework with low-quality videos. Unmatching convergence rates and specialized input representations between audio and visual modalities are considered to cause the problem. In this paper, we propose two novel techniques to improve a…

    Submitted 8 March, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: 6 pages, 2 figures, published in ICME2023

  46. arXiv:2308.08288  [pdf, other

    cs.CV

    Improving Audio-Visual Segmentation with Bidirectional Generation

    Authors: Dawei Hao, Yuxin Mao, Bowen He, Xiaodong Han, Yuchao Dai, Yiran Zhong

    Abstract: The aim of audio-visual segmentation (AVS) is to precisely differentiate audible objects within videos down to the pixel level. Traditional approaches often tackle this challenge by combining information from various modalities, where the contribution of each modality is implicitly or explicitly modeled. Nevertheless, the interconnections between different modalities tend to be overlooked in audio…

    Submitted 19 December, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: AAAI Camera Ready. Dawei Hao and Yuxin Mao contributed equally to this paper. Yiran Zhong is the corresponding author. The code will be released at https://github.com/OpenNLPLab/AVS-bidirectional

  47. arXiv:2308.04413  [pdf, other

    cs.CV

    Digging into Depth Priors for Outdoor Neural Radiance Fields

    Authors: Chen Wang, Jiadai Sun, Lina Liu, Chenming Wu, Zhelun Shen, Dayan Wu, Yuchao Dai, Liangjun Zhang

    Abstract: Neural Radiance Fields (NeRF) have demonstrated impressive performance in vision and graphics tasks, such as novel view synthesis and immersive reality. However, the shape-radiance ambiguity of radiance fields remains a challenge, especially in the sparse viewpoints setting. Recent work resorts to integrating depth priors into outdoor NeRF training to alleviate the issue. However, the criteria for…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM MM 2023. Project Page: https://cwchenwang.github.io/outdoor-nerf-depth

  48. arXiv:2308.02809  [pdf

    physics.class-ph physics.app-ph

    3D front tip fields in creeping solids under constraint effects: a higher-order asymptotic solution

    Authors: Weichen Kong, Yanwei Dai, Yinghua Liu

    Abstract: As one of the most important topics in creep fracture mechanics, the mechanical fields at three-dimensional (3D) sharp V-notches and crack tips have drawn tremendous attention. Despite many years of effort on constraint theory for creeping solids, it remains unclear how in-plane and out-of-plane constraint effects interact for 3D sharp V-notches and cracks in creeping solids.…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: 56 pages, 25 figures

  49. UV-Bright Star-Forming Clumps and Their Host Galaxies in UVCANDELS at 0.5 $\leq$ z $\leq$ 1

    Authors: Alec Martin, Yicheng Guo, Xin Wang, Anton M. Koekemoer, Marc Rafelski, Harry I. Teplitz, Rogier A. Windhorst, Anahita Alavi, Norman A. Grogin, Laura Prichard, Ben Sunnquist, Daniel Ceverino, Nima Chartab, Christopher J. Conselice, Y. Sophia Dai, Avishai Dekel, Johnathan P. Gardner, Eric Gawiser, Nimish P. Hathi, Matthew J. Hayes, Rolf A. Jansen, Zhiyuan Ji, David C. Koo, Ray A. Lucas, Nir Mandelker, et al. (10 additional authors not shown)

    Abstract: Giant star-forming clumps are a prominent feature of star-forming galaxies (SFGs) and contain important clues on galaxy formation and evolution. However, basic demographics of clumps and their host galaxies remain uncertain. Using the HST/WFC3 F275W images from the Ultraviolet Imaging of the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (UVCANDELS), we detect and analyze giant sta…

    Submitted 2 October, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: 21 pages, 13 figures, accepted for publication in ApJ

    Journal ref: ApJ 955 106 (2023)

  50. arXiv:2307.16579  [pdf, other

    cs.CV cs.MM cs.SD eess.AS

    Contrastive Conditional Latent Diffusion for Audio-visual Segmentation

    Authors: Yuxin Mao, Jing Zhang, Mochu Xiang, Yunqiu Lv, Yiran Zhong, Yuchao Dai

    Abstract: We propose a latent diffusion model with contrastive learning for audio-visual segmentation (AVS) to extensively explore the contribution of audio. We interpret AVS as a conditional generation task, where audio is defined as the conditional variable for sound producer(s) segmentation. With our new interpretation, it is especially necessary to model the correlation between audio and the final segme…

    Submitted 31 July, 2023; originally announced July 2023.