Showing 1–50 of 493 results for author: Long, Y

  1. arXiv:2408.15117  [pdf, other]

    physics.soc-ph stat.AP

    Inferring ghost cities on the globe in newly developed urban areas based on urban vitality with multi-source data

    Authors: Yecheng Zhang, Tangqi Tu, Ying Long

    Abstract: Due to rapid urbanization over the past 20 years, many newly developed areas have lagged in socio-economic maturity, creating an imbalance with older cities and leading to the rise of "ghost cities." However, due to the complexity of socio-economic factors, no global studies have measured this phenomenon. We propose a unified framework based on urban vitality theory and multi-source data, validate…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: 28 pages, 13 figures

  2. arXiv:2408.13889  [pdf, other]

    cs.CL

    LLM with Relation Classifier for Document-Level Relation Extraction

    Authors: Xingzuo Li, Kehai Chen, Yunfei Long, Min Zhang

    Abstract: Large language models (LLMs) create a new paradigm for natural language processing. Despite their advancement, LLM-based methods still lag behind traditional approaches in document-level relation extraction (DocRE), a critical task for understanding complex entity relations. This paper investigates the causes of this performance gap, identifying the dispersion of attention by LLMs due to entity pa…

    Submitted 25 August, 2024; originally announced August 2024.

  3. arXiv:2408.13102  [pdf, other]

    cs.LG cs.CV

    Dynamic Label Adversarial Training for Deep Learning Robustness Against Adversarial Attacks

    Authors: Zhenyu Liu, Haoran Duan, Huizhi Liang, Yang Long, Vaclav Snasel, Guiseppe Nicosia, Rajiv Ranjan, Varun Ojha

    Abstract: Adversarial training is one of the most effective methods for enhancing model robustness. Recent approaches incorporate adversarial distillation in adversarial training architectures. However, we notice two scenarios of defense methods that limit their performance: (1) Previous methods primarily use static ground truth for adversarial training, but this often causes robust overfitting; (2) The los…

    Submitted 23 August, 2024; originally announced August 2024.

    Journal ref: 31st International Conference on Neural Information Processing (ICONIP), 2024

  4. arXiv:2408.11878  [pdf, other]

    cs.CL cs.CE q-fin.CP

    Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

    Authors: Qianqian Xie, Dong Li, Mengxi Xiao, Zihao Jiang, Ruoyu Xiang, Xiao Zhang, Zhengyu Chen, Yueru He, Weiguang Han, Yuzhe Yang, Shunian Chen, Yifei Zhang, Lihang Shen, Daniel Kim, Zhiwei Liu, Zheheng Luo, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Zhiyuan Yao, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, Peng Lu, et al. (14 additional authors not shown)

    Abstract: Large language models (LLMs) have advanced financial applications, yet they often lack sufficient financial knowledge and struggle with tasks involving multi-modal inputs like tables and time series data. To address these limitations, we introduce Open-FinLLMs, a series of Financial LLMs. We begin with FinLLaMA, pre-trained on a 52 billion token financial corpus, incorporating text, table…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 33 pages, 13 figures

  5. arXiv:2408.10561  [pdf, other]

    cs.SD eess.AS

    ICSD: An Open-source Dataset for Infant Cry and Snoring Detection

    Authors: Qingyu Liu, Longfei Song, Dongxing Xu, Yanhua Long

    Abstract: The detection and analysis of infant cry and snoring events are crucial tasks within the field of audio signal processing. While existing datasets for general sound event detection are plentiful, they often fall short in providing sufficient, strongly labeled data specific to infant cries and snoring. To provide a benchmark dataset and thus foster the research of infant cry and snoring detection,…

    Submitted 20 August, 2024; originally announced August 2024.

    Comments: 11 pages, 6 figures

  6. arXiv:2408.09368  [pdf, ps, other]

    cs.DS

    Unbreakable Decomposition in Close-to-Linear Time

    Authors: Aditya Anand, Euiwoong Lee, Jason Li, Yaowei Long, Thatchaphol Saranurak

    Abstract: Unbreakable decomposition, introduced by Cygan et al. (SICOMP'19) and Cygan et al. (TALG'20), has proven to be one of the most powerful tools for parameterized graph cut problems in recent years. Unfortunately, all known constructions require at least $Ω_k\left(mn^2\right)$ time, given an undirected graph with $n$ vertices, $m$ edges, and cut-size parameter $k$. In this work, we show the first clo…

    Submitted 18 August, 2024; originally announced August 2024.

    Comments: 37 pages

  7. arXiv:2408.09278  [pdf, other]

    eess.IV cs.CV

    Cross-Species Data Integration for Enhanced Layer Segmentation in Kidney Pathology

    Authors: Junchao Zhu, Mengmeng Yin, Ruining Deng, Yitian Long, Yu Wang, Yaohong Wang, Shilin Zhao, Haichun Yang, Yuankai Huo

    Abstract: Accurate delineation of the boundaries between the renal cortex and medulla is crucial for subsequent functional structural analysis and disease diagnosis. Training high-quality deep-learning models for layer segmentation relies on the availability of large amounts of annotated data. However, due to the patient's privacy of medical data and scarce clinical cases, constructing pathological datasets…

    Submitted 17 August, 2024; originally announced August 2024.

  8. An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem

    Authors: Huaiyuan Liu, Xianzhang Liu, Donghua Yang, Hongzhi Wang, Yingchi Long, Mengtong Ji, Dongjing Miao, Zhiyu Liang

    Abstract: The Maximum Minimal Cut Problem (MMCP), an NP-hard combinatorial optimization (CO) problem, has not received much attention due to the demanding and challenging bi-connectivity constraint. Moreover, as a CO problem, it is also a daunting task for machine learning, especially without labeled instances. To deal with these problems, this work proposes an unsupervised learning framework combined with h…

    Submitted 15 August, 2024; originally announced August 2024.

  9. arXiv:2408.05891  [pdf, other]

    cs.CV

    CMAB: A First National-Scale Multi-Attribute Building Dataset in China Derived from Open Source Data and GeoAI

    Authors: Yecheng Zhang, Huimin Zhao, Ying Long

    Abstract: Rapidly acquiring three-dimensional (3D) building data, including geometric attributes like rooftop, height and orientations, as well as indicative attributes like function, quality, and age, is essential for accurate urban analysis, simulations, and policy updates. Current building datasets suffer from incomplete coverage of building multi-attributes. This paper introduces a geospatial artificial…

    Submitted 21 August, 2024; v1 submitted 11 August, 2024; originally announced August 2024.

    Comments: 43 pages, 20 figures

    ACM Class: I.4.9

  10. HAIGEN: Towards Human-AI Collaboration for Facilitating Creativity and Style Generation in Fashion Design

    Authors: Jianan Jiang, Di Wu, Hanhui Deng, Yidan Long, Wenyi Tang, Xiang Li, Can Liu, Zhanpeng Jin, Wenlei Zhang, Tangquan Qi

    Abstract: The process of fashion design usually involves sketching, refining, and coloring, with designers drawing inspiration from various images to fuel their creative endeavors. However, conventional image search methods often yield irrelevant results, impeding the design process. Moreover, creating and coloring sketches can be time-consuming and demanding, acting as a bottleneck in the design workflow.…

    Submitted 11 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: Accepted by Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (ACM IMWUT/UbiComp 2024)

  11. arXiv:2407.15862  [pdf]

    cs.LG cs.AI cs.CL cs.CY

    Performance Evaluation of Lightweight Open-source Large Language Models in Pediatric Consultations: A Comparative Analysis

    Authors: Qiuhong Wei, Ying Cui, Mengwei Ding, Yanqin Wang, Lingling Xiang, Zhengxiong Yao, Ceran Chen, Ying Long, Zhezhen Jin, Ximing Xu

    Abstract: Large language models (LLMs) have demonstrated potential applications in medicine, yet data privacy and computational burden limit their deployment in healthcare institutions. Open-source and lightweight versions of LLMs emerge as potential solutions, but their performance, particularly in pediatric settings, remains underexplored. In this cross-sectional study, 250 patient consultation questions w…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 27 pages in total with 17 pages of main manuscript and 10 pages of supplementary materials; 4 figures in the main manuscript and 2 figures in supplementary material

    MSC Class: 68M20 (Primary) 62G10 (Secondary)

  12. arXiv:2407.11906  [pdf, other]

    cs.CV cs.RO

    SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge

    Authors: Hao Ding, Tuxun Lu, Yuqian Zhang, Ruixing Liang, Hongchao Shu, Lalithkumar Seenivasan, Yonghao Long, Qi Dou, Cong Gao, Mathias Unberath

    Abstract: Accurate segmentation of tools in robot-assisted surgery is critical for machine perception, as it facilitates numerous downstream tasks including augmented reality feedback. While current feed-forward neural network-based methods exhibit excellent segmentation performance under ideal conditions, these models have proven susceptible to even minor corruptions, significantly impairing the model's pe…

    Submitted 16 July, 2024; originally announced July 2024.

  13. arXiv:2406.16967  [pdf, other]

    eess.SP eess.SY

    Remaining useful life prediction of rolling bearings based on refined composite multi-scale attention entropy and dispersion entropy

    Authors: Yunchong Long, Qinkang Pang, Guangjie Zhu, Junxian Cheng, Xiangshun Li

    Abstract: Remaining useful life (RUL) prediction based on vibration signals is crucial for ensuring the safe operation and effective health management of rotating machinery. Existing studies often extract health indicators (HI) from time domain and frequency domain features to analyze complex vibration signals, but these features may not accurately capture the degradation process. In this study, we propose…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 12 pages, 9 figures

  14. arXiv:2406.14962  [pdf, other]

    cs.CV

    Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning

    Authors: Suyi Li, Chenyi Jiang, Shidong Wang, Yang Long, Zheng Zhang, Haofeng Zhang

    Abstract: Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs. The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, consequently decreasing the classification performance towards novel compositions. Previous remarkable works primarily addr…

    Submitted 21 June, 2024; originally announced June 2024.

  15. arXiv:2406.04882  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV

    InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment

    Authors: Yuxing Long, Wenzhe Cai, Hongcheng Wang, Guanqi Zhan, Hao Dong

    Abstract: Enabling robots to navigate following diverse language instructions in unexplored environments is an attractive goal for human-robot interaction. However, this goal is challenging because different navigation tasks require different strategies. The scarcity of instruction navigation data hinders training an instruction navigation model with varied strategies. Therefore, previous methods are all co…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Submitted to CoRL 2024

  16. arXiv:2406.00321  [pdf, other]

    physics.optics cond-mat.other quant-ph

    Non-Abelian lattice gauge fields in the photonic synthetic frequency dimension

    Authors: Dali Cheng, Kai Wang, Charles Roques-Carmes, Eran Lustig, Olivia Y. Long, Heming Wang, Shanhui Fan

    Abstract: Non-Abelian gauge fields provide a conceptual framework for the description of particles having spins. The theoretical importance of non-Abelian gauge fields motivates their experimental synthesis and explorations. Here, we demonstrate non-Abelian lattice gauge fields for photons. In the study of gauge fields, lattice models are essential for the understanding of their implications in extended sys…

    Submitted 1 June, 2024; originally announced June 2024.

  17. arXiv:2405.20714  [pdf]

    cond-mat.mtrl-sci cond-mat.str-el

    Large low-field magnetocaloric response in a ferromagnetic gadolinium orthophosphate

    Authors: Ziyu W. Yang, Jie Zhang, Maocai Pi, Xubin Ye, Chenxu Kang, Xiaoliang Weng, Wei Tang, Hongzhi Cui, Yu-Jia Zeng, Youwen Long

    Abstract: Bulk magnetic and thermodynamic measurements, along with mean-field calculations, were conducted on the ferromagnetic K3Gd5(PO4)6 powders. No magnetic ordering was observed until 2 K, while the application of an external field B > 1 T resulted in the splitting of the Gd3+ ground state multiplet and induced a non-cooperative Schottky effect. The average nearest-neighbor exchange strength |J1/kB| is…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  18. arXiv:2405.18757  [pdf, other]

    cs.RO

    Multi-objective Cross-task Learning via Goal-conditioned GPT-based Decision Transformers for Surgical Robot Task Automation

    Authors: Jiawei Fu, Yonghao Long, Kai Chen, Wang Wei, Qi Dou

    Abstract: Surgical robot task automation has been a promising research topic for improving surgical efficiency and quality. Learning-based methods have been recognized as an interesting paradigm and been increasingly investigated. However, existing approaches encounter difficulties in long-horizon goal-conditioned tasks due to the intricate compositional structure, which requires decision-making for a seque…

    Submitted 29 May, 2024; originally announced May 2024.

  19. Wearable-based behaviour interpolation for semi-supervised human activity recognition

    Authors: Haoran Duan, Shidong Wang, Varun Ojha, Shizheng Wang, Yawen Huang, Yang Long, Rajiv Ranjan, Yefeng Zheng

    Abstract: While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities. However, most deep learning-based HAR requires a large amount of labelled data and extracting HAR features from unlabelled data for effective deep learning training remains chal…

    Submitted 24 May, 2024; originally announced May 2024.

  20. arXiv:2405.15914  [pdf, other]

    cs.CV

    ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching

    Authors: Yumin Zhang, Xingyu Miao, Haoran Duan, Bo Wei, Tejal Shah, Yang Long, Rajiv Ranjan

    Abstract: Text-to-3D content creation is a rapidly evolving research area. Given the scarcity of 3D data, current approaches often adapt pre-trained 2D diffusion models for 3D synthesis. Among these approaches, Score Distillation Sampling (SDS) has been widely adopted. However, the issue of over-smoothing poses a significant limitation on the high-fidelity generation of 3D models. To address this challenge,…

    Submitted 24 May, 2024; originally announced May 2024.

  21. arXiv:2405.11252  [pdf, other]

    cs.CV

    Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

    Authors: Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan

    Abstract: In this work, we propose a novel Trajectory Score Matching (TSM) method that aims to solve the pseudo ground truth inconsistency problem caused by the accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process. Unlike ISM which adopts the inversion process of DDIM to calculate on a single path, our TSM method leverages the inversi…

    Submitted 18 May, 2024; originally announced May 2024.

  22. arXiv:2405.08748  [pdf, other]

    cs.CV

    Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

    Authors: Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, et al. (20 additional authors not shown)

    Abstract: We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Mu…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Project Page: https://dit.hunyuan.tencent.com/

  23. Non-Abelian Braiding of Topological Edge Bands

    Authors: Yang Long, Zihao Wang, Chen Zhang, Haoran Xue, Yuxin Zhao, Baile Zhang

    Abstract: Braiding is a geometric concept that manifests itself in a variety of scientific contexts from biology to physics, and has been employed to classify bulk band topology in topological materials. Topological edge states can also form braiding structures, as demonstrated recently in a type of topological insulators known as Möbius insulators, whose topological edge states form two braided bands exhib…

    Submitted 8 May, 2024; originally announced May 2024.

    Journal ref: Phys. Rev. Lett. 132, 236401 (2024)

  24. arXiv:2405.04652  [pdf, ps, other]

    cs.HC

    AffirmativeAI: Towards LGBTQ+ Friendly Audit Frameworks for Large Language Models

    Authors: Yinru Long, Zilin Ma, Yiyang Mei, Zhaoyuan Su

    Abstract: The LGBTQ+ community faces disproportionate mental health challenges, including higher rates of depression, anxiety, and suicidal ideation. Research has shown that LGBTQ+ people have been using large language model-based chatbots, such as ChatGPT, for their mental health needs. Despite the potential for immediate support and anonymity these chatbots offer, concerns regarding their capacity to provide e…

    Submitted 7 May, 2024; originally announced May 2024.

  25. arXiv:2405.00956  [pdf, other]

    cs.RO cs.CV cs.GR

    SimEndoGS: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians

    Authors: Zhenya Yang, Kai Chen, Yonghao Long, Qi Dou

    Abstract: Surgical scene simulation plays a crucial role in surgical education and simulator-based robot learning. Traditional approaches for creating these environments with surgical scene involve a labor-intensive process where designers hand-craft tissues models with textures and geometries for soft body simulations. This manual approach is not only time-consuming but also limited in the scalability and…

    Submitted 6 August, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  26. arXiv:2404.19449  [pdf, other]

    cs.IT

    AoI-aware Sensing Scheduling and Trajectory Optimization for Multi-UAV-assisted Wireless Backscatter Networks

    Authors: Yusi Long, Songhan Zhao, Shimin Gong, Bo Gu, Dusit Niyato, Xuemin Shen

    Abstract: This paper considers multiple unmanned aerial vehicles (UAVs) to assist sensing data transmissions from the ground users (GUs) to a remote base station (BS). Each UAV collects sensing data from the GUs and then forwards the sensing data to the remote BS. The GUs first backscatter their data to the UAVs and then all UAVs forward data to the BS by the nonorthogonal multiple access (NOMA) transmissio…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by IEEE TVT

  27. arXiv:2404.15339  [pdf, other]

    eess.IV

    Efficient EndoNeRF Reconstruction and Its Application for Data-driven Surgical Simulation

    Authors: Yuehao Wang, Bingchen Gong, Yonghao Long, Siu Hin Fan, Qi Dou

    Abstract: The healthcare industry has a growing need for realistic modeling and efficient simulation of surgical scenes. With effective models of deformable surgical scenes, clinicians are able to conduct surgical planning and surgery training on scenarios close to real-world cases. However, a significant challenge in achieving such a goal is the scarcity of high-quality soft tissue models with accurate sha…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures. Accepted by International Journal of Computer Assisted Radiology and Surgery

  28. arXiv:2404.12291  [pdf]

    cs.CL cs.AI

    Augmenting emotion features in irony detection with Large language modeling

    Authors: Yucheng Lin, Yuhan Xia, Yunfei Long

    Abstract: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation. Traditional irony detection techniques typically fall short due to their reliance on static linguistic features and predefined knowledge bases, often overlooking the nuanced emotional dimensions integral to irony. In contrast, o…

    Submitted 19 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: 11 pages, 3 tables, 2 figures. Accepted by the 25th Chinese Lexical Semantics Workshop

  29. arXiv:2403.15905  [pdf, other]

    cs.LG cs.CV

    Towards Low-Energy Adaptive Personalization for Resource-Constrained Devices

    Authors: Yushan Huang, Josh Millar, Yuxuan Long, Yuchen Zhao, Hamed Haddadi

    Abstract: The personalization of machine learning (ML) models to address data drift is a significant challenge in the context of Internet of Things (IoT) applications. Presently, most approaches focus on fine-tuning either the full base model or its last few layers to adapt to new data, while often neglecting energy costs. However, various types of data drift exist, and fine-tuning the full base model or th…

    Submitted 29 March, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted to The 4th Workshop on Machine Learning and Systems (EuroMLSys '24)

  30. arXiv:2403.15574  [pdf, other]

    cs.AI

    SensoryT5: Infusing Sensorimotor Norms into T5 for Enhanced Fine-grained Emotion Classification

    Authors: Yuhan Xia, Qingqing Zhao, Yunfei Long, Ge Xu, Jia Wang

    Abstract: In traditional research approaches, sensory perception and emotion classification have been considered separate domains. Yet, the significant influence of sensory experiences on emotional responses is undeniable. The natural language processing (NLP) community has often missed the opportunity to merge sensory knowledge with emotion classification. To address this gap, we propose Sens…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted by CogALex 2024 conference

  31. From Explainable to Interpretable Deep Learning for Natural Language Processing in Healthcare: How Far from Reality?

    Authors: Guangming Huang, Yingya Li, Shoaib Jameel, Yunfei Long, Giorgos Papanastasiou

    Abstract: Deep learning (DL) has substantially enhanced natural language processing (NLP) in healthcare research. However, the increasing complexity of DL-based NLP necessitates transparent model interpretability, or at least explainability, for reliable decision-making. This work presents a thorough scoping review of explainable and interpretable DL in healthcare NLP. The term "eXplainable and Interpretabl…

    Submitted 9 May, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted by Computational and Structural Biotechnology Journal

  32. arXiv:2403.09363  [pdf, other]

    cs.CV

    Sentinel-Guided Zero-Shot Learning: A Collaborative Paradigm without Real Data Exposure

    Authors: Fan Wan, Xingyu Miao, Haoran Duan, Jingjing Deng, Rui Gao, Yang Long

    Abstract: With increasing concerns over data privacy and model copyrights, especially in the context of collaborations between AI service providers and data owners, an innovative SG-ZSL paradigm is proposed in this work. SG-ZSL is designed to foster efficient collaboration without the need to exchange models or sensitive data. It consists of a teacher model, a student model and a generator that links both m…

    Submitted 14 March, 2024; originally announced March 2024.

  33. arXiv:2403.08857  [pdf, other]

    cs.CV

    DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation

    Authors: Minbin Huang, Yanxin Long, Xinchi Deng, Ruihang Chu, Jiangfeng Xiong, Xiaodan Liang, Hong Cheng, Qinglin Lu, Wei Liu

    Abstract: Text-to-image (T2I) generation models have significantly advanced in recent years. However, effective interaction with these models is challenging for average users due to the need for specialized prompt engineering knowledge and the inability to perform multi-turn image generation, hindering a dynamic and iterative creation process. Recent attempts have tried to equip Multi-modal Large Language M…

    Submitted 3 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Project page: https://hunyuan-dialoggen.github.io/

  34. Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning

    Authors: Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin

    Abstract: Vision-and-language navigation (VLN) asks an agent to follow a given language instruction to navigate through a real 3D environment. Despite significant advances, conventional VLN agents are trained typically under disturbance-free environments and may easily fail in real-world scenarios, since they are unaware of how to deal with various possible disturbances, such as sudden obstacles or human in…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by TPAMI 2023

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)

  35. arXiv:2402.19350  [pdf, other]

    cs.CL

    Prompting Explicit and Implicit Knowledge for Multi-hop Question Answering Based on Human Reading Process

    Authors: Guangming Huang, Yunfei Long, Cunjin Luo, Jiaxing Shen, Xia Sun

    Abstract: Pre-trained language models (PLMs) leverage chains-of-thought (CoT) to simulate human reasoning and inference processes, achieving proficient performance in multi-hop QA. However, a gap persists between PLMs' reasoning abilities and those of humans when tackling complex problems. Psychological studies suggest a vital connection between explicit information in passages and human prior knowledge dur…

    Submitted 27 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted at COLING 2024

  36. arXiv:2402.18541  [pdf, ps, other]

    cs.DS

    Dynamic Deterministic Constant-Approximate Distance Oracles with $n^ε$ Worst-Case Update Time

    Authors: Bernhard Haeupler, Yaowei Long, Thatchaphol Saranurak

    Abstract: We present a new distance oracle in the fully dynamic setting: given a weighted undirected graph $G=(V,E)$ with $n$ vertices undergoing both edge insertions and deletions, and an arbitrary parameter $ε$ where $ε\in[1/\log^{c} n,1]$ and $c>0$ is a small constant, we can deterministically maintain a data structure with $n^ε$ worst-case update time that, given any pair of vertices $(u,v)$, returns a…

    Submitted 10 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 137 pages

  37. arXiv:2402.16722  [pdf]

    physics.optics

    All-optical polarization scrambler based on polarization beam splitting with amplified fiber ring

    Authors: Yuanjie Yu, Shiyun Dai, Qiang Wu, Yu Long, Ai Liu, Peng Cai, Ligang Huang, Lei Gao, Tao Zhu

    Abstract: Optical-fiber-based polarization scramblers can reduce the impact of polarization sensitive performance of various optical fiber systems. Here, we propose a simple and efficient polarization scrambler based on an all optical Mach-Zehnder structure by combining polarization beam splitter and amplified fiber ring. To totally decoherence one polarization splitted beam, a fiber ring together with an a…

    Submitted 26 February, 2024; originally announced February 2024.

  38. arXiv:2402.15078  [pdf, other]

    cs.SE

    LLM-CompDroid: Repairing Configuration Compatibility Bugs in Android Apps with Pre-trained Large Language Models

    Authors: Zhijie Liu, Yutian Tang, Meiyun Li, Xin Jin, Yunfei Long, Liang Feng Zhang, Xiapu Luo

    Abstract: XML configurations are integral to the Android development framework, particularly in the realm of UI display. However, these configurations can introduce compatibility issues (bugs), resulting in divergent visual outcomes and system crashes across various Android API versions (levels). In this study, we systematically investigate LLM-based approaches for detecting and repairing configuration comp…

    Submitted 22 February, 2024; originally announced February 2024.

  39. arXiv:2402.10353  [pdf, other]

    cs.CL cs.LG

    Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models

    Authors: Kang He, Yinghan Long, Kaushik Roy

    Abstract: Prompt learning is susceptible to intrinsic bias present in pre-trained language models (LMs), resulting in sub-optimal performance of prompt-based zero/few-shot learning. In this work, we propose a null-input prompting method to calibrate intrinsic bias encoded in pre-trained LMs. Different from prior efforts that address intrinsic bias primarily for social fairness and often involve excessive co…

    Submitted 15 February, 2024; originally announced February 2024.

  40. arXiv:2402.09748  [pdf, other]

    cs.CL cs.AI cs.LG cs.PF

    Model Compression and Efficient Inference for Large Language Models: A Survey

    Authors: Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He

    Abstract: Transformer based large language models have achieved tremendous success. However, the significant memory and computational costs incurred during the inference process make it challenging to deploy large models on resource-constrained devices. In this paper, we investigate compression and efficient inference methods for large language models from an algorithmic perspective. Regarding taxonomy, sim…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: 47 pages, review 380 papers. The work is ongoing

  41. Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support

    Authors: Zilin Ma, Yiyang Mei, Yinru Long, Zhaoyuan Su, Krzysztof Z. Gajos

    Abstract: LGBTQ+ individuals are increasingly turning to chatbots powered by large language models (LLMs) to meet their mental health needs. However, little research has explored whether these chatbots can adequately and safely provide tailored support for this demographic. We interviewed 18 LGBTQ+ and 13 non-LGBTQ+ participants about their experiences with LLM-based chatbots for mental health needs. LGBTQ+…

    Submitted 14 February, 2024; originally announced February 2024.

  42. arXiv:2402.09150  [pdf, ps, other]

    cs.DS

    Better Decremental and Fully Dynamic Sensitivity Oracles for Subgraph Connectivity

    Authors: Yaowei Long, Yunfan Wang

    Abstract: We study the \emph{sensitivity oracles problem for subgraph connectivity} in the \emph{decremental} and \emph{fully dynamic} settings. In the fully dynamic setting, we preprocess an $n$-vertices $m$-edges undirected graph $G$ with $n_{\rm off}$ deactivated vertices initially and the others are activated. Then we receive a single update $D\subseteq V(G)$ of size $|D| = d \leq d_{\star}$, representi…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 30 pages

  43. arXiv:2402.02380  [pdf]

    cs.CL cs.AI cs.HC

    Evaluating Large Language Models in Analysing Classroom Dialogue

    Authors: Yun Long, Haifeng Luo, Yu Zhang

    Abstract: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement. Recognizing the knowledge-intensive and labor-intensive nature of traditional qualitative methods in educational research, this study investigates the potential of LLM to streamline and enhance t…

    Submitted 22 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  44. arXiv:2402.01950  [pdf, other]

    cs.CV

    ConRF: Zero-shot Stylization of 3D Scenes with Conditioned Radiation Fields

    Authors: Xingyu Miao, Yang Bai, Haoran Duan, Fan Wan, Yawen Huang, Yang Long, Yefeng Zheng

    Abstract: Most of the existing works on arbitrary 3D NeRF style transfer required retraining on each single style condition. This work aims to achieve zero-shot controlled stylization in 3D scenes utilizing text or visual input as conditioning factors. We introduce ConRF, a novel method of zero-shot stylization. Specifically, due to the ambiguity of CLIP features, we employ a conversion process that maps th…

    Submitted 6 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  45. arXiv:2402.01181  [pdf, other]

    cs.RO cs.GR

    Efficient Physically-based Simulation of Soft Bodies in Embodied Environment for Surgical Robot

    Authors: Zhenya Yang, Yonghao Long, Kai Chen, Wang Wei, Qi Dou

    Abstract: Surgical robot simulation platforms play a crucial role in enhancing training efficiency and advancing research on robot learning. Much effort has been made by scholars on developing open-sourced surgical robot simulators to facilitate research. We also previously developed SurRoL, an open-source, da Vinci Research Kit (dVRK) compatible and interactive embodied environment for robot learning. Despi…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 8 pages

  46. arXiv:2401.17968  [pdf, other]

    cond-mat.mes-hall physics.data-an

    Unsupervised Learning of Topological Non-Abelian Braiding in Non-Hermitian Bands

    Authors: Yang Long, Haoran Xue, Baile Zhang

    Abstract: The topological classification of energy bands has laid the groundwork for the discovery of various topological phases of matter in recent decades. While this classification has traditionally focused on real-energy bands, recent studies have revealed the intriguing topology of complex-energy, or non-Hermitian bands. For example, the spectral winding of complex-energy bands can form unique topologi…

    Submitted 31 January, 2024; originally announced January 2024.

  47. arXiv:2401.15383  [pdf, ps, other]

    math.GT math.MG

    Connectedness of the Gromov boundary of fine curve graphs

    Authors: Yusen Long, Dong Tan

    Abstract: In this paper, we study the topological properties of the Gromov boundary of the fine curve graph of an orientable finite-type surface of genus at least 2. This graph consisting of topological curves has much richer dynamics than the classical curve graph. Using the techniques introduced by Wright [Wri23], we show that this boundary is (path) connected and that the spheres in non-separating fine c…

    Submitted 28 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

    Comments: 16 pages. New version specifies the topology of curves, corrects some minor errors and typos. Comments are welcome!

    MSC Class: 57K20; 53C23

  48. arXiv:2401.13259  [pdf, ps, other]

    math.DS math.SG

    Three closed characteristics on non-degenerate star-shaped hypersurfaces in $\mathbf{R}^{6}$

    Authors: Huagui Duan, Hui Liu, Yiming Long, Zihao Qi, Wei Wang

    Abstract: In this paper, we prove that for every non-degenerate $C^3$ compact star-shaped hypersurface $\Sigma$ in $\mathbf{R}^{6}$ which carries no prime closed characteristic of Maslov-type index $0$ or no prime closed characteristic of Maslov-type index $-1$, there exist at least three prime closed characteristics on $\Sigma$.

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 30 pages. arXiv admin note: text overlap with arXiv:2205.07082, arXiv:1510.08648, arXiv:2205.14789

  49. CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video

    Authors: Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Yang Long, Yefeng Zheng

    Abstract: The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes. Prior methods, such as DynamicNeRF, have shown impressive performance by leveraging time-varying dynamic radiation fields. However, these methods have limitations when it comes to accurately modeling the motion of complex objects, which can lead to inaccurate and blurry renderings of d…

    Submitted 26 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by Pattern Recognition

  50. arXiv:2401.03623  [pdf]

    eess.IV

    A Video Coding Method Based on Neural Network for CLIC2024

    Authors: Zhengang Li, Jingchi Zhang, Yonghua Wang, Xing Zeng, Zhen Zhang, Yunlin Long, Menghu Jia, Ning Wang

    Abstract: This paper presents a video coding scheme that combines traditional optimization methods with deep learning methods based on the Enhanced Compression Model (ECM). In this paper, the traditional optimization methods adaptively adjust the quantization parameter (QP). The key frame QP offset is set according to the video content characteristics, and the coding tree unit (CTU) level QP of all frames i…

    Submitted 7 January, 2024; originally announced January 2024.