
Showing 1–50 of 1,119 results for author: Li, T

Searching in archive cs.
  1. arXiv:2407.13246  [pdf, other]

    cs.CV

    STS MICCAI 2023 Challenge: Grand challenge on 2D and 3D semi-supervised tooth segmentation

    Authors: Yaqi Wang, Yifan Zhang, Xiaodiao Chen, Shuai Wang, Dahong Qian, Fan Ye, Feng Xu, Hongyuan Zhang, Qianni Zhang, Chengyu Wu, Yunxiang Li, Weiwei Cui, Shan Luo, Chengkai Wang, Tianhao Li, Yi Liu, Xiang Feng, Huiyu Zhou, Dongyun Liu, Qixuan Wang, Zhouhao Lin, Wei Song, Yuanlin Li, Bing Wang, Chunshi Wang, et al. (2 additional authors not shown)

    Abstract: Computer-aided design (CAD) tools are increasingly popular in modern dental practice, particularly for treatment planning or comprehensive prognosis evaluation. In particular, the 2D panoramic X-ray image efficiently detects invisible caries, impacted teeth and supernumerary teeth in children, while the 3D dental cone beam computed tomography (CBCT) is widely used in orthodontics and endodontics d…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.12788  [pdf, other]

    cs.CV cs.AI

    SS-ADA: A Semi-Supervised Active Domain Adaptation Framework for Semantic Segmentation

    Authors: Weihao Yan, Yeqiang Qian, Yueyuan Li, Tao Li, Chunxiang Wang, Ming Yang

    Abstract: Semantic segmentation plays an important role in intelligent vehicles, providing pixel-level semantic information about the environment. However, labeling is expensive and time-consuming when a semantic segmentation model is applied to new driving scenarios. To reduce the costs, semi-supervised semantic segmentation methods have been proposed to leverage large quantities of unlabeled imag…

    Submitted 17 June, 2024; originally announced July 2024.

    Comments: 12 pages, 13 figures, 8 tables

  3. arXiv:2407.12550  [pdf, other]

    cs.LG

    UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

    Authors: Yan Lin, Zeyu Zhou, Yicheng Liu, Haochen Lv, Haomin Wen, Tianyi Li, Yushuai Li, Christian S. Jensen, Shengnan Guo, Youfang Lin, Huaiyu Wan

    Abstract: Spatio-temporal (ST) trajectories are sequences of timestamped locations, which enable a variety of analyses that in turn enable important real-world applications. It is common to map trajectories to vectors, called embeddings, before subsequent analyses. Thus, the qualities of embeddings are very important. Methods for pre-training embeddings, which leverage unlabeled trajectories for training un…

    Submitted 17 July, 2024; originally announced July 2024.

  4. arXiv:2407.12247  [pdf, other]

    cs.CL

    Lacuna Language Learning: Leveraging RNNs for Ranked Text Completion in Digitized Coptic Manuscripts

    Authors: Lauren Levine, Cindy Tung Li, Lydia Bremer-McCollum, Nicholas Wagner, Amir Zeldes

    Abstract: Ancient manuscripts are frequently damaged, containing gaps in the text known as lacunae. In this paper, we present a bidirectional RNN model for character prediction of Coptic characters in manuscript lacunae. Our best model performs with 72% accuracy on single character reconstruction, but falls to 37% when reconstructing lacunae of various lengths. While not suitable for definitive manuscript r…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Machine Learning for Ancient Languages, ACL 2024 Workshop, 15 August 2024

    ACM Class: I.2.7

  5. Incremental high average-utility itemset mining: survey and challenges

    Authors: Jing Chen, Shengyi Yang, Weiping Ding, Peng Li, Aijun Liu, Hongjun Zhang, Tian Li

    Abstract: The High Average Utility Itemset Mining (HAUIM) technique, a variation of High Utility Itemset Mining (HUIM), uses the average utility of the itemsets. Historically, most HAUIM algorithms were designed for static databases. However, practical applications like market basket analysis and business decision-making necessitate regular updates of the database with new transactions. As a result, researc…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 25 pages, 23 figures

  6. arXiv:2407.10671  [pdf, other]

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a…

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  7. arXiv:2407.10427  [pdf, other]

    eess.IV cs.CV

    Transformer for Multitemporal Hyperspectral Image Unmixing

    Authors: Hang Li, Qiankun Dong, Xueshuo Xie, Xia Xu, Tao Li, Zhenwei Shi

    Abstract: Multitemporal hyperspectral image unmixing (MTHU) holds significant importance in monitoring and analyzing the dynamic changes of surface. However, compared to single-temporal unmixing, the multitemporal approach demands comprehensive consideration of information across different phases, rendering it a greater challenge. To address this challenge, we propose the Multitemporal Hyperspectral Image U…

    Submitted 15 July, 2024; originally announced July 2024.

  8. arXiv:2407.09508  [pdf, other]

    cs.HC cs.LG

    Focused State Recognition Using EEG with Eye Movement-Assisted Annotation

    Authors: Tian-Hua Li, Tian-Fang Ma, Dan Peng, Wei-Long Zheng, Bao-Liang Lu

    Abstract: With the rapid advancement in machine learning, the recognition and analysis of brain activity based on EEG and eye movement signals have attained a high level of sophistication. Utilizing deep learning models for learning EEG and eye movement features proves effective in classifying brain activities. A focused state indicates intense concentration on a task or thought. Distinguishing focused and…

    Submitted 15 June, 2024; originally announced July 2024.

  9. arXiv:2407.09007  [pdf, other]

    cs.CL

    Benchmarking Language Model Creativity: A Case Study on Code Generation

    Authors: Yining Lu, Dixuan Wang, Tianjian Li, Dongwei Jiang, Daniel Khashabi

    Abstract: As LLMs become increasingly prevalent, it is interesting to consider how "creative" these models can be. From cognitive science, creativity consists of at least two key characteristics: convergent thinking (purposefulness to achieve a given goal) and divergent thinking (adaptability to new environments or constraints) \citep{runco2003critical}. In this work, we introduce a framewor…

    Submitted 12 July, 2024; originally announced July 2024.

  10. arXiv:2407.08132  [pdf, other]

    cs.CV

    DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing

    Authors: Minghang Zhou, Tianyu Li, Chaofan Qiao, Dongyu Xie, Guoqing Wang, Ningjuan Ruan, Lin Mei, Yang Yang

    Abstract: Multispectral oriented object detection faces challenges due to both inter-modal and intra-modal discrepancies. Recent studies often rely on transformer-based models to address these issues and achieve cross-modal fusion detection. However, the quadratic computational complexity of transformers limits their performance. Inspired by the efficiency and lower complexity of Mamba in long sequence task…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 12 pages, 9 figures

  11. arXiv:2407.07835  [pdf, other]

    cs.CV cs.AI

    RoBus: A Multimodal Dataset for Controllable Road Networks and Building Layouts Generation

    Authors: Tao Li, Ruihang Li, Huangnan Zheng, Shanding Ye, Shijian Li, Zhijie Pan

    Abstract: Automated 3D city generation, focusing on road networks and building layouts, is in high demand for applications in urban design, multimedia games and autonomous driving simulations. The surge of generative AI facilitates designing city layouts based on deep learning models. However, the lack of high-quality datasets and benchmarks hinders the progress of these data-driven methods in generating ro…

    Submitted 10 July, 2024; originally announced July 2024.

  12. arXiv:2407.07436  [pdf, other]

    cs.IT math.OC

    Alternating Subspace Approximate Message Passing

    Authors: Xu Zhu, Yufei Ma, Xiaoguang Li, Tiejun Li

    Abstract: Numerous renowned algorithms for tackling the compressed sensing problem employ an alternating strategy, which typically involves data matching in one module and denoising in another. Based on an in-depth analysis of the connection between the message passing and operator splitting, we present a novel approach, the Alternating Subspace Method (ASM), which intuitively combines the principles of the…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 19 pages, 6 figures

    MSC Class: 94A12; 65F10; 90C06

  13. arXiv:2407.06116  [pdf]

    eess.IV cs.CV cs.LG

    Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology

    Authors: Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Nancy R. Newlin, Adam M. Saunders, Can Cui, Jia Li, Qi Liu, Ken S. Lau, Joseph T. Roland, Mary K Washington, Lori A. Coburn, Keith T. Wilson, Yuankai Huo, Bennett A. Landman

    Abstract: Understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions. H&E is widely available, however, cell subtyping often requires expert knowledge and the use of specialized stains. To reduce the annotation burden, AI has been proposed for the classification of cells on H&E. For example, the recent Colon Nucleus Identificati…

    Submitted 15 May, 2024; originally announced July 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.05602

  14. arXiv:2407.06027  [pdf, other]

    cs.CL

    PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

    Authors: Miao Zheng, Hao Liang, Fan Yang, Haoze Sun, Tianpeng Li, Lingchu Xiong, Yan Zhang, Youzhen Wu, Kun Li, Yanjun Shen, Mingan Lin, Tao Zhang, Guosheng Dong, Yujing Qiao, Kun Fang, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficul…

    Submitted 18 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  15. arXiv:2407.04996  [pdf, other]

    cs.LG cs.CV

    The Solution for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition

    Authors: Sishun Pan, Xixian Wu, Tingmin Li, Longfei Huang, Mingxu Feng, Zhonghua Wan, Yang Yang

    Abstract: This paper presents a data-free, parameter-isolation-based continual learning algorithm we developed for the sequential task continual learning track of the 2nd Greater Bay Area International Algorithm Competition. The method learns an independent parameter subspace for each task within the network's convolutional and linear layers and freezes the batch normalization layers after the first task. S…

    Submitted 6 July, 2024; originally announced July 2024.

  16. arXiv:2407.04965  [pdf, other]

    cs.CL

    Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression

    Authors: Zhichao Xu, Ashim Gupta, Tao Li, Oliver Bentham, Vivek Srikumar

    Abstract: Large language models (LLMs) are increasingly deployed in real-world scenarios with the help of recent model compression techniques. Such momentum towards local deployment means the use of compressed LLMs will widely impact a large population. However, prior analysis works often prioritize preserving perplexity, which is a direct analogy to training loss. The impact of compression method on othe…

    Submitted 10 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

  17. arXiv:2407.04949  [pdf, other]

    cs.LG cs.DC

    Beyond the Federation: Topology-aware Federated Learning for Generalization to Unseen Clients

    Authors: Mengmeng Ma, Tang Li, Xi Peng

    Abstract: Federated Learning is widely employed to tackle distributed sensitive data. Existing methods primarily focus on addressing in-federation data heterogeneity. However, we observed that they suffer from significant performance degradation when applied to unseen clients for out-of-federation (OOF) generalization. The recent attempts to address generalization to unseen clients generally struggle to sca…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  18. arXiv:2407.04938  [pdf, other]

    cs.CV

    SAM-Med3D-MoE: Towards a Non-Forgetting Segment Anything Model via Mixture of Experts for 3D Medical Image Segmentation

    Authors: Guoan Wang, Jin Ye, Junlong Cheng, Tianbin Li, Zhaolin Chen, Jianfei Cai, Junjun He, Bohan Zhuang

    Abstract: Volumetric medical image segmentation is pivotal in enhancing disease diagnosis, treatment planning, and advancing medical research. While existing volumetric foundation models for medical image segmentation, such as SAM-Med3D and SegVol, have shown remarkable performance on general organs and tumors, their ability to segment certain categories in clinical downstream tasks remains limited. Supervi…

    Submitted 5 July, 2024; originally announced July 2024.

    Journal ref: MICCAI 2024

  19. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: A modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  20. arXiv:2407.03641  [pdf, other]

    cs.LG

    Scalable Learned Model Soup on a Single GPU: An Efficient Subspace Training Strategy

    Authors: Tao Li, Weisen Jiang, Fanghui Liu, Xiaolin Huang, James T. Kwok

    Abstract: Pre-training followed by fine-tuning is widely adopted among practitioners. The performance can be improved by "model soups" \cite{wortsman2022model} via exploring various hyperparameter configurations. The Learned-Soup, a variant of model soups, significantly improves the performance but suffers from substantial memory and time costs due to the requirements of (i) having to load all fine-tuned mod…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  21. arXiv:2407.03040  [pdf, other]

    cs.CL cs.AI

    Raw Text is All you Need: Knowledge-intensive Multi-turn Instruction Tuning for Large Language Model

    Authors: Xia Hou, Qifeng Li, Jian Yang, Tongliang Li, Linzheng Chai, Xianjie Wu, Hangyuan Ji, Zhoujun Li, Jixuan Nie, Jingbo Dun, Wenfeng Song

    Abstract: Instruction tuning as an effective technique aligns the outputs of large language models (LLMs) with human preference. But how to generate the seasonal multi-turn dialogues from raw documents for instruction tuning still requires further exploration. In this paper, we present a novel framework named R2S that leverages the CoD-Chain of Dialogue logic to guide large language models (LLMs) in generat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

    MSC Class: 68T50

    ACM Class: I.2.7

  22. arXiv:2407.02846  [pdf, other]

    cs.CV

    Multi-Task Domain Adaptation for Language Grounding with 3D Objects

    Authors: Penglei Sun, Yaoxian Song, Xinglin Pan, Peijie Dong, Xiaofei Yang, Qiang Wang, Zhixu Li, Tiefeng Li, Xiaowen Chu

    Abstract: The existing works on object-level language grounding with 3D objects mostly focus on improving performance by utilizing the off-the-shelf pre-trained models to capture features, such as viewpoint selection or geometric priors. However, they have failed to consider exploring the cross-modal representation of language-vision alignment in the cross-domain field. To answer this problem, we propose a…

    Submitted 5 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  23. arXiv:2407.02830  [pdf, other]

    cs.CV eess.IV

    A Radiometric Correction based Optical Modeling Approach to Removing Reflection Noise in TLS Point Clouds of Urban Scenes

    Authors: Li Fang, Tianyu Li, Yanghong Lin, Shudong Zhou, Wei Yao

    Abstract: Point clouds are vital in computer vision tasks such as 3D reconstruction, autonomous driving, and robotics. However, TLS-acquired point clouds often contain virtual points from reflective surfaces, causing disruptions. This study presents a reflection noise elimination algorithm for TLS point clouds. Our innovative reflection plane detection algorithm, based on geometry-optical models and physica…

    Submitted 3 July, 2024; originally announced July 2024.

  24. arXiv:2407.02765  [pdf, ps, other]

    eess.SY cs.AI math.OC math.PR

    Graphon Particle Systems, Part II: Dynamics of Distributed Stochastic Continuum Optimization

    Authors: Yan Chen, Tao Li

    Abstract: We study the distributed optimization problem over a graphon with a continuum of nodes, which is regarded as the limit of the distributed networked optimization as the number of nodes goes to infinity. Each node has a private local cost function. The global cost function, which all nodes cooperatively minimize, is the integral of the local cost functions on the node set. We propose stochastic grad…

    Submitted 2 July, 2024; originally announced July 2024.

  25. arXiv:2407.02763  [pdf, other]

    cs.CV

    ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers

    Authors: Yanfeng Jiang, Ning Sun, Xueshuo Xie, Fei Yang, Tao Li

    Abstract: Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, while their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 28 pages, 9 figures

  26. arXiv:2407.02211  [pdf, other]

    cs.CL cs.AI cs.LG

    PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning

    Authors: Jiaru Zou, Mengyu Zhou, Tao Li, Shi Han, Dongmei Zhang

    Abstract: Large language models (LLMs) have played a fundamental role in various natural language processing tasks with powerful prompt techniques. However, in real-world applications, there are often similar prompt components for repeated queries, which causes significant computational burdens during inference. Existing prompt compression and direct fine-tuning methods aim to tackle these challenges, yet t…

    Submitted 2 July, 2024; originally announced July 2024.

  27. arXiv:2407.02081  [pdf, other]

    cs.DC

    On the Performance and Memory Footprint of Distributed Training: An Empirical Study on Transformers

    Authors: Zhengxian Lu, Fangyu Wang, Zhiwei Xu, Fei Yang, Tao Li

    Abstract: Transformer models have emerged as potent solutions to a wide array of multidisciplinary challenges. The deployment of Transformer architectures is significantly hindered by their extensive computational and memory requirements, necessitating the reliance on advanced efficient distributed training methodologies. Prior research has delved into the performance bottlenecks associated with distributed…

    Submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2407.01897  [pdf, other]

    cs.CL

    Proposal Report for the 2nd SciCAP Competition 2024

    Authors: Pengpeng Li, Tingmin Li, Jingyuan Wang, Boyuan Wang, Yang Yang

    Abstract: In this paper, we propose a method for document summarization using auxiliary information. This approach effectively summarizes descriptions related to specific images, tables, and appendices within lengthy texts. Our experiments demonstrate that leveraging high-quality OCR data and initially extracted information from the original text enables efficient summarization of the content related to des…

    Submitted 1 July, 2024; originally announced July 2024.

  29. arXiv:2407.01887  [pdf, other]

    cs.LG cs.AI cs.CL

    Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents

    Authors: Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li

    Abstract: In-context decision-making is an important capability of artificial general intelligence, which Large Language Models (LLMs) have effectively demonstrated in various scenarios. However, LLMs often face challenges when dealing with numerical contexts, and limited attention has been paid to evaluating their performance through preference feedback generated by the environment. This paper investigates…

    Submitted 1 July, 2024; originally announced July 2024.

  30. arXiv:2407.00875  [pdf, other]

    cs.CL cs.AI

    MoE-CT: A Novel Approach For Large Language Models Training With Resistance To Catastrophic Forgetting

    Authors: Tianhao Li, Shangjie Li, Binbin Xie, Deyi Xiong, Baosong Yang

    Abstract: The advent of large language models (LLMs) has predominantly catered to high-resource languages, leaving a disparity in performance for low-resource languages. Conventional Continual Training (CT) approaches to bridge this gap often undermine a model's original linguistic proficiency when expanding to multilingual contexts. Addressing this issue, we introduce a novel MoE-CT architecture, a paradig…

    Submitted 25 June, 2024; originally announced July 2024.

    Comments: 13 pages, 2 figures

  31. arXiv:2407.00315  [pdf, other]

    cs.CV

    Learning Unsupervised Gaze Representation via Eye Mask Driven Information Bottleneck

    Authors: Yangzhou Jiang, Yinxin Lin, Yaoming Wang, Teng Li, Bilian Ke, Bingbing Ni

    Abstract: Appearance-based supervised methods with full-face image input have made tremendous advances in recent gaze estimation tasks. However, intensive human annotation requirement inhibits current methods from achieving industrial level accuracy and robustness. Although current unsupervised pre-training frameworks have achieved success in many image recognition tasks, due to the deep coupling between fa…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 12 pages, 6 figures, 7 tables

  32. Personalized Federated Continual Learning via Multi-granularity Prompt

    Authors: Hao Yu, Xin Yang, Xin Gao, Yan Kang, Hao Wang, Junbo Zhang, Tianrui Li

    Abstract: Personalized Federated Continual Learning (PFCL) is a new practical scenario that poses greater challenges in sharing and personalizing knowledge. PFCL not only relies on knowledge fusion for server aggregation at the global spatial-temporal perspective but also needs model improvement for each client according to the local requirements. Existing methods, whether in Personalized Federated Learning…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024 Research Track

  33. arXiv:2406.18559  [pdf, other]

    cs.HC cs.AI cs.CV cs.LG

    Revision Matters: Generative Design Guided by Revision Edits

    Authors: Tao Li, Chin-Yi Cheng, Amber Xie, Gang Li, Yang Li

    Abstract: Layout design, such as user interface or graphical layout in general, is fundamentally an iterative revision process. Through revising a design repeatedly, the designer converges on an ideal layout. In this paper, we investigate how revision edits from human designers can benefit a multimodal generative model. To do so, we curate an expert dataset that traces how human designers iteratively edit an…

    Submitted 27 May, 2024; originally announced June 2024.

  34. arXiv:2406.18372  [pdf, ps, other]

    cs.AR

    A Lightweight Algorithm for Classifying Ex Vivo Tissues Samples

    Authors: Tzu-Hao Li, Ethan Murphy, Allaire Doussan, Ryan Halter, Kofi Odame

    Abstract: In this paper, we present a novel algorithm for classifying ex vivo tissue that comprises multi-channel bioimpedance analysis and a hardware neural network. When implemented in a mixed-signal 180 nm CMOS process, the classifier has an estimated power budget of 39 mW and an area of 30 mm². This means that the classifier can be integrated into the tip of a surgical margin assessment probe, for in vi…

    Submitted 26 June, 2024; originally announced June 2024.

  35. arXiv:2406.17841  [pdf, other]

    quant-ph cs.AI

    Probing many-body Bell correlation depth with superconducting qubits

    Authors: Ke Wang, Weikang Li, Shibo Xu, Mengyao Hu, Jiachen Chen, Yaozu Wu, Chuanyu Zhang, Feitong Jin, Xuhao Zhu, Yu Gao, Ziqi Tan, Aosai Zhang, Ning Wang, Yiren Zou, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Zixuan Song, Jinfeng Deng, Hang Dong, Xu Zhang, Pengfei Zhang, Wenjie Jiang, et al. (10 additional authors not shown)

    Abstract: Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures + 14 pages, 6 figures

  36. arXiv:2406.16615  [pdf, other]

    cs.CV

    The Championship-Winning Solution for the 5th CLVISION Challenge 2024

    Authors: Sishun Pan, Tingmin Li, Yang Yang

    Abstract: In this paper, we introduce our approach to the 5th CLVision Challenge, which presents distinctive challenges beyond traditional class incremental learning. Unlike standard settings, this competition features the recurrence of previously encountered classes and includes unlabeled data that may contain Out-of-Distribution (OOD) categories. Our approach is based on Winning Subnetworks to allocate in…

    Submitted 24 June, 2024; originally announced June 2024.

  37. Residual path integrals for re-rendering

    Authors: Bing Xu, Tzu-Mao Li, Iliyan Georgiev, Trevor Hedstrom, Ravi Ramamoorthi

    Abstract: Conventional rendering techniques are primarily designed and optimized for single-frame rendering. In practical applications, such as scene editing and animation rendering, users frequently encounter scenes where only a small portion is modified between consecutive frames. In this paper, we develop a novel approach to incremental re-rendering of scenes with dynamic objects, where only a small part…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 14 pages, 13 figures

    ACM Class: I.3.0

  38. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  39. arXiv:2406.16177  [pdf, other]

    cs.HC

    Flowy: Supporting UX Design Decisions Through AI-Driven Pattern Annotation in Multi-Screen User Flows

    Authors: Yuwen Lu, Ziang Tong, Qinyi Zhao, Yewon Oh, Bryan Wang, Toby Jia-Jun Li

    Abstract: Many recent AI-powered UX design tools focus on generating individual static UI screens from natural language. However, they overlook the crucial aspect of interactions and user experiences across multiple screens. Through formative studies with UX professionals, we identified limitations of these tools in supporting realistic UX design workflows. In response, we designed and developed Flowy, an a…

    Submitted 23 June, 2024; originally announced June 2024.

  40. arXiv:2406.16173  [pdf, other]

    cs.HC

    Crepe: A Mobile Screen Data Collector Using Graph Query

    Authors: Yuwen Lu, Meng Chen, Qi Zhao, Victor Cox, Yang Yang, Meng Jiang, Jay Brockman, Tamara Kay, Toby Jia-Jun Li

    Abstract: Collecting mobile datasets remains challenging for academic researchers due to limited data access and technical barriers. Commercial organizations often possess exclusive access to mobile data, leading to a "data monopoly" that restricts the independence of academic research. Existing open-source mobile data collection frameworks primarily focus on mobile sensing data rather than screen content,…

    Submitted 23 June, 2024; originally announced June 2024.

  41. arXiv:2406.16062  [pdf, other]

    cs.NE

    Towards Biologically Plausible Computing: A Comprehensive Comparison

    Authors: Changze Lv, Yufei Gu, Zhengkang Guo, Zhibo Xu, Yixin Wu, Feiran Zhang, Tianyuan Shi, Zhenghua Wang, Ruicheng Yin, Yu Shang, Siqi Zhong, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Jianhao Zhu, Cenyuan Zhang, Zixuan Ling, Xiaoqing Zheng

    Abstract: Backpropagation is a cornerstone algorithm in training neural networks for supervised learning, which uses a gradient descent method to update network weights by minimizing the discrepancy between actual and desired outputs. Despite its pivotal role in propelling deep learning advancements, the biological plausibility of backpropagation is questioned due to its requirements for weight symmetry, gl…

    Submitted 23 June, 2024; originally announced June 2024.

  42. arXiv:2406.15349  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

    Authors: Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, Kashyap Chitta

    Abstract: Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance. On the other, closed-loop evaluation is possible in simulation, but is hard to scale due to its significant computational demands. Further, the simulators available today exhibit a large domain gap to real data. This has resu…

    Submitted 21 June, 2024; originally announced June 2024.

  43. arXiv:2406.14401  [pdf, other]

    cs.LG cs.AI

    Fair Streaming Feature Selection

    Authors: Zhangling Duan, Tianci Li, Xingyu Wu, Zhaolong Ling, Jingye Yang, Zhaohong Jia

    Abstract: Streaming feature selection techniques have become essential in processing real-time data streams, as they facilitate the identification of the most relevant attributes from continuously updating information. Despite their performance, current algorithms for streaming feature selection frequently fall short in managing biases and avoiding discrimination that could be perpetuated by sensitive attrib…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 30 pages, 10 figures

  44. arXiv:2406.13036  [pdf, other]

    stat.ML cs.LG math.PR math.ST stat.CO

    Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities

    Authors: Matthew T. C. Li, Tiangang Cui, Fengyi Li, Youssef Marzouk, Olivier Zahm

    Abstract: Identifying low-dimensional structure in high-dimensional probability measures is an essential pre-processing step for efficient sampling. We introduce a method for identifying and approximating a target measure $π$ as a perturbation of a given reference measure $μ$ along a few significant directions of $\mathbb{R}^{d}$. The reference measure can be a Gaussian or a nonlinear transformation of a Ga…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  45. arXiv:2406.12193  [pdf, other]

    cs.LG

    Adaptive Collaborative Correlation Learning-based Semi-Supervised Multi-Label Feature Selection

    Authors: Yanyong Huang, Li Yang, Dongjie Wang, Ke Li, Xiuwen Yi, Fengmao Lv, Tianrui Li

    Abstract: Semi-supervised multi-label feature selection has recently been developed to solve the curse of dimensionality problem in high-dimensional multi-label data with certain samples missing labels. Although many efforts have been made, most existing methods use a predefined graph approach to capture the sample similarity or the label correlation. In this manner, the presence of noise and outliers withi…

    Submitted 25 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  46. arXiv:2406.11939  [pdf, other]

    cs.LG cs.AI cs.CL

    From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

    Authors: Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica

    Abstract: The rapid evolution of language models has necessitated the development of more challenging benchmarks. Current static benchmarks often struggle to consistently distinguish between the capabilities of different models and fail to align with real-world user preferences. On the other hand, live crowd-sourced platforms like the Chatbot Arena collect a wide range of natural prompts and user feedback.…

    Submitted 17 June, 2024; originally announced June 2024.

  47. arXiv:2406.11838  [pdf, other]

    cs.CV

    Autoregressive Image Generation without Vector Quantization

    Authors: Tianhong Li, Yonglong Tian, He Li, Mingyang Deng, Kaiming He

    Abstract: Conventional wisdom holds that autoregressive models for image generation are typically accompanied by vector-quantized tokens. We observe that while a discrete-valued space can facilitate representing a categorical distribution, it is not a necessity for autoregressive modeling. In this work, we propose to model the per-token probability distribution using a diffusion procedure, which allows us t…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Tech report

  48. arXiv:2406.10976  [pdf, other]

    cs.LG cs.CL cs.CR

    Promoting Data and Model Privacy in Federated Learning through Quantized LoRA

    Authors: JianHao Zhu, Changze Lv, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang

    Abstract: Conventional federated learning primarily aims to secure the privacy of data distributed across multiple edge devices, with the global model dispatched to edge devices for parameter updates during the learning process. However, the development of large language models (LLMs) requires substantial data and computational resources, rendering them valuable intellectual properties for their developers…

    Submitted 16 June, 2024; originally announced June 2024.

  49. arXiv:2406.10534  [pdf, other]

    cs.LG cs.AI physics.flu-dyn

    A Finite Difference Informed Graph Network for Solving Steady-State Incompressible Flows on Block-Structured Grids

    Authors: Yiye Zou, Tianyu Li, Shufan Zou, Jingyu Wang, Laiping Zhang, Xiaogang Deng

    Abstract: Recently, advancements in deep learning have enabled physics-informed neural networks (PINNs) to solve partial differential equations (PDEs). Numerical differentiation (ND) using the finite difference (FD) method is efficient in physics-constrained designs, even in parameterized settings, often employing body-fitted block-structured grids for complex flow cases. However, convolution operators in C…

    Submitted 15 June, 2024; originally announced June 2024.

  50. arXiv:2406.09973  [pdf, other]

    cs.CV

    InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning

    Authors: Tiancheng Li, Jinxiu Liu, Huajun Chen, Qi Liu

    Abstract: Instruction-based image editing has made great progress in using natural human language to manipulate the visual content of images. However, existing models are limited by the quality of the dataset and cannot accurately localize editing regions in images with complex object relationships. In this paper, we propose Reinforcement Learning Guided Image Editing Method (InstructRL4Pix) to train a diff…

    Submitted 14 June, 2024; originally announced June 2024.