Showing 1–50 of 193 results for author: Tao, Y

Searching in archive cs.

  1. arXiv:2407.12117  [pdf, other]

    cs.LG cs.DC

    Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs

    Authors: Pinxue Zhao, Hailin Zhang, Fangcheng Fu, Xiaonan Nie, Qibin Liu, Fang Yang, Yuanbo Peng, Dian Jiao, Shuaipeng Li, Jinbao Xue, Yangyu Tao, Bin Cui

    Abstract: Nowadays, Large Language Models (LLMs) have been trained using extended context lengths to foster more creative applications. However, long context training poses great challenges considering the constraint of GPU memory. It not only leads to substantial activation memory consumption during training, but also incurs considerable memory fragmentation. To facilitate long context training, existing f… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2407.07333  [pdf, other]

    cs.LG cs.AI stat.ML

    Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy

    Authors: Cameron Allen, Aaron Kirtland, Ruo Yu Tao, Sam Lobel, Daniel Scott, Nicholas Petrocelli, Omer Gottesman, Ronald Parr, Michael L. Littman, George Konidaris

    Abstract: Reinforcement learning algorithms typically rely on the assumption that the environment dynamics and value function can be expressed in terms of a Markovian state representation. However, when state information is only partially observable, how can an agent learn such a state representation, and how can it detect when it has found one? We introduce a metric that can accomplish both objectives, wit… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: GitHub URL: https://github.com/brownirl/lambda_discrepancy

  3. arXiv:2407.06153  [pdf, other]

    cs.SE cs.CL

    What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

    Authors: Shihan Dou, Haoxiang Jia, Shenxi Wu, Huiyuan Zheng, Weikang Zhou, Muling Wu, Mingxu Chai, Jessica Fan, Caishuang Huang, Yunbo Tao, Yan Liu, Enyu Zhou, Ming Zhang, Yuhao Zhou, Yueming Wu, Rui Zheng, Ming Wen, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang

    Abstract: The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers. To enhance LLM-based code generation ability, current efforts are predominantly directed towards collecting high-quality datasets and leveraging diverse training technologies. However, there is a notable lack of comprehensive studies examining the limitations and boundar… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 17 pages, 7 figures

  4. arXiv:2407.05324  [pdf, other]

    cs.CV

    PICA: Physics-Integrated Clothed Avatar

    Authors: Bo Peng, Yunfan Tao, Haoyu Zhan, Yudong Guo, Juyong Zhang

    Abstract: We introduce PICA, a novel representation for high-fidelity animatable clothed human avatars with physics-accurate dynamics, even for loose clothing. Previous neural rendering-based representations of animatable clothed humans typically employ a single model to represent both the clothing and the underlying body. While efficient, these approaches often fail to accurately represent complex garment… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Project page: https://ustc3dv.github.io/PICA/

  5. Consistency and Discrepancy-Based Contrastive Tripartite Graph Learning for Recommendations

    Authors: Linxin Guo, Yaochen Zhu, Min Gao, Yinghui Tao, Junliang Yu, Chen Chen

    Abstract: Tripartite graph-based recommender systems markedly diverge from traditional models by recommending unique combinations such as user groups and item bundles. Despite their effectiveness, these systems exacerbate the longstanding cold-start problem in traditional recommender systems, because any number of user groups or item bundles can be formed among users or items. To address this issue, we intr… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  6. arXiv:2407.01638  [pdf, other]

    cs.SE cs.AI cs.DC cs.PL

    LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes

    Authors: Matthew T. Dearing, Yiheng Tao, Xingfu Wu, Zhiling Lan, Valerie Taylor

    Abstract: This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of millions to billions of codes. To tackle this problem, we propose an automated pipeline framework, called LASSI, designed to translate between parallel programming… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  7. arXiv:2406.19964  [pdf, other]

    cs.CR

    Secure Outsourced Decryption for FHE-based Privacy-preserving Cloud Computing

    Authors: Xirong Ma, Chuan Li, Yuchang Hu, Yunting Tao, Yali Jiang, Yanbin Li, Fanyu Kong, Chunpeng Ge

    Abstract: The demand for processing vast volumes of data has surged dramatically due to the advancement of machine learning technology. Large-scale data processing necessitates substantial computational resources, prompting individuals and enterprises to turn to cloud services. Accompanying this trend is a growing concern regarding data leakage and misuse. Homomorphic encryption (HE) is one solution for saf… ▽ More

    Submitted 9 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: content and title updated

  8. arXiv:2406.17807  [pdf, other]

    cs.CL cs.AI

    Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary

    Authors: Meiling Tao, Xuechen Liang, Yiling Tao, Tianyu Shi

    Abstract: Recent advancements in large language models (LLMs) have unlocked the potential for generating high-quality game commentary. However, producing insightful and engaging commentary for complex games with incomplete information remains a significant challenge. In this paper, we introduce a novel commentary method that combines Reinforcement Learning (RL) and LLMs, tailored specifically for the Chinese… ▽ More

    Submitted 4 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  9. arXiv:2406.17608  [pdf, other]

    cs.CV

    Test-Time Generative Augmentation for Medical Image Segmentation

    Authors: Xiao Ma, Yuhui Tao, Yuhan Zhang, Zexuan Ji, Yizhe Zhang, Qiang Chen

    Abstract: In this paper, we propose a novel approach to enhance medical image segmentation during test time. Instead of employing hand-crafted transforms or functions on the input test image to create multiple views for test-time augmentation, we advocate for the utilization of an advanced domain-fine-tuned generative model (GM), e.g., stable diffusion (SD), for test-time augmentation. Given that the GM has… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 12 pages, 2 figures

  10. arXiv:2406.17249  [pdf, other]

    cs.RO

    SlideSLAM: Sparse, Lightweight, Decentralized Metric-Semantic SLAM for Multi-Robot Navigation

    Authors: Xu Liu, Jiuzhou Lei, Ankit Prabhu, Yuezhan Tao, Igor Spasojevic, Pratik Chaudhari, Nikolay Atanasov, Vijay Kumar

    Abstract: This paper develops a real-time decentralized metric-semantic Simultaneous Localization and Mapping (SLAM) approach that leverages a sparse and lightweight object-based representation to enable a heterogeneous robot team to autonomously explore 3D environments featuring indoor, urban, and forested areas without relying on GPS. We use a hierarchical metric-semantic representation of the environment… ▽ More

    Submitted 2 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: Preliminary release

  11. arXiv:2406.03510  [pdf, other]

    cs.SD cs.AI eess.AS

    Speech-based Clinical Depression Screening: An Empirical Study

    Authors: Yangbin Chen, Chenyang Xu, Chunfeng Liang, Yanbao Tao, Chuan Shi

    Abstract: This study investigates the utility of speech signals for AI-based depression screening across varied interaction scenarios, including psychiatric interviews, chatbot conversations, and text readings. Participants include depressed patients recruited from the outpatient clinics of Peking University Sixth Hospital and control group members from the community, all diagnosed by psychiatrists followin… ▽ More

    Submitted 12 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures

  12. arXiv:2406.01555  [pdf, other]

    cs.CV

    Towards Flexible Interactive Reflection Removal with Human Guidance

    Authors: Xiao Chen, Xudong Jiang, Yunkang Tao, Zhen Lei, Qing Li, Chenyang Lei, Zhaoxiang Zhang

    Abstract: Single image reflection removal is inherently ambiguous, as both the reflection and transmission components requiring separation may follow natural image statistics. Existing methods attempt to address the issue by using various types of low-level and physics-based cues as sources of reflection signals. However, these cues are not universally applicable, since they are only observable in specific… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  13. arXiv:2405.17051  [pdf, other]

    cs.LG cs.AI

    BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics

    Authors: Hao Wu, Xingjian Shi, Ziyue Huang, Penghao Zhao, Wei Xiong, Jinbao Xue, Yangyu Tao, Xiaomeng Huang, Weiyan Wang

    Abstract: Data-driven deep learning has emerged as the new paradigm to model complex physical space-time systems. These data-driven methods learn patterns by optimizing statistical metrics and tend to overlook the adherence to physical laws, unlike traditional model-driven numerical methods. Thus, they often generate predictions that are not physically realistic. On the other hand, by sampling a large amoun… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  14. arXiv:2405.14578  [pdf, other]

    cs.LG

    Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

    Authors: Shuaipeng Li, Penghao Zhao, Hailin Zhang, Xingwu Sun, Hao Wu, Dian Jiao, Weiyan Wang, Chengjun Liu, Zheng Fang, Jinbao Xue, Yangyu Tao, Bin Cui, Di Wang

    Abstract: In current deep learning tasks, Adam style optimizers such as Adam, Adagrad, RMSProp, Adafactor, and Lion have been widely used as alternatives to SGD style optimizers. These optimizers typically update model parameters using the sign of gradients, resulting in more stable convergence curves. The learning rate and the batch size are the most critical hyperparameters for optimizers, which require c… ▽ More

    Submitted 4 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  15. arXiv:2405.10691  [pdf, other]

    eess.IV cs.CV

    LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

    Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

    Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infant brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets wit… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

  16. arXiv:2405.08748  [pdf, other]

    cs.CV

    Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

    Authors: Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, et al. (20 additional authors not shown)

    Abstract: We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Mu… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Project Page: https://dit.hunyuan.tencent.com/

  17. arXiv:2405.05931  [pdf]

    cs.HC

    Harms in Repurposing Real-World Sensory Cues for Mixed Reality: A Causal Perspective

    Authors: Yujie Tao, Sean Follmer

    Abstract: The rise of Mixed Reality (MR) stimulates new interactive techniques that seamlessly blend the virtual and physical environments. Just as virtual content could be overlayed onto the physical world for providing adaptive user interfaces [5, 8], emergent techniques "repurpose" everyday environments and sensory cues to support the virtual content [7, 9, 13-15]. For instance, a strong wind gust in the… ▽ More

    Submitted 23 April, 2024; originally announced May 2024.

    Comments: This is an accepted position statement of CHI 2024 Workshop (Novel Approaches for Understanding and Mitigating Emerging New Harms in Immersive and Embodied Virtual Spaces: A Workshop at CHI 2024)

  18. arXiv:2405.03119  [pdf, ps, other]

    cs.IT eess.SP

    DAFT-Spread Affine Frequency Division Multiple Access for Downlink Transmission

    Authors: Yiwei Tao, Miaowen Wen, Yao Ge, Tianqi Mao, Lixia Xiao, Jun Li

    Abstract: Affine frequency division multiplexing (AFDM) and orthogonal AFDM access (O-AFDMA) are promising techniques based on chirp signals, which are able to suppress the performance deterioration caused by Doppler shifts in high-mobility scenarios. However, the high peak-to-average power ratio (PAPR) in AFDM or O-AFDMA is still a crucial problem, which severely limits their practical applications. In thi… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  19. arXiv:2405.00216  [pdf, other]

    cs.CL cs.AI cs.LG

    Graphical Reasoning: LLM-based Semi-Open Relation Extraction

    Authors: Yicheng Tao, Yiqun Wang, Longju Bai

    Abstract: This paper presents a comprehensive exploration of relation extraction utilizing advanced language models, specifically Chain of Thought (CoT) and Graphical Reasoning (GRE) techniques. We demonstrate how leveraging in-context learning with GPT-3.5 can significantly enhance the extraction process, particularly through detailed example-based reasoning. Additionally, we introduce a novel graphical re… ▽ More

    Submitted 30 April, 2024; originally announced May 2024.

  20. arXiv:2404.08564  [pdf, ps, other]

    cs.LG

    Federated Distillation: A Survey

    Authors: Lin Li, Jianping Gou, Baosheng Yu, Lan Du, Zhang Yi, Dacheng Tao

    Abstract: Federated Learning (FL) seeks to train a model collaboratively without sharing private training data from individual clients. Despite its promise, FL encounters challenges such as high communication costs for large-scale models and the necessity for uniform model architectures across all clients and the server. These challenges severely restrict the practical applications of FL. To address these l… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  21. arXiv:2404.00769  [pdf, other]

    cs.RO

    An Active Perception Game for Robust Autonomous Exploration

    Authors: Siming He, Yuezhan Tao, Igor Spasojevic, Vijay Kumar, Pratik Chaudhari

    Abstract: We formulate active perception for an autonomous agent that explores an unknown environment as a two-player zero-sum game: the agent aims to maximize information gained from the environment while the environment aims to minimize the information gained by the agent. In each episode, the environment reveals a set of actions with their potentially erroneous information gain. In order to select the be… ▽ More

    Submitted 31 March, 2024; originally announced April 2024.

  22. arXiv:2404.00588  [pdf, other]

    cs.CV cs.AI

    Memory-based Cross-modal Semantic Alignment Network for Radiology Report Generation

    Authors: Yitian Tao, Liyan Ma, Jing Yu, Han Zhang

    Abstract: Generating radiology reports automatically reduces the workload of radiologists and helps the diagnosis of specific diseases. Many existing methods treat this task as a modality transfer process. However, since the key information related to disease accounts for a small proportion in both image and report, it is hard for the model to learn the latent relation between the radiology image and its repor… ▽ More

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 12 pages, 8 figures

  23. arXiv:2403.20058  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks

    Authors: Luoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang

    Abstract: Simultaneous functional PET/MR (sf-PET/MR) presents a cutting-edge multimodal neuroimaging technique. It provides an unprecedented opportunity for concurrently monitoring and integrating multifaceted brain networks built by spatiotemporally covaried metabolic activity, neural activity, and cerebral blood flow (perfusion). Albeit high scientific/clinical values, short in hardware accessibility of P… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 11 pages

  24. arXiv:2403.18361  [pdf, other]

    cs.CV

    ViTAR: Vision Transformer with Any Resolution

    Authors: Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

    Abstract: This paper tackles a significant challenge faced by Vision Transformers (ViTs): their constrained scalability across different image resolutions. Typically, ViTs experience a performance decline when processing resolutions different from those seen during training. Our work introduces two key innovations to address this issue. Firstly, we propose a novel module for dynamic resolution adjustment, d… ▽ More

    Submitted 28 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  25. arXiv:2403.18121  [pdf, other]

    cs.CL cs.HC

    ChatGPT Role-play Dataset: Analysis of User Motives and Model Naturalness

    Authors: Yufei Tao, Ameeta Agrawal, Judit Dombi, Tetyana Sydorenko, Jung In Lee

    Abstract: Recent advances in interactive large language models like ChatGPT have revolutionized various domains; however, their behavior in natural and role-play conversation settings remains underexplored. In our study, we address this gap by deeply investigating how ChatGPT behaves during conversations in different settings by analyzing its interactions in both a normal way and a role-play setting. We int… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by LREC-COLING 2024

  26. arXiv:2403.17067  [pdf, other]

    cs.RO

    Trajectory Optimization with Global Yaw Parameterization for Field-of-View Constrained Autonomous Flight

    Authors: Yuwei Wu, Yuezhan Tao, Igor Spasojevic, Vijay Kumar

    Abstract: Trajectory generation for quadrotors with limited field-of-view sensors has numerous applications such as aerial exploration, coverage, inspection, videography, and target tracking. Most previous works simplify the task of optimizing yaw trajectories by either aligning the heading of the robot with its velocity, or potentially restricting the feasible space of candidate trajectories by using a lim… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  27. arXiv:2403.14020  [pdf, other]

    cs.CR cs.NI

    Zero-Knowledge Proof of Distinct Identity: a Standard-compatible Sybil-resistant Pseudonym Extension for C-ITS

    Authors: Ye Tao, Hongyi Wu, Ehsan Javanmardi, Manabu Tsukada, Hiroshi Esaki

    Abstract: Pseudonyms are widely used in Cooperative Intelligent Transport Systems (C-ITS) to protect the location privacy of vehicles. However, the unlinkability nature of pseudonyms also enables Sybil attacks, where a malicious vehicle can pretend to be multiple vehicles at the same time. In this paper, we propose a novel protocol called zero-knowledge Proof of Distinct Identity (zk-PoDI), which allows a v… ▽ More

    Submitted 3 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted for publication at IEEE IV 2024

  28. arXiv:2403.10287  [pdf, other]

    cs.CV

    Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models

    Authors: Tian Meng, Yang Tao, Ruilin Lyu, Wuliang Yin

    Abstract: The task of few-shot image classification and segmentation (FS-CS) involves classifying and segmenting target objects in a query image, given only a few examples of the target classes. We introduce the Vision-Instructed Segmentation and Evaluation (VISE) method that transforms the FS-CS problem into the Visual Question Answering (VQA) problem, utilising Vision-Language Models (VLMs), and addresses… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

  29. arXiv:2403.07191  [pdf, other]

    cs.LG cs.AI cs.CL

    $\mathbf{(N,K)}$-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model

    Authors: Yufeng Zhang, Liyu Chen, Boyi Liu, Yingxiang Yang, Qiwen Cui, Yunzhe Tao, Hongxia Yang

    Abstract: Recent advances in reinforcement learning (RL) algorithms aim to enhance the performance of language models at scale. Yet, there is a noticeable absence of a cost-effective and standardized testbed tailored to evaluating and comparing these algorithms. To bridge this gap, we present a generalized version of the 24-Puzzle: the $(N,K)$-Puzzle, which challenges language models to reach a target value… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 8 pages

  30. arXiv:2403.06877  [pdf, other]

    cs.RO cs.CV

    SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields for Robotic Inspection

    Authors: Yifu Tao, Yash Bhalgat, Lanke Frank Tarimo Fu, Matias Mattamala, Nived Chebrolu, Maurice Fallon

    Abstract: We present a neural-field-based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photo-realistic textures. This system adapts the state-of-the-art neural radiance field (NeRF) representation to also incorporate lidar data which adds strong geometric constraints on the depth and surface normals. W… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted at ICRA 2024; Website: https://ori-drs.github.io/projects/silvr/

  31. arXiv:2403.01487  [pdf, other]

    cs.CV

    InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding

    Authors: Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

    Abstract: Multimodal Large Language Models (MLLMs) have experienced significant advancements recently. Nevertheless, challenges persist in the accurate recognition and comprehension of intricate details within high-resolution images. Despite being indispensable for the development of robust MLLMs, this area remains underinvestigated. To tackle this challenge, our work introduces InfiMM-HD, a novel architect… ▽ More

    Submitted 3 March, 2024; originally announced March 2024.

  32. arXiv:2402.19273  [pdf, other]

    cs.CL

    PlanGPT: Enhancing Urban Planning with Tailored Language Model and Efficient Retrieval

    Authors: He Zhu, Wenjia Zhang, Nuoxian Huang, Boyang Li, Luyao Niu, Zipei Fan, Tianle Lun, Yicheng Tao, Junyou Su, Zhaoya Gong, Chenyu Fang, Xing Liu

    Abstract: In the field of urban planning, general-purpose large language models often struggle to meet the specific needs of planners. Tasks like generating urban planning texts, retrieving related information, and evaluating planning documents pose unique challenges. To enhance the efficiency of urban professionals and overcome these obstacles, we introduce PlanGPT, the first specialized Large Language Mod… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

  33. arXiv:2402.13497  [pdf, other]

    cs.CV

    Push Quantization-Aware Training Toward Full Precision Performances via Consistency Regularization

    Authors: Junbiao Pang, Tianyang Cai, Baochang Zhang, Jiaqi Wu, Ye Tao

    Abstract: Existing Quantization-Aware Training (QAT) methods intensively depend on the complete labeled dataset or knowledge distillation to guarantee the performances toward Full Precision (FP) accuracies. However, empirical results show that QAT still has inferior results compared to its FP counterpart. One question is how to push QAT toward or even surpass FP performances. In this paper, we address this… ▽ More

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 11 pages, 5 figures

  34. 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

    Authors: Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, Lutz Eckstein

    Abstract: Storing and transmitting LiDAR point cloud data is essential for many AV applications, such as training data collection, remote control, cloud services or SLAM. However, due to the sparsity and unordered structure of the data, it is difficult to compress point cloud data to a low volume. Transforming the raw point cloud data into a dense 2D matrix structure is a promising way for applying compress… ▽ More

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: Code: https://github.com/ika-rwth-aachen/Point-Cloud-Compression

    Journal ref: 2022 IEEE Intelligent Vehicles Symposium (IV)

  35. arXiv:2402.09994  [pdf, ps, other]

    cs.GT math.OC

    Approximating Competitive Equilibrium by Nash Welfare

    Authors: Jugal Garg, Yixin Tao, László A. Végh

    Abstract: We explore the relationship between two popular concepts on allocating divisible items: competitive equilibrium (CE) and allocations with maximum Nash welfare, i.e., allocations where the weighted geometric mean of the utilities is maximal. When agents have homogeneous concave utility functions, these two concepts coincide: the classical Eisenberg-Gale convex program that maximizes Nash welfare ov… ▽ More

    Submitted 15 February, 2024; originally announced February 2024.

  36. arXiv:2402.09450  [pdf, other]

    eess.SP cs.AI cs.LG

    Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram

    Authors: Yeongyeon Na, Minje Park, Yunwon Tae, Sunghoon Joo

    Abstract: Electrocardiograms (ECG) are widely employed as a diagnostic tool for monitoring electrical signals originating from a heart. Recent machine learning research efforts have focused on the application of screening various diseases using ECG signals. However, adapting to the application of screening disease is challenging in that labeled ECG data are limited. Achieving general representation through… ▽ More

    Submitted 19 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICLR 2024. The first three authors contribute equally

  37. arXiv:2402.07076  [pdf, other]

    cs.IR cs.AI

    Enhancing Multi-field B2B Cloud Solution Matching via Contrastive Pre-training

    Authors: Haonan Chen, Zhicheng Dou, Xuetong Hao, Yunhao Tao, Shiren Song, Zhenli Sheng

    Abstract: Cloud solutions have gained significant popularity in the technology industry as they offer a combination of services and tools to tackle specific problems. However, despite their widespread use, the task of identifying appropriate company customers for a specific target solution to the sales team of a solution provider remains a complex business problem that existing matching systems have yet to… ▽ More

    Submitted 6 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: KDD 2024, ADS Track

  38. arXiv:2402.00143  [pdf, other]

    cs.CL cs.HC

    Making a Long Story Short in Conversation Modeling

    Authors: Yufei Tao, Tiernan Mines, Ameeta Agrawal

    Abstract: Conversation systems accommodate diverse users with unique personalities and distinct writing styles. Within the domain of multi-turn dialogue modeling, this work studies the impact of varied utterance lengths on the quality of subsequent responses generated by conversation models. Using GPT-3 as the base model, multiple dialogue datasets, and several metrics, we conduct a thorough exploration of… ▽ More

    Submitted 31 January, 2024; originally announced February 2024.

    Comments: This paper was accepted by TEICAI workshop at EACL 2024

  39. arXiv:2401.18064  [pdf, other]

    cs.IR cs.DB

    Neural Locality Sensitive Hashing for Entity Blocking

    Authors: Runhui Wang, Luyang Kong, Yefan Tao, Andrew Borthwick, Davor Golac, Henrik Johnson, Shadie Hijazi, Dong Deng, Yongfeng Zhang

    Abstract: Locality-sensitive hashing (LSH) is a fundamental algorithmic technique widely employed in large-scale data processing applications, such as nearest-neighbor search, entity resolution, and clustering. However, its applicability in some real-world scenarios is limited due to the need for careful design of hashing functions that align with specific metrics. Existing LSH-based Entity Blocking solutio… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

  40. arXiv:2401.12237  [pdf, other]

    math.AT cs.LG q-bio.QM

    A distribution-guided Mapper algorithm

    Authors: Yuyang Tao, Shufei Ge

    Abstract: Motivation: The Mapper algorithm is an essential tool to explore the shape of data in topological data analysis. With a dataset as an input, the Mapper algorithm outputs a graph representing the topological features of the whole dataset. This graph is often regarded as an approximation of a Reeb graph of data. The classic Mapper algorithm uses fixed interval lengths and overlapping ratios, which might fa… ▽ More

    Submitted 19 January, 2024; originally announced January 2024.

  41. arXiv:2401.11018  [pdf, other]

    cs.LG cs.DC

    Communication Efficient and Provable Federated Unlearning

    Authors: Youming Tao, Cheng-Long Wang, Miao Pan, Dongxiao Yu, Xiuzhen Cheng, Di Wang

    Abstract: We study federated unlearning, a novel problem to eliminate the impact of specific clients or data points on the global model learned via federated learning (FL). This problem is driven by the right to be forgotten and the privacy challenges in FL. We introduce a new framework for exact federated unlearning that meets two essential criteria: communication efficiency and exact unle… ▽ More

    Submitted 19 January, 2024; originally announced January 2024.

  42. arXiv:2401.08733  [pdf, other]

    cs.SI

    In the Eyes of the Bystander: Are the Stances on Different Conflicts Correlated?

    Authors: Yiyao Tao, Hengyu Zhang, Babli Dey, Selenge Tulga, Hanjia Lyu, Jiebo Luo

    Abstract: Public opinion on international conflicts, such as the concurrent Russia-Ukraine and Israel-Palestine crises, often reflects a society's values, beliefs, and history. These simultaneous conflicts have sparked heated global online discussions, offering a unique opportunity to explore the dynamics of public opinion in multiple international crises. This study investigates how public opinions toward… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

  43. arXiv:2401.01516  [pdf, ps, other]

    cs.GT

    A Complete Landscape for the Price of Envy-Freeness

    Authors: Zihao Li, Shengxin Liu, Xinhang Lu, Biaoshuai Tao, Yichen Tao

    Abstract: We study the efficiency of fair allocations using the well-studied price of fairness concept, which quantitatively measures the worst-case efficiency loss when imposing fairness constraints. Previous works provided partial results on the price of fairness with well-known fairness notions such as envy-freeness up to one good (EF1) and envy-freeness up to any good (EFX). In this paper, we give a com…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Appears in the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024

  44. arXiv:2312.12797  [pdf, ps, other]

    cs.DB cs.DS

    Join Sampling under Acyclic Degree Constraints and (Cyclic) Subgraph Sampling

    Authors: Ru Wang, Yufei Tao

    Abstract: Given a join with an acyclic set of degree constraints, we show how to draw a uniformly random sample from the join result in $O(\mathit{polymat}/ \max \{1, \mathrm{OUT} \})$ expected time after a preprocessing of $O(\mathrm{IN})$ expected time, where $\mathrm{IN}$, $\mathrm{OUT}$, and $\mathit{polymat}$ are the join's input size, output size, and polymatroid bound, respectively. This compares fav…

    Submitted 20 December, 2023; originally announced December 2023.

  45. Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems

    Authors: Zhangchi Qiu, Ye Tao, Shirui Pan, Alan Wee-Chung Liew

    Abstract: Conversational recommender systems (CRS) utilize natural language interactions and dialogue history to infer user preferences and provide accurate recommendations. Due to the limited conversation context and background knowledge, existing CRSs rely on external sources such as knowledge graphs to enrich the context and model entities based on their inter-relations. However, these methods ignore the…

    Submitted 1 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

  46. arXiv:2312.07948  [pdf, other]

    cs.NI cs.CR

    Zero-Knowledge Proof of Traffic: A Deterministic and Privacy-Preserving Cross Verification Mechanism for Cooperative Perception Data

    Authors: Ye Tao, Ehsan Javanmardi, Pengfei Lin, Jin Nakazato, Yuze Jiang, Manabu Tsukada, Hiroshi Esaki

    Abstract: Cooperative perception is crucial for connected automated vehicles in intelligent transportation systems (ITSs); however, ensuring the authenticity of perception data remains a challenge as the vehicles cannot verify events that they do not witness independently. Various studies have been conducted on establishing the authenticity of data, such as trust-based statistical methods and plausibility-b…

    Submitted 13 December, 2023; originally announced December 2023.

  47. arXiv:2311.18724  [pdf, other]

    cs.IR

    Routing-Guided Learned Product Quantization for Graph-Based Approximate Nearest Neighbor Search

    Authors: Qiang Yue, Xiaoliang Xu, Yuxiang Wang, Yikun Tao, Xuliyuan Luo

    Abstract: Given a vector dataset $\mathcal{X}$, a query vector $\vec{x}_q$, graph-based Approximate Nearest Neighbor Search (ANNS) aims to build a proximity graph (PG) as an index of $\mathcal{X}$ and approximately return vectors with minimum distances to $\vec{x}_q$ by searching over the PG index. It suffers from the large-scale $\mathcal{X}$ because a PG with full vectors is too large to fit into the memo…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 14 pages, 12 figures

  48. arXiv:2311.16500  [pdf, other]

    cs.CV

    LLMGA: Multimodal Large Language Model based Generation Assistant

    Authors: Bin Xia, Shiyin Wang, Yingfan Tao, Yitong Wang, Jiaya Jia

    Abstract: In this paper, we introduce a Multimodal Large Language Model-based Generation Assistant (LLMGA), leveraging the vast reservoir of knowledge and proficiency in reasoning, comprehension, and response inherent in Large Language Models (LLMs) to assist users in image generation and editing. Diverging from existing approaches where Multimodal Large Language Models (MLLMs) generate fixed-size embedding…

    Submitted 11 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  49. arXiv:2311.14096  [pdf, other]

    cs.CL cs.AI

    Cultural Bias and Cultural Alignment of Large Language Models

    Authors: Yan Tao, Olga Viberg, Ryan S. Baker, Rene F. Kizilcec

    Abstract: Culture fundamentally shapes people's reasoning, behavior, and communication. As people increasingly use generative artificial intelligence (AI) to expedite and automate personal and professional tasks, cultural values embedded in AI models may bias people's authentic expression and contribute to the dominance of certain cultures. We conduct a disaggregated evaluation of cultural bias for five wid…

    Submitted 26 June, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  50. arXiv:2311.13052  [pdf, other]

    eess.IV cs.CV cs.LG

    Novel OCT mosaicking pipeline with Feature- and Pixel-based registration

    Authors: Jiacheng Wang, Hao Li, Dewei Hu, Yuankai K. Tao, Ipek Oguz

    Abstract: High-resolution Optical Coherence Tomography (OCT) images are crucial for ophthalmology studies but are limited by their relatively narrow field of view (FoV). Image mosaicking is a technique for aligning multiple overlapping images to obtain a larger FoV. Current mosaicking pipelines often struggle with substantial noise and considerable displacement between the input sub-fields. In this paper, w…

    Submitted 21 November, 2023; originally announced November 2023.