Showing 1–50 of 153 results for author: Shao, S

Searching in archive cs.
  1. arXiv:2407.02263  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM quant-ph

    FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Field

    Authors: Shihao Shao, Haoran Geng, Qinghua Cui

    Abstract: The Clebsch-Gordan Transform (CG transform) effectively encodes many-body interactions. Many studies have proven its accuracy in depicting atomic environments, although this comes with high computational needs. The computational burden of this challenge is hard to reduce due to the need for permutation equivariance, which limits the design space of the CG transform layer. We show that, implementin…

    Submitted 14 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2406.14864  [pdf, other]

    cs.LG stat.AP stat.ML

    A review of feature selection strategies utilizing graph data structures and knowledge graphs

    Authors: Sisi Shao, Pedro Henrique Ribeiro, Christina Ramirez, Jason H. Moore

    Abstract: Feature selection in Knowledge Graphs (KGs) is increasingly utilized in diverse domains, including biomedical research, Natural Language Processing (NLP), and personalized recommendation systems. This paper delves into the methodologies for feature selection within KGs, emphasizing their roles in enhancing machine learning (ML) model efficacy, hypothesis generation, and interpretability. Through…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.03229  [pdf, other]

    cs.CV cs.AI cs.LG

    Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models

    Authors: Qutub Syed Sha, Michael Paulitsch, Karthik Pattabiraman, Korbinian Hagn, Fabian Oboril, Cornelius Buerkle, Kay-Ulrich Scholl, Gereon Hinz, Alois Knoll

    Abstract: As transformer-based object detection models progress, their impact in critical sectors like autonomous vehicles and aviation is expected to grow. Soft errors causing bit flips during inference have significantly impacted DNN performance, altering predictions. Traditional range restriction solutions for CNNs fall short for transformers. This study introduces the Global Clipper and Global Hybrid Cl…

    Submitted 9 July, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at IJCAI-AISafety'24 Workshop

  4. arXiv:2406.00975  [pdf, other]

    cs.CL cs.AI

    Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost

    Authors: Masha Belyi, Robert Friel, Shuai Shao, Atindriyo Sanyal

    Abstract: Retriever Augmented Generation (RAG) systems have become pivotal in enhancing the capabilities of language models by incorporating external knowledge retrieval mechanisms. However, a significant challenge in deploying these systems in industry applications is the detection and mitigation of hallucinations: instances where the model generates information that is not grounded in the retrieved contex…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.20620  [pdf, other]

    cs.LG

    "Forgetting" in Machine Learning and Beyond: A Survey

    Authors: Alyssa Shuang Sha, Bernardo Pereira Nunes, Armin Haller

    Abstract: This survey investigates the multifaceted nature of forgetting in machine learning, drawing insights from neuroscientific research that posits forgetting as an adaptive function rather than a defect, enhancing the learning process and preventing overfitting. This survey focuses on the benefits of forgetting and its applications across various machine learning sub-fields that can help improve model…

    Submitted 31 May, 2024; originally announced May 2024.

  6. arXiv:2405.19349  [pdf, other]

    eess.SP cs.CV cs.HC cs.LG

    Beyond Isolated Frames: Enhancing Sensor-Based Human Activity Recognition through Intra- and Inter-Frame Attention

    Authors: Shuai Shao, Yu Guan, Victor Sanchez

    Abstract: Human Activity Recognition (HAR) has become increasingly popular with ubiquitous computing, driven by the popularity of wearable sensors in fields like healthcare and sports. While Convolutional Neural Networks (ConvNets) have significantly contributed to HAR, they often adopt a frame-by-frame analysis, concentrating on individual frames and potentially overlooking the broader temporal dynamics in…

    Submitted 21 May, 2024; originally announced May 2024.

  7. arXiv:2405.12390  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP

    A Metric-based Principal Curve Approach for Learning One-dimensional Manifold

    Authors: Elvis Han Cui, Sisi Shao

    Abstract: The principal curve is a well-known statistical method oriented toward manifold learning that uses concepts from differential geometry. In this paper, we propose a novel metric-based principal curve (MPC) method that learns a one-dimensional manifold of spatial data. Experiments on synthetic datasets and real applications using the MNIST dataset show that our method can learn the one-dimensional manifold well in terms of shape.

    Submitted 20 May, 2024; originally announced May 2024.

  8. arXiv:2405.12386  [pdf, other]

    stat.ML cs.LG stat.AP stat.CO

    Particle swarm optimization with Applications to Maximum Likelihood Estimation and Penalized Negative Binomial Regression

    Authors: Sisi Shao, Junhyung Park, Weng Kee Wong

    Abstract: General purpose optimization routines such as nlminb, optim (R) or nlmixed (SAS) are frequently used to estimate model parameters in nonstandard distributions. This paper presents Particle Swarm Optimization (PSO) as an alternative to many of the current algorithms used in statistics. We find that PSO can not only reproduce the same results as the above routines but can also produce results that…

    Submitted 20 May, 2024; originally announced May 2024.

  9. arXiv:2405.08359  [pdf, other]

    cs.CR cs.RO

    GPS-IDS: An Anomaly-based GPS Spoofing Attack Detection Framework for Autonomous Vehicles

    Authors: Murad Mehrab Abrar, Raian Islam, Shalaka Satam, Sicong Shao, Salim Hariri, Pratik Satam

    Abstract: Autonomous Vehicles (AVs) heavily rely on sensors and communication networks like Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, thus posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AV…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Article under review at IEEE Transactions on Dependable and Secure Computing. For associated AV-GPS-Dataset, see https://github.com/mehrab-abrar/AV-GPS-Dataset

  10. arXiv:2405.04825  [pdf, other]

    cs.CR cs.AI cs.LG

    Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution

    Authors: Shuo Shao, Yiming Li, Hongwei Yao, Yiling He, Zhan Qin, Kui Ren

    Abstract: Ownership verification is currently the most critical and widely adopted post-hoc method to safeguard model copyright. In general, model owners exploit it to identify whether a given suspicious third-party model is stolen from them by examining whether it has particular properties 'inherited' from their released models. Currently, backdoor-based model watermarks are the primary and cutting-edge me…

    Submitted 8 May, 2024; originally announced May 2024.

  11. arXiv:2404.19264  [pdf, other]

    cs.RO

    DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets

    Authors: Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath

    Abstract: This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation domains. However, scaling up learning for legged rob…

    Submitted 30 April, 2024; originally announced April 2024.

  12. arXiv:2404.13733  [pdf, other]

    cs.LG cs.AI cs.CV

    Elucidating the Design Space of Dataset Condensation

    Authors: Shitong Shao, Zikai Zhou, Huanran Chen, Zhiqiang Shen

    Abstract: Dataset condensation, a concept within data-centric learning, efficiently transfers critical attributes from an original dataset to a synthetic version, maintaining both diversity and realism. This approach significantly improves model training efficiency and is adaptable across multiple application areas. Previous methods in dataset condensation have faced challenges: some incur high computationa…

    Submitted 6 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  13. arXiv:2404.09831  [pdf, other]

    cs.CV

    Digging into contrastive learning for robust depth estimation with diffusion models

    Authors: Jiyuan Wang, Chunyu Lin, Lang Nie, Kang Liao, Shuwei Shao, Yao Zhao

    Abstract: Recently, diffusion-based depth estimation methods have drawn widespread attention due to their elegant denoising patterns and promising performance. However, they are typically unreliable under adverse conditions prevalent in real-world scenarios, such as rain and snow. In this paper, we propose a novel robust depth estimation method called D4RD, featuring a custom contrastive learning mode t…

    Submitted 19 May, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures

  14. arXiv:2404.07976  [pdf, other]

    cs.CV cs.AI

    Self-supervised Dataset Distillation: A Good Compression Is All You Need

    Authors: Muxin Zhou, Zeyuan Yin, Shitong Shao, Zhiqiang Shen

    Abstract: Dataset distillation aims to compress information from a large-scale original dataset to a new compact dataset while striving to preserve the utmost degree of the original data informational essence. Previous studies have predominantly concentrated on aligning the intermediate statistics between the original and distilled data, such as weight trajectory, features, gradient, BatchNorm, etc. In this…

    Submitted 11 April, 2024; originally announced April 2024.

  15. arXiv:2404.04306  [pdf, other]

    cs.CR cs.AI cs.CL cs.CY

    AuditGPT: Auditing Smart Contracts with ChatGPT

    Authors: Shihao Xia, Shuai Shao, Mengting He, Tingting Yu, Linhai Song, Yiying Zhang

    Abstract: To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each containing a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying that smart contracts follow ERCs. Today's practices of such verification are to either man…

    Submitted 5 April, 2024; originally announced April 2024.

  16. arXiv:2404.03121  [pdf]

    cs.CV q-bio.NC

    Utilizing Computer Vision for Continuous Monitoring of Vaccine Side Effects in Experimental Mice

    Authors: Chuang Li, Shuai Shao, Willian Mikason, Rubing Lin, Yantong Liu

    Abstract: The demand for improved efficiency and accuracy in vaccine safety assessments is increasing. Here, we explore the application of computer vision technologies to automate the monitoring of experimental mice for potential side effects after vaccine administration. Traditional observation methods are labor-intensive and lack the capability for continuous monitoring. By deploying a computer vision sys…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 1 figure

  17. arXiv:2403.18443  [pdf, other]

    cs.CV

    $\mathrm{F^2Depth}$: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis

    Authors: Xiaotong Guo, Huijie Zhao, Shuwei Shao, Xudong Li, Baochang Zhang

    Abstract: Self-supervised monocular depth estimation methods have received increasing attention because they do not require large, labelled datasets. Such self-supervised methods require high-quality salient features and consequently suffer a severe performance drop in indoor scenes, where the dominant low-textured regions are almost indiscriminative. To address the issue, we…

    Submitted 27 March, 2024; originally announced March 2024.

  18. arXiv:2403.06900  [pdf, other]

    cs.DC

    Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-synchronous Federated Learning

    Authors: Liangkun Yu, Xiang Sun, Rana Albelaihi, Chaeeun Park, Sihua Shao

    Abstract: Federated Learning (FL) revolutionizes collaborative machine learning among Internet of Things (IoT) devices by enabling them to train models collectively while preserving data privacy. FL algorithms fall into two primary categories: synchronous and asynchronous. While synchronous FL efficiently handles straggler devices, it can compromise convergence speed and model accuracy. In contrast, asynchr…

    Submitted 11 March, 2024; originally announced March 2024.

  19. arXiv:2403.00331  [pdf, other]

    cs.DC

    WindGP: Efficient Graph Partitioning on Heterogenous Machines

    Authors: Li Zeng, Haohan Huang, Binfan Zheng, Kang Yang, Shengcheng Shao, Jinhua Zhou, Jun Xie, Rongqian Zhao, Xin Chen

    Abstract: Graph Partitioning is widely used in many real-world applications such as fraud detection and social network analysis, in order to enable the distributed graph computing on large graphs. However, existing works fail to balance the computation cost and communication cost on machines with different power (including computing capability, network bandwidth and memory size), as they only consider repli…

    Submitted 6 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: 19 pages, 15 figures, 18 tables

  20. arXiv:2402.11853  [pdf, other]

    cs.HC cs.CY cs.RO cs.SD

    Beyond Voice Assistants: Exploring Advantages and Risks of an In-Car Social Robot in Real Driving Scenarios

    Authors: Yuanchao Li, Lachlan Urquhart, Nihan Karatas, Shun Shao, Hiroshi Ishiguro, Xun Shen

    Abstract: In-car Voice Assistants (VAs) play an increasingly critical role in automotive user interface design. However, existing VAs primarily perform simple 'query-answer' tasks, limiting their ability to sustain drivers' long-term attention. In this study, we investigate the effectiveness of an in-car Robot Assistant (RA) that offers functionalities beyond voice interaction. We aim to answer the question…

    Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Submitted to ACM Transactions on Computer-Human Interaction

  21. arXiv:2402.02501  [pdf, other]

    cs.IT

    Joint Data and Semantics Lossy Compression: Nonasymptotic Converse Bounds and Second-Order Asymptotics

    Authors: Huiyuan Yang, Yuxuan Shi, Shuo Shao, Xiaojun Yuan

    Abstract: This paper studies the joint data and semantics lossy compression problem, i.e., an extension of the hidden lossy source coding problem that entails recovering both the hidden and observable sources. We aim to study the nonasymptotic and second-order properties of this problem, especially the converse aspect. Specifically, we begin by deriving general nonasymptotic converse bounds valid for genera…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 13 pages, 3 figures

  22. arXiv:2402.02316  [pdf, other]

    cs.LG cs.CV

    Your Diffusion Model is Secretly a Certifiably Robust Classifier

    Authors: Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu

    Abstract: Diffusion models have recently been employed as generative classifiers for robust classification. However, a comprehensive theoretical understanding of the robustness of diffusion classifiers is still lacking, leading us to question whether they will be vulnerable to future stronger attacks. In this study, we propose a new family of diffusion classifiers, named Noised Diffusion Classifiers (NDCs), that…

    Submitted 13 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  23. arXiv:2402.02012  [pdf, other]

    cs.CV

    Precise Knowledge Transfer via Flow Matching

    Authors: Shitong Shao, Zhiqiang Shen, Linrui Gong, Huanran Chen, Xu Dai

    Abstract: In this paper, we propose a novel knowledge transfer framework that introduces continuous normalizing flows for progressive knowledge transformation and leverages multi-step sampling strategies to achieve precision knowledge transfer. We name this framework Knowledge Transfer with Flow Matching (FM-KT), which can be integrated with a metric-based distillation method with any form (e.g. va…

    Submitted 2 February, 2024; originally announced February 2024.

  24. arXiv:2401.18079  [pdf, other]

    cs.LG

    KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

    Authors: Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

    Abstract: LLMs are seeing growing use for applications such as document analysis and summarization which require large context windows, and with these large context windows KV cache activations surface as the dominant contributor to memory consumption during inference. Quantization is a promising approach for compressing KV cache activations; however, existing solutions fail to represent activations accurat…

    Submitted 4 July, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  25. arXiv:2401.14962  [pdf, other]

    cs.IT

    Joint Data and Semantics Lossy Compression: Nonasymptotic and Second-Order Achievability Bounds

    Authors: Huiyuan Yang, Yuxuan Shi, Shuo Shao, Xiaojun Yuan

    Abstract: This paper studies a joint data and semantics lossy compression problem in the finite blocklength regime, where the data and semantic sources are correlated, and only the data source can be observed by the encoder. We first introduce an information-theoretic nonasymptotic analysis framework to investigate the nonasymptotic fundamental limits of our studied problem. Within this framework, general n…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 12 pages, 1 figure

  26. arXiv:2401.13980  [pdf, other]

    cs.IT eess.IV

    A Nearly Information Theoretically Secure Approach for Semantic Communications over Wiretap Channel

    Authors: Weixuan Chen, Shuo Shao, Qianqian Yang, Zhaoyang Zhang, Ping Zhang

    Abstract: This paper addresses the challenge of achieving information-theoretic security in semantic communication (SeCom) over a wiretap channel, where a legitimate receiver coexists with an eavesdropper experiencing a poorer channel condition. Despite previous efforts to secure SeCom against eavesdroppers, achieving information-theoretic security in such schemes remains an open issue. In this work, we pro…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 13 pages, 16 figures

  27. arXiv:2401.11824  [pdf, other]

    cs.CV

    Rethinking Centered Kernel Alignment in Knowledge Distillation

    Authors: Zikai Zhou, Yunhang Shen, Shitong Shao, Linrui Gong, Shaohui Lin

    Abstract: Knowledge distillation has emerged as a highly effective method for bridging the representation discrepancy between large-scale models and lightweight models. Prevalent approaches involve leveraging appropriate metrics to minimize the divergence or distance between the knowledge extracted from the teacher model and the knowledge learned by the student model. Centered Kernel Alignment (CKA) is wide…

    Submitted 30 April, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  28. arXiv:2401.09317  [pdf, other]

    math-ph cs.DS math.CO math.PR

    From Zero-Freeness to Strong Spatial Mixing via a Christoffel-Darboux Type Identity

    Authors: Shuai Shao, Xiaowei Ye

    Abstract: We present a unifying approach to derive the strong spatial mixing (SSM) property for the general 2-spin system from zero-free regions of its partition function. Our approach works for the multivariate partition function over all three complex parameters $(β, γ, λ)$, and we allow the zero-free regions of $β, γ$ or $λ$ to be of arbitrary shapes. As long as the zero-free region contains a positive p…

    Submitted 10 April, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Main results are slightly improved: A new condition under which Theorem 1 also holds was added. The condition of Theorem 2 was updated to a more general one

  29. arXiv:2401.01564  [pdf, other]

    cs.IT eess.SP

    Deep Learning Based Superposition Coded Modulation for Hierarchical Semantic Communications over Broadcast Channels

    Authors: Yufei Bo, Shuo Shao, Meixia Tao

    Abstract: We consider multi-user semantic communications over broadcast channels. While most existing works consider that each receiver requires either the same or independent semantic information, this paper explores the scenario where the semantic information desired by different receivers is different but correlated. In particular, we investigate semantic communications over Gaussian broadcast channels w…

    Submitted 12 June, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  30. arXiv:2312.14219  [pdf, other]

    cs.LG cs.AI cs.CR cs.DC

    DCFL: Non-IID awareness Data Condensation aided Federated Learning

    Authors: Shaohan Sha, YaFeng Sun

    Abstract: Federated learning is a decentralized learning paradigm wherein a central server trains a global model iteratively by utilizing clients who possess a certain amount of private datasets. The challenge lies in the fact that the client side private data may not be identically and independently distributed, significantly impacting the accuracy of the global model. Existing methods commonly address the…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 10 pages, 17 figures; not yet published

  31. arXiv:2311.17950  [pdf, other]

    cs.CV cs.AI

    Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching

    Authors: Shitong Shao, Zeyuan Yin, Muxin Zhou, Xindong Zhang, Zhiqiang Shen

    Abstract: The lightweight "local-match-global" matching introduced by SRe2L successfully creates a distilled dataset with comprehensive information on the full 224x224 ImageNet-1k. However, this one-sided approach is limited to a particular backbone, layer, and statistics, which limits the improvement of the generalization of a distilled dataset. We suggest that sufficient and various "local-match-global" m…

    Submitted 16 March, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR2024

  32. arXiv:2311.10251  [pdf, other]

    eess.IV cs.CV cs.LG

    UniMOS: A Universal Framework For Multi-Organ Segmentation Over Label-Constrained Datasets

    Authors: Can Li, Sheng Shao, Junyi Qu, Shuchao Pang, Mehmet A. Orgun

    Abstract: Machine learning models for medical images can help physicians diagnose and manage diseases. However, because medical image annotation requires a great deal of manpower and expertise, and because clinical departments perform image annotation based on task orientation, there is the problem of having fewer medical image annotation data with more unlabeled data and having ma…

    Submitted 19 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted by BIBM2023

  33. arXiv:2311.07198  [pdf, other]

    cs.CV

    MonoDiffusion: Self-Supervised Monocular Depth Estimation Using Diffusion Model

    Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Dingchi Sun, Peter C. Y. Chen, Zhengguo Li

    Abstract: Over the past few years, self-supervised monocular depth estimation that does not depend on ground-truth during the training phase has received widespread attention. Most efforts focus on designing different types of network architectures and loss functions or handling edge cases, e.g., occlusion and dynamic objects. In this work, we introduce a novel self-supervised depth estimation framework, du…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 10 pages, 8 figures

  34. arXiv:2311.07166  [pdf, other]

    cs.CV

    NDDepth: Normal-Distance Assisted Monocular Depth Estimation and Completion

    Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Peter C. Y. Chen, Zhengguo Li

    Abstract: Over the past few years, monocular depth estimation and completion have received increasing attention from the computer vision community because of their widespread applications. In this paper, we introduce novel physics (geometry)-driven deep learning frameworks for these two tasks by assuming that 3D scenes are constituted with piece-wise planes. Instead of directly estimating the depth map…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Extension of previous work arXiv:2309.10592

  35. arXiv:2310.19449  [pdf, other]

    cs.AI

    Large-Scale Application of Fault Injection into PyTorch Models -- an Extension to PyTorchFI for Validation Efficiency

    Authors: Ralf Graafe, Qutub Syed Sha, Florian Geissler, Michael Paulitsch

    Abstract: Transient or permanent faults in hardware can render the output of Neural Networks (NN) incorrect without user-specific traces of the error, i.e. silent data errors (SDE). On the other hand, modern NNs also possess an inherent redundancy that can tolerate specific faults. To establish a safety case, it is necessary to distinguish and quantify both types of corruptions. To study the effects of hard…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted at DSN2023

  36. arXiv:2310.12072  [pdf, other]

    cs.CL

    SPEED: Speculative Pipelined Execution for Efficient Decoding

    Authors: Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Hasan Genc, Kurt Keutzer, Amir Gholami, Sophia Shao

    Abstract: Generative Large Language Models (LLMs) based on the Transformer architecture have recently emerged as a dominant foundation model for a wide range of Natural Language Processing tasks. Nevertheless, their application in real-time scenarios has been highly restricted due to the significant inference latency associated with these models. This is particularly pronounced due to the autoregressive nat…

    Submitted 2 January, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: NeurIPS Workshop on Efficient Natural Language and Speech Processing (2023)

  37. arXiv:2310.09892  [pdf, other]

    cs.RO

    Active Perception using Neural Radiance Fields

    Authors: Siming He, Christopher D. Hsu, Dexter Ong, Yifei Simon Shao, Pratik Chaudhari

    Abstract: We study active perception from first principles to argue that an autonomous agent performing active perception should maximize the mutual information that past observations possess about future ones. Doing so requires (a) a representation of the scene that summarizes past observations and the ability to update this representation to incorporate new observations (state estimation and mapping), (b)…

    Submitted 30 March, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

    Report number: Proc. of the American Control Conference (ACC) 2024

  38. arXiv:2310.06690  [pdf, other]

    cs.IT eess.SP

    Joint Coding-Modulation for Digital Semantic Communications via Variational Autoencoder

    Authors: Yufei Bo, Yiheng Duan, Shuo Shao, Meixia Tao

    Abstract: Semantic communications have emerged as a new paradigm for improving communication efficiency by transmitting the semantic information of a source message that is most relevant to a desired task at the receiver. Most existing approaches typically utilize neural networks (NNs) to design end-to-end semantic communication systems, where NN-based semantic encoders output continuously distributed signa…

    Submitted 29 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  39. arXiv:2310.03971  [pdf, other]

    cs.CL cs.AR

    Quantized Transformer Language Model Implementations on Edge Devices

    Authors: Mohammad Wali Ur Rahman, Murad Mehrab Abrar, Hunter Gibbons Copening, Salim Hariri, Sicong Shao, Pratik Satam, Soheil Salehi

    Abstract: Large-scale transformer-based models like the Bidirectional Encoder Representations from Transformers (BERT) are widely used for Natural Language Processing (NLP) applications, wherein these models are initially pre-trained with a large corpus with millions of parameters and then fine-tuned for a downstream NLP task. One of the major limitations of these large-scale models is that they cannot be d…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: Accepted for publication at the 22nd International Conference on Machine Learning and Applications, ICMLA 2023

  40. arXiv:2309.14137  [pdf, other]

    cs.CV

    IEBins: Iterative Elastic Bins for Monocular Depth Estimation

    Authors: Shuwei Shao, Zhongcai Pei, Xingming Wu, Zhong Liu, Weihai Chen, Zhengguo Li

    Abstract: Monocular depth estimation (MDE) is a fundamental topic of geometric computer vision and a core technique for many downstream applications. Recently, several methods reframe the MDE as a classification-regression problem where a linear combination of probabilistic distribution and bin centers is used to predict depth. In this paper, we propose a novel concept of iterative elastic bins (IEBins) for…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS 2023

  41. arXiv:2309.13720  [pdf, other]

    cs.RO

    Design and Evaluation of Motion Planners for Quadrotors in Environments with Varying Complexities

    Authors: Yifei Simon Shao, Yuwei Wu, Laura Jarin-Lipschitz, Pratik Chaudhari, Vijay Kumar

    Abstract: Motion planning techniques for quadrotors have advanced significantly over the past decade. Most successful planners have two stages: a front-end that determines a path that incorporates geometric (or kinematic or input) constraints and specifies the homotopy class of the trajectory, and a back-end that optimizes this path to respect dynamics and input constraints. While there are many different c…

    Submitted 7 March, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

  42. arXiv:2309.10592  [pdf, other]

    cs.CV

    NDDepth: Normal-Distance Assisted Monocular Depth Estimation

    Authors: Shuwei Shao, Zhongcai Pei, Weihai Chen, Xingming Wu, Zhengguo Li

    Abstract: Monocular depth estimation has drawn widespread attention from the vision community due to its broad applications. In this paper, we propose a novel physics (geometry)-driven deep learning framework for monocular depth estimation by assuming that 3D scenes are constituted by piece-wise planes. Particularly, we introduce a new normal-distance head that outputs pixel-level surface normal and plane-t…

    Submitted 24 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023 (Oral)

  43. arXiv:2308.11971  [pdf, other]

    cs.CV cs.CL cs.LG cs.MM

    EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

    Authors: Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang

    Abstract: Building scalable vision-language models to learn from diverse, multimodal data remains an open challenge. In this paper, we introduce an Efficient Vision-languagE foundation model, namely EVE, which is one unified multimodal Transformer pre-trained solely by one unified pre-training task. Specifically, EVE encodes both vision and language within a shared Transformer network integrated with modali…

    Submitted 1 March, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted by AAAI 2024

  44. arXiv:2308.06954  [pdf, other]

    cs.CV

    Global Features are All You Need for Image Retrieval and Reranking

    Authors: Shihao Shao, Kaifeng Chen, Arjun Karpur, Qinghua Cui, Andre Araujo, Bingyi Cao

    Abstract: Image retrieval systems conventionally use a two-stage paradigm, leveraging global features for initial retrieval and local features for reranking. However, the scalability of this method is often limited due to the significant storage and computation cost incurred by local feature matching in the reranking stage. In this paper, we present SuperGlobal, a novel approach that exclusively employs glo…

    Submitted 19 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: ICCV23 camera-ready + appendix

  45. arXiv:2308.06410  [pdf, ps, other]

    cs.PL cs.AR

    Code Transpilation for Hardware Accelerators

    Authors: Yuto Nishida, Sahil Bhatia, Shadaj Laddad, Hasan Genc, Yakun Sophia Shao, Alvin Cheung

    Abstract: DSLs and hardware accelerators have proven to be very effective in optimizing computationally expensive workloads. In this paper, we propose a solution to the challenge of manually rewriting legacy or unoptimized code in domain-specific languages and hardware accelerators. We introduce an approach that integrates two open-source tools: Metalift, a code translation framework, and Gemmini, a DNN acc…

    Submitted 11 August, 2023; originally announced August 2023.

  46. arXiv:2307.10618  [pdf, other]

    cs.OS

    FHPM: Fine-grained Huge Page Management For Virtualization

    Authors: Chuandong Li, Sai Sha, Yangqing Zeng, Xiran Yang, Yingwei Luo, Xiaolin Wang, Zhenlin Wang

    Abstract: As more data-intensive tasks with large footprints are deployed in virtual machines (VMs), huge pages are widely used to eliminate the increasing address translation overhead. However, once the huge page mapping is established, all the base page regions in the huge page share a single extended page table (EPT) entry, so that the hypervisor loses awareness of accesses to base page regions. None of…

    Submitted 20 July, 2023; originally announced July 2023.

  47. arXiv:2307.03760  [pdf, other]

    cs.DC

    CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs

    Authors: Jeongmin Park, Zaid Qureshi, Vikram Mailthody, Andrew Gacek, Shunfan Shao, Mohammad AlMasri, Isaac Gelado, Jinjun Xiong, Chris Newburn, I-hsin Chung, Michael Garland, Nikolay Sakharnykh, Wen-mei Hwu

    Abstract: Data compression and decompression have become vital components of big-data applications to manage the exponential growth in the amount of data collected and stored. Furthermore, big-data applications have increasingly adopted GPUs due to their high compute throughput and memory bandwidth. Prior works presume that decompression is memory-bound and have dedicated most of the GPU's threads to data m…

    Submitted 7 July, 2023; originally announced July 2023.

  48. arXiv:2305.16103  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    ChatBridge: Bridging Modalities with Large Language Model as a Language Catalyst

    Authors: Zijia Zhao, Longteng Guo, Tongtian Yue, Sihan Chen, Shuai Shao, Xinxin Zhu, Zehuan Yuan, Jing Liu

    Abstract: Building general-purpose models that can perceive diverse real-world modalities and solve various tasks is an appealing target in artificial intelligence. In this paper, we present ChatBridge, a novel multimodal language model that leverages the expressive capabilities of language as the catalyst to bridge the gap between various modalities. We show that only language-paired two-modality data is s…

    Submitted 25 May, 2023; originally announced May 2023.

  49. arXiv:2305.13541  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.HC

    ConvBoost: Boosting ConvNets for Sensor-based Activity Recognition

    Authors: Shuai Shao, Yu Guan, Bing Zhai, Paolo Missier, Thomas Ploetz

    Abstract: Human activity recognition (HAR) is one of the core research themes in ubiquitous and wearable computing. With the shift to deep learning (DL) based analysis approaches, it has become possible to extract high-level features and perform classification in an end-to-end manner. Despite their promising overall capabilities, DL-based HAR may suffer from overfitting due to the notoriously small, often i…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 21 pages

    Journal ref: Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 7, 2, Article 75 (June 2023)

  50. arXiv:2305.10769  [pdf, other]

    cs.LG cs.CV

    Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling

    Authors: Shitong Shao, Xu Dai, Shouyi Yin, Lujun Li, Huanran Chen, Yang Hu

    Abstract: Diffusion Probability Models (DPMs) have made impressive advancements in various machine learning domains. However, achieving high-quality synthetic samples typically involves performing a large number of sampling steps, which impedes the possibility of real-time sample synthesis. Traditional accelerated sampling algorithms via knowledge distillation rely on pre-trained model weights and discrete…

    Submitted 13 June, 2023; v1 submitted 18 May, 2023; originally announced May 2023.