Zum Hauptinhalt springen

Showing 1–50 of 86 results for author: Dong, P

Searching in archive cs. Search in all archives.
.
  1. arXiv:2408.14331  [pdf, other

    cs.LG

    Automated Machine Learning in Insurance

    Authors: Panyi Dong, Zhiyu Quan

    Abstract: Machine Learning (ML) has gained popularity in actuarial research and insurance industrial applications. However, the performance of most ML tasks heavily depends on data preprocessing, model selection, and hyperparameter optimization, which are considered to be intensive in terms of domain knowledge, experience, and manual labor. Automated Machine Learning (AutoML) aims to automatically complete… ▽ More

    Submitted 26 August, 2024; originally announced August 2024.

  2. arXiv:2408.13293  [pdf, other

    cs.LG cs.AI

    Causally-Aware Spatio-Temporal Multi-Graph Convolution Network for Accurate and Reliable Traffic Prediction

    Authors: Pingping Dong, Xiao-Lin Wang, Indranil Bose, Kam K. H. Ng, Xiaoning Zhang, Xiaoge Zhang

    Abstract: Accurate and reliable prediction has profound implications to a wide range of applications. In this study, we focus on an instance of spatio-temporal learning problem--traffic prediction--to demonstrate an advanced deep learning model developed for making accurate and reliable forecast. Despite the significant progress in traffic prediction, limited studies have incorporated both explicit and impl… ▽ More

    Submitted 23 August, 2024; originally announced August 2024.

  3. arXiv:2408.09736  [pdf, other

    eess.IV cs.CV

    Coarse-Fine View Attention Alignment-Based GAN for CT Reconstruction from Biplanar X-Rays

    Authors: Zhi Qiao, Hanqiang Ouyang, Dongheng Chu, Huishu Yuan, Xiantong Zhen, Pei Dong, Zhen Qian

    Abstract: For surgical planning and intra-operation imaging, CT reconstruction using X-ray images can potentially be an important alternative when CT imaging is not available or not feasible. In this paper, we aim to use biplanar X-rays to reconstruct a 3D CT image, because biplanar X-rays convey richer information than single-view X-rays and are more commonly used by surgeons. Different from previous studi… ▽ More

    Submitted 19 August, 2024; originally announced August 2024.

  4. arXiv:2408.09731  [pdf, other

    eess.IV cs.CV

    Reconstruct Spine CT from Biplanar X-Rays via Diffusion Learning

    Authors: Zhi Qiao, Xuhui Liu, Xiaopeng Wang, Runkun Liu, Xiantong Zhen, Pei Dong, Zhen Qian

    Abstract: Intraoperative CT imaging serves as a crucial resource for surgical guidance; however, it may not always be readily accessible or practical to implement. In scenarios where CT imaging is not an option, reconstructing CT scans from X-rays can offer a viable alternative. In this paper, we introduce an innovative method for 3D CT reconstruction utilizing biplanar X-rays. Distinct from previous resear… ▽ More

    Submitted 20 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  5. arXiv:2408.01803  [pdf, other

    cs.LG cs.CL

    STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

    Authors: Peijie Dong, Lujun Li, Dayou Du, Yuhan Chen, Zhenheng Tang, Qiang Wang, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo, Xiaowen Chu

    Abstract: In this paper, we present STBLLM, the first structural binarization framework for compressing Large Language Models (LLMs) to less than 1-bit precision. LLMs have achieved remarkable performance, but their heavy memory requirements have hindered widespread adoption, particularly on resource-constrained devices. Binarization, which quantifies weights to a mere 1-bit, achieves a milestone in increas… ▽ More

    Submitted 3 August, 2024; originally announced August 2024.

  6. arXiv:2408.00275  [pdf, other

    cs.RO

    A Reinforcement Learning Based Motion Planner for Quadrotor Autonomous Flight in Dense Environment

    Authors: Zhaohong Liu, Wenxuan Gao, Yinshuai Sun, Peng Dong

    Abstract: Quadrotor motion planning is critical for autonomous flight in complex environments, such as rescue operations. Traditional methods often employ trajectory generation optimization and passive time allocation strategies, which can limit the exploitation of the quadrotor's dynamic capabilities and introduce delays and inaccuracies. To address these challenges, we propose a novel motion planning fram… ▽ More

    Submitted 5 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

  7. arXiv:2407.21075  [pdf, other

    cs.AI cs.CL cs.LG

    Apple Intelligence Foundation Language Models

    Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek , et al. (130 additional authors not shown)

    Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used… ▽ More

    Submitted 29 July, 2024; originally announced July 2024.

  8. arXiv:2407.20772  [pdf, other

    eess.SP cs.NI

    Edge Learning Based Collaborative Automatic Modulation Classification for Hierarchical Cognitive Radio Networks

    Authors: Peihao Dong, Chaowei He, Shen Gao, Fuhui Zhou, Qihui Wu

    Abstract: In hierarchical cognitive radio networks, edge or cloud servers utilize the data collected by edge devices for modulation classification, which, however, is faced with problems of the computation load, transmission overhead, and data privacy. In this article, an edge learning (EL) based framework jointly mobilizing the edge device and the edge server for intelligent co-inference is proposed to rea… ▽ More

    Submitted 30 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Internet of Things Journal

  9. arXiv:2407.18209  [pdf, other

    cs.ET cs.AR

    SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits

    Authors: Yanyue Xie, Peiyan Dong, Geng Yuan, Zhengang Li, Masoud Zabihi, Chao Wu, Sung-En Chang, Xufeng Zhang, Xue Lin, Caiwen Ding, Nobuyuki Yoshikawa, Olivia Chen, Yanzhi Wang

    Abstract: Superconducting circuits, like Adiabatic Quantum-Flux-Parametron (AQFP), offer exceptional energy efficiency but face challenges in physical design due to sophisticated spacing and timing constraints. Current design tools often neglect the importance of constraint adherence throughout the entire design flow. In this paper, we propose SuperFlow, a fully-customized RTL-to-GDS design flow tailored fo… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Accepted by DATE 2024

  10. arXiv:2407.18175  [pdf, other

    cs.LG cs.AI cs.CV

    Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers

    Authors: Zhengang Li, Alec Lu, Yanyue Xie, Zhenglun Kong, Mengshu Sun, Hao Tang, Zhong Jia Xue, Peiyan Dong, Caiwen Ding, Yanzhi Wang, Xue Lin, Zhenman Fang

    Abstract: Vision transformers (ViTs) have demonstrated their superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs). However, ViT models are often computation-intensive for efficient deployment on resource-limited edge devices. This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs, to design efficient ViT models for… ▽ More

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Accepted by ICS 2024

  11. arXiv:2407.05758  [pdf, other

    eess.IV cs.AI cs.CV

    Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports

    Authors: Yutong Zhang, Yi Pan, Tianyang Zhong, Peixin Dong, Kangni Xie, Yuxiao Liu, Hanqi Jiang, Zhengliang Liu, Shijie Zhao, Tuo Zhang, Xi Jiang, Dinggang Shen, Tianming Liu, Xin Zhang

    Abstract: Medical images and radiology reports are crucial for diagnosing medical conditions, highlighting the importance of quantitative analysis for clinical decision-making. However, the diversity and cross-source heterogeneity of these data challenge the generalizability of current data-mining methods. Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecti… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  12. arXiv:2407.02846  [pdf, other

    cs.CV

    Multi-Task Domain Adaptation for Language Grounding with 3D Objects

    Authors: Penglei Sun, Yaoxian Song, Xinglin Pan, Peijie Dong, Xiaofei Yang, Qiang Wang, Zhixu Li, Tiefeng Li, Xiaowen Chu

    Abstract: The existing works on object-level language grounding with 3D objects mostly focus on improving performance by utilizing the off-the-shelf pre-trained models to capture features, such as viewpoint selection or geometric priors. However, they have failed to consider exploring the cross-modal representation of language-vision alignment in the cross-domain field. To answer this problem, we propose a… ▽ More

    Submitted 5 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

  13. arXiv:2406.15609  [pdf, other

    physics.med-ph cs.AI

    Automated radiotherapy treatment planning guided by GPT-4Vision

    Authors: Sheng Liu, Oscar Pastor-Serrano, Yizheng Chen, Matthew Gopaulchan, Weixing Liang, Mark Buyyounouski, Erqi Pollom, Quynh-Thu Le, Michael Gensheimer, Peng Dong, Yong Yang, James Zou, Lei Xing

    Abstract: Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment plan… ▽ More

    Submitted 1 July, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures

  14. arXiv:2406.05017  [pdf, other

    cs.LG cs.AI

    Adaptively Learning to Select-Rank in Online Platforms

    Authors: Jingyuan Wang, Perry Dong, Ying Jin, Ruohan Zhan, Zhengyuan Zhou

    Abstract: Ranking algorithms are fundamental to various online platforms across e-commerce sites to content streaming services. Our research addresses the challenge of adaptively ranking items from a candidate pool for heterogeneous users, a key component in personalizing user experience. We develop a user response model that considers diverse user preferences and the varying effects of item positions, aimi… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 25 pages in total. Includes 4 figures and a pdf. International conference on machine learning. PMLR, 2024

  15. arXiv:2406.02924  [pdf, other

    cs.LG cs.CL cs.NE

    Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models

    Authors: Peijie Dong, Lujun Li, Zhenheng Tang, Xiang Liu, Xinglin Pan, Qiang Wang, Xiaowen Chu

    Abstract: Despite the remarkable capabilities, Large Language Models (LLMs) face deployment challenges due to their extensive size. Pruning methods drop a subset of weights to accelerate, but many of them require retraining, which is prohibitively expensive and computationally demanding. Recently, post-training pruning approaches introduced novel metrics, enabling the pruning of LLMs without retraining. How… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML2024, 29 pages, 4 figures

  16. arXiv:2403.19591  [pdf, other

    cs.LG cs.AR cs.NE

    Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

    Authors: Pingcheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng

    Abstract: Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs. Previous state-of-the-art works optimize these operations by piece-wise linear approximation and store the parameters in look-up tables (LUT), but most of them require unfriendly high-precision arithmetics such as FP/INT 32 and lack consideration of… ▽ More

    Submitted 29 March, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: 61st ACM/IEEE Design Automation Conference (DAC) 2024

  17. arXiv:2403.16536  [pdf, other

    cs.CV

    VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting

    Authors: Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang

    Abstract: Combining CNNs or ViTs, with RNNs for spatiotemporal forecasting, has yielded unparalleled results in predicting temporal and spatial dynamics. However, modeling extensive global information remains a formidable challenge; CNNs are limited by their narrow receptive fields, and ViTs struggle with the intensive computational demands of their attention mechanisms. The emergence of recent Mamba-based… ▽ More

    Submitted 29 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: CVPR2024 Precognition Workshop

  18. Joint Resource Allocation and Trajectory Design for Resilient Multi-UAV Communication Networks

    Authors: Linghui Ge, Xiao Liang, Hua Zhang, Peihao Dong, Jianxin Liao, Jingyu Wang

    Abstract: In contrast to terrestrial wireless networks, dynamic Unmanned Aerial Vehicle (UAV) networks are susceptible to unexpected link failures arising from UAV breakdowns or the depletion of its batteries. Drastic user rate fluctuations and sum rate drops can occur due to the unexpected UAV link failures. Previous research has focused primarily on re-establishing these links to maintain service continui… ▽ More

    Submitted 20 January, 2024; originally announced February 2024.

    Journal ref: IEEE Wireless Communications Letters, 2024

  19. arXiv:2402.14983  [pdf, other

    cs.LG cs.CR q-fin.RM

    Privacy-Enhancing Collaborative Information Sharing through Federated Learning -- A Case of the Insurance Industry

    Authors: Panyi Dong, Zhiyu Quan, Brandon Edwards, Shih-han Wang, Runhuan Feng, Tianyang Wang, Patrick Foley, Prashant Shah

    Abstract: The report demonstrates the benefits (in terms of improved claims loss modeling) of harnessing the value of Federated Learning (FL) to learn a single model across multiple insurance industry datasets without requiring the datasets themselves to be shared from one company to another. The application of FL addresses two of the most pressing concerns: limited data volume and data variety, which are c… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

  20. arXiv:2402.10787  [pdf, other

    cs.LG cs.AI cs.CL

    EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

    Authors: Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

    Abstract: Despite the remarkable strides of Large Language Models (LLMs) in various fields, the wide applications of LLMs on edge devices are limited due to their massive parameters and computations. To address this, quantization is commonly adopted to generate lightweight LLMs with efficient computations and fast inference. However, Post-Training Quantization (PTQ) methods dramatically degrade in quality w… ▽ More

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Preprint

  21. arXiv:2402.08916  [pdf, other

    eess.SP cs.IT

    Lightweight Deep Learning Based Channel Estimation for Extremely Large-Scale Massive MIMO Systems

    Authors: Shen Gao, Peihao Dong, Zhiwen Pan, Xiaohu You

    Abstract: Extremely large-scale massive multiple-input multiple-output (XL-MIMO) systems introduce the much higher channel dimensionality and incur the additional near-field propagation effect, aggravating the computation load and the difficulty to acquire the prior knowledge for channel estimation. In this article, an XL-MIMO channel network (XLCNet) is developed to estimate the high-dimensional channel, w… ▽ More

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Vehicular Technology

  22. arXiv:2402.02105  [pdf, other

    cs.CV

    ParZC: Parametric Zero-Cost Proxies for Efficient NAS

    Authors: Peijie Dong, Lujun Li, Xinglin Pan, Zimian Wei, Xiang Liu, Qiang Wang, Xiaowen Chu

    Abstract: Recent advancements in Zero-shot Neural Architecture Search (NAS) highlight the efficacy of zero-cost proxies in various NAS benchmarks. Several studies propose the automated design of zero-cost proxies to achieve SOTA performance but require tedious searching progress. Furthermore, we identify a critical issue with current zero-cost proxies: they aggregate node-wise zero-cost statistics without c… ▽ More

    Submitted 3 February, 2024; originally announced February 2024.

  23. arXiv:2401.15061  [pdf, other

    cs.NE cs.ET physics.optics

    Digital-analog hybrid matrix multiplication processor for optical neural networks

    Authors: Xiansong Meng, Deming Kong, Kwangwoong Kim, Qiuchi Li, Po Dong, Ingemar J. Cox, Christina Lioma, Hao Hu

    Abstract: The computational demands of modern AI have spurred interest in optical neural networks (ONNs) which offer the potential benefits of increased speed and lower power consumption. However, current ONNs face various challenges,most significantly a limited calculation precision (typically around 4 bits) and the requirement for high-resolution signal format converters (digital-to-analogue conversions (… ▽ More

    Submitted 26 January, 2024; originally announced January 2024.

  24. arXiv:2401.13174  [pdf, other

    cs.CV

    Boundary and Relation Distillation for Semantic Segmentation

    Authors: Dong Zhang, Pingcheng Dong, Xinting Hu, Long Chen, Kwang-Ting Cheng

    Abstract: Recently, it has been revealed that small semantic segmentation (SS) models exhibit a tendency to make errors in maintaining boundary region completeness and preserving target region connectivity, despite their effective segmentation of the main object regions. To address these errors, we propose a targeted boundary and relation distillation (BRD) strategy using knowledge distillation from large t… ▽ More

    Submitted 23 January, 2024; originally announced January 2024.

  25. arXiv:2401.01069  [pdf, other

    math.NA cs.CE

    A prediction-correction based iterative convolution-thresholding method for topology optimization of heat transfer problems

    Authors: Huangxin Chen, Piaopiao Dong, Dong Wang, Xiao-Ping Wang

    Abstract: In this paper, we propose an iterative convolution-thresholding method (ICTM) based on prediction-correction for solving the topology optimization problem in steady-state heat transfer equations. The problem is formulated as a constrained minimization problem of the complementary energy, incorporating a perimeter/surface-area regularization term, while satisfying a steady-state heat transfer equat… ▽ More

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 29 pages, 25 figures

  26. arXiv:2312.09059  [pdf, other

    cs.CV

    Auto-Prox: Training-Free Vision Transformer Architecture Search via Automatic Proxy Discovery

    Authors: Zimian Wei, Lujun Li, Peijie Dong, Zheng Hui, Anggeng Li, Menglong Lu, Hengyue Pan, Zhiliang Tian, Dongsheng Li

    Abstract: The substantial success of Vision Transformer (ViT) in computer vision tasks is largely attributed to the architecture design. This underscores the necessity of efficient architecture search for designing better ViTs automatically. As training-based architecture search methods are computationally intensive, there is a growing interest in training-free methods that use zero-cost proxies to score Vi… ▽ More

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepetd by AAAI2024

  27. arXiv:2312.05693  [pdf, other

    cs.LG cs.AI cs.CL

    Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

    Authors: Xuan Shen, Peiyan Dong, Lei Lu, Zhenglun Kong, Zhengang Li, Ming Lin, Chao Wu, Yanzhi Wang

    Abstract: Large Language Models (LLMs) stand out for their impressive performance in intricate language modeling tasks. However, their demanding computational and memory needs pose obstacles for broad use on edge devices. Quantization is then introduced to boost LLMs' on-device efficiency. Recent works show that 8-bit or lower weight quantization is feasible with minimal impact on end-to-end task performanc… ▽ More

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  28. arXiv:2312.05256  [pdf, other

    eess.IV cs.AI

    Holistic Evaluation of GPT-4V for Biomedical Imaging

    Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang , et al. (25 additional authors not shown)

    Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and mor… ▽ More

    Submitted 10 November, 2023; originally announced December 2023.

  29. arXiv:2311.14337  [pdf, other

    cs.CV

    TVT: Training-Free Vision Transformer Search on Tiny Datasets

    Authors: Zimian Wei, Hengyue Pan, Lujun Li, Peijie Dong, Zhiliang Tian, Xin Niu, Dongsheng Li

    Abstract: Training-free Vision Transformer (ViT) architecture search is presented to search for a better ViT with zero-cost proxies. While ViTs achieve significant distillation gains from CNN teacher models on small datasets, the current zero-cost proxies in ViTs do not generalize well to the distillation training paradigm according to our experimental observations. In this paper, for the first time, we inv… ▽ More

    Submitted 24 November, 2023; originally announced November 2023.

  30. arXiv:2311.13198  [pdf, other

    cs.CV

    DoubleAUG: Single-domain Generalized Object Detector in Urban via Color Perturbation and Dual-style Memory

    Authors: Lei Qi, Peng Dong, Tan Xiong, Hui Xue, Xin Geng

    Abstract: Object detection in urban scenarios is crucial for autonomous driving in intelligent traffic systems. However, unlike conventional object detection tasks, urban-scene images vary greatly in style. For example, images taken on sunny days differ significantly from those taken on rainy days. Therefore, models trained on sunny day images may not generalize well to rainy day images. In this paper, we a… ▽ More

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Accepted by ACM Transactions on Multimedia Computing, Communications, and Applications

  31. arXiv:2311.12996  [pdf, other

    cs.AI cs.RO

    RLIF: Interactive Imitation Learning as Reinforcement Learning

    Authors: Jianlan Luo, Perry Dong, Yuexiang Zhai, Yi Ma, Sergey Levine

    Abstract: Although reinforcement learning methods offer a powerful framework for automatic skill acquisition, for practical learning-based control problems in domains such as robotics, imitation learning often provides a more convenient and accessible alternative. In particular, an interactive imitation learning method such as DAgger, which queries a near-optimal expert to intervene online to collect correc… ▽ More

    Submitted 18 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  32. arXiv:2311.03687  [pdf, other

    cs.PF cs.CL cs.LG

    Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models

    Authors: Longteng Zhang, Xiang Liu, Zeyu Li, Xinglin Pan, Peijie Dong, Ruibo Fan, Rui Guo, Xin Wang, Qiong Luo, Shaohuai Shi, Xiaowen Chu

    Abstract: Large Language Models (LLMs) have seen great advance in both academia and industry, and their popularity results in numerous open-source frameworks and techniques in accelerating LLM pre-training, fine-tuning, and inference. Training and deploying LLMs are expensive as it requires considerable computing resources and memory, hence many efficient approaches have been developed for improving system… ▽ More

    Submitted 1 December, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  33. arXiv:2310.16836  [pdf, other

    cs.CL cs.AI cs.AR cs.CV

    LLM-FP4: 4-Bit Floating-Point Quantized Transformers

    Authors: Shih-yang Liu, Zechun Liu, Xijie Huang, Pingcheng Dong, Kwang-Ting Cheng

    Abstract: We propose LLM-FP4 for quantizing both weights and activations in large language models (LLMs) down to 4-bit floating-point values, in a post-training manner. Existing post-training quantization (PTQ) solutions are primarily integer-based and struggle with bit widths below 8 bits. Compared to integer quantization, floating-point (FP) quantization is more flexible and can better handle long-tail or… ▽ More

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

  34. arXiv:2310.11731  [pdf, other

    cs.AI

    Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning

    Authors: Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng, Sergey Levine

    Abstract: The offline reinforcement learning (RL) paradigm provides a general recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data. While policy constraints, conservatism, and other methods for mitigating distributional shifts have made offline reinforcement learning more effective, the continuous action setting often necessitates various a… ▽ More

    Submitted 18 October, 2023; originally announced October 2023.

  35. arXiv:2310.05242  [pdf, other

    cs.CL cs.AI

    ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data

    Authors: Tianyang Zhong, Wei Zhao, Yutong Zhang, Yi Pan, Peixin Dong, Zuowei Jiang, Xiaoyan Kui, Youlan Shang, Li Yang, Yaonai Wei, Longtao Yang, Hao Chen, Huan Zhao, Yuxiao Liu, Ning Zhu, Yiwei Li, Yisong Wang, Jiaqi Yao, Jiaqi Wang, Ying Zeng, Lei He, Chao Zheng, Zhixue Zhang, Ming Li, Zhengliang Liu , et al. (17 additional authors not shown)

    Abstract: Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to the current methods under massive data volume, mainly because the style and normativity of radiology reports are obviousl… ▽ More

    Submitted 9 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

  36. arXiv:2310.00795  [pdf

    cs.LG cs.AI

    A Comprehensive Review of Generative AI in Healthcare

    Authors: Yasin Shokrollahi, Sahar Yarmohammadtoosky, Matthew M. Nikahd, Pengfei Dong, Xianqi Li, Linxia Gu

    Abstract: The advancement of Artificial Intelligence (AI) has catalyzed revolutionary changes across various sectors, notably in healthcare. Among the significant developments in this field are the applications of generative AI models, specifically transformers and diffusion models. These models have played a crucial role in analyzing diverse forms of data, including medical imaging (encompassing image reco… ▽ More

    Submitted 1 October, 2023; originally announced October 2023.

    Comments: 47 pages, 16 figures, 1table

  37. arXiv:2309.12212  [pdf, other

    cs.ET cs.AR cs.LG

    SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices

    Authors: Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Zabihi Masoud, Yanyue Xie, Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang, Olivia Chen

    Abstract: Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with extremely high energy efficiency. By employing the distinct polarity of current to denote logic `0' and `1', AQFP devices serve as excellent carriers for binary neural network (BNN) computations. Although recent research has made initial strides toward developing an AQFP-based BNN accelerator, several critical challenges rema… ▽ More

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted by MICRO'23 (56th IEEE/ACM International Symposium on Microarchitecture)

  38. arXiv:2309.00923  [pdf, other

    cs.CV cs.LG

    GBE-MLZSL: A Group Bi-Enhancement Framework for Multi-Label Zero-Shot Learning

    Authors: Ziming Liu, Jingcai Guo, Xiaocheng Lu, Song Guo, Peiran Dong, Jiewei Zhang

    Abstract: This paper investigates a challenging problem of zero-shot learning in the multi-label scenario (MLZSL), wherein, the model is trained to recognize multiple unseen classes within a sample (e.g., an image) based on seen classes and auxiliary knowledge, e.g., semantic information. Existing methods usually resort to analyzing the relationship of various seen classes residing in a sample from the dime… ▽ More

    Submitted 14 September, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

    Comments: 11 pages, 8 figures

  39. arXiv:2308.14324  [pdf, other

    cs.CV

    CPFES: Physical Fitness Evaluation Based on Canadian Agility and Movement Skill Assessment

    Authors: Pengcheng Dong, Xiaojin Mao, Lixia Fan, Wenbo Wan, Jiande Sun

    Abstract: In recent years, the assessment of fundamental movement skills integrated with physical education has focused on both teaching practice and the feasibility of assessment. The object of assessment has shifted from multiple ages to subdivided ages, while the content of assessment has changed from complex and time-consuming to concise and efficient. Therefore, we apply deep learning to physical fitne… ▽ More

    Submitted 28 August, 2023; originally announced August 2023.

  40. arXiv:2307.13693  [pdf, other

    cs.CL

    Evaluating Large Language Models for Radiology Natural Language Processing

    Authors: Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen , et al. (20 additional authors not shown)

    Abstract: The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a compreh… ▽ More

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  41. arXiv:2307.12216  [pdf, other

    cs.ET

    A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits

    Authors: Masoud Zabihi, Yanyue Xie, Zhengang Li, Peiyan Dong, Geng Yuan, Olivia Chen, Massoud Pedram, Yanzhi Wang

    Abstract: The production process of superconductive integrated circuits is complex and consumes significant amounts of resources and energy. Therefore, it is crucial to evaluate the environmental impact of this emerging technology. An attractive option for the next generation of superconductive technology is Adiabatic Quantum-Flux-Parametron (AQFP) devices. This study is the first to present a comprehensive… ▽ More

    Submitted 22 July, 2023; originally announced July 2023.

  42. arXiv:2307.10554  [pdf, other

    cs.CV cs.AI

    EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

    Authors: Peijie Dong, Lujun Li, Zimian Wei, Xin Niu, Zhiliang Tian, Hengyue Pan

    Abstract: Mixed-Precision Quantization~(MQ) can achieve a competitive accuracy-complexity trade-off for models. Conventional training-based search methods require time-consuming candidate training to search optimized per-layer bit-width configurations in MQ. Recently, some training-free approaches have presented various MQ proxies and significantly improve search efficiency. However, the correlation between… ▽ More

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV2023

  43. arXiv:2303.15678  [pdf, other

    cs.CV cs.AI cs.LG

    DisWOT: Student Architecture Search for Distillation WithOut Training

    Authors: Peijie Dong, Lujun Li, Zimian Wei

    Abstract: Knowledge distillation (KD) is an effective training strategy to improve the lightweight student models under the guidance of cumbersome teachers. However, the large architecture difference across the teacher-student pairs limits the distillation gains. In contrast to previous adaptive distillation methods to reduce the teacher-student gap, we explore a novel training-free framework to search for… ▽ More

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR2023

  44. arXiv:2301.10038  [pdf, other

    cs.CV

    Progressive Meta-Pooling Learning for Lightweight Image Classification Model

    Authors: Peijie Dong, Xin Niu, Zhiliang Tian, Lujun Li, Xiaodong Wang, Zimian Wei, Hengyue Pan, Dongsheng Li

    Abstract: Practical networks for edge devices adopt shallow depth and small convolutional kernels to save memory and computational cost, which leads to a restricted receptive field. Conventional efficient learning methods focus on lightweight convolution designs, ignoring the role of the receptive field in neural network design. In this paper, we propose the Meta-Pooling framework to make the receptive fiel… ▽ More

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 5 pages, 2 figures, ICASSP23

  45. arXiv:2301.09850  [pdf, other

    cs.CV

    RD-NAS: Enhancing One-shot Supernet Ranking Ability via Ranking Distillation from Zero-cost Proxies

    Authors: Peijie Dong, Xin Niu, Lujun Li, Zhiliang Tian, Xiaodong Wang, Zimian Wei, Hengyue Pan, Dongsheng Li

    Abstract: Neural architecture search (NAS) has made tremendous progress in the automatic design of effective neural network structures but suffers from a heavy computational burden. One-shot NAS significantly alleviates the burden through weight sharing and improves computational efficiency. Zero-shot NAS further reduces the cost by predicting the performance of the network from its initial state, which con… ▽ More

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 6 pages, 2 figures, 4 tables, ICASSP 2023

  46. arXiv:2211.10801  [pdf, other

    cs.CV cs.AI cs.LG

    Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

    Authors: Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

    Abstract: Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization. Previous compression algorithms usually start from the pre-trained dense models and only focus on efficient inference, while time-consuming training is still unavoidable. In contrast, this paper points… ▽ More

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: AAAI 2023

  47. arXiv:2211.08428  [pdf, other

    eess.IV cs.CV cs.LG cs.MM

    CaDM: Codec-aware Diffusion Modeling for Neural-enhanced Video Streaming

    Authors: Qihua Zhou, Ruibin Li, Song Guo, Peiran Dong, Yi Liu, Jingcai Guo, Zhenda Xu

    Abstract: Recent years have witnessed the dramatic growth of Internet video traffic, where the video bitstreams are often compressed and delivered in low quality to fit the streamer's uplink bandwidth. To alleviate the quality degradation, it comes the rise of Neural-enhanced Video Streaming (NVS), which shows great prospects for recovering low-quality videos by mostly deploying neural super-resolution (SR)… ▽ More

    Submitted 8 March, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

  48. arXiv:2211.08110  [pdf, other

    cs.AR cs.AI cs.CV

    HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

    Authors: Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, Zhenglun Kong, Xin Meng, Zhengang Li, Xue Lin, Zhenman Fang, Yanzhi Wang

    Abstract: While vision transformers (ViTs) have continuously achieved new milestones in the field of computer vision, their sophisticated network architectures with high computation and memory costs have impeded their deployment on resource-limited edge devices. In this paper, we propose a hardware-efficient image-adaptive token pruning framework called HeatViT for efficient yet accurate ViT acceleration on… ▽ More

    Submitted 24 February, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: HPCA 2023

  49. arXiv:2211.01484  [pdf, other

    cs.CV cs.LG

    Data Level Lottery Ticket Hypothesis for Vision Transformers

    Authors: Xuan Shen, Zhenglun Kong, Minghai Qin, Peiyan Dong, Geng Yuan, Xin Meng, Hao Tang, Xiaolong Ma, Yanzhi Wang

    Abstract: The conventional lottery ticket hypothesis (LTH) claims that there exists a sparse subnetwork within a dense neural network and a proper random initialization method called the winning ticket, such that it can be trained from scratch to almost as good as the dense counterpart. Meanwhile, the research of LTH in vision transformers (ViTs) is scarcely evaluated. In this paper, we first show that the… ▽ More

    Submitted 29 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted by IJCAI 2023

  50. arXiv:2209.07738  [pdf, other

    cs.CV cs.AI cs.LG

    DMFormer: Closing the Gap Between CNN and Vision Transformers

    Authors: Zimian Wei, Hengyue Pan, Lujun Li, Menglong Lu, Xin Niu, Peijie Dong, Dongsheng Li

    Abstract: Vision transformers have shown excellent performance in computer vision tasks. As the computation cost of their self-attention mechanism is expensive, recent works tried to replace the self-attention mechanism in vision transformers with convolutional operations, which is more efficient with built-in inductive bias. However, these efforts either ignore multi-level features or lack dynamic prosperi… ▽ More

    Submitted 28 November, 2022; v1 submitted 16 September, 2022; originally announced September 2022.