Showing 1–9 of 9 results for author: Jie, S

Searching in archive cs.
  1. arXiv:2408.06798  [pdf, other]

    cs.CV

    Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

    Authors: Shibo Jie, Yehui Tang, Jianyuan Guo, Zhi-Hong Deng, Kai Han, Yunhe Wang

    Abstract: Token compression expedites the training and inference of Vision Transformers (ViTs) by reducing the number of redundant tokens, e.g., pruning inattentive tokens or merging similar tokens. However, when applied to downstream tasks, these approaches suffer from a significant performance drop when the compression degrees are mismatched between training and inference stages, which limits the applic…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2405.05615  [pdf, other]

    cs.CV cs.CL cs.LG

    Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

    Authors: Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, Yunhe Wang

    Abstract: Current solutions for efficiently constructing large vision-language (VL) models follow a two-step paradigm: projecting the output of pre-trained vision encoders to the input space of pre-trained language models as visual prompts; and then transferring the models to downstream VL tasks via end-to-end parameter-efficient fine-tuning (PEFT). However, this paradigm still exhibits inefficiency since i…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  3. arXiv:2307.16867  [pdf, other]

    cs.CV

    Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy

    Authors: Shibo Jie, Haoqing Wang, Zhi-Hong Deng

    Abstract: Current state-of-the-art results in computer vision depend in part on fine-tuning large pre-trained vision models. However, with the exponential growth of model sizes, conventional full fine-tuning, which needs to store an individual network copy for each task, leads to increasingly huge storage and transmission overhead. Adapter-based Parameter-Efficient Tuning (PET) methods address this chal…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted to ICCV 2023

  4. arXiv:2302.11730  [pdf, ps, other]

    cs.LG

    Detachedly Learn a Classifier for Class-Incremental Learning

    Authors: Ziheng Li, Shibo Jie, Zhi-Hong Deng

    Abstract: In continual learning, the model needs to continually learn a feature extractor and classifier on a sequence of tasks. This paper focuses on how to learn a classifier based on a pretrained feature extractor under the continual learning setting. We present a probabilistic analysis that the failure of vanilla experience replay (ER) comes from unnecessary re-learning of previous tasks and incompetence to di…

    Submitted 22 February, 2023; originally announced February 2023.

  5. arXiv:2212.03145  [pdf, other]

    cs.CV

    FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

    Authors: Shibo Jie, Zhi-Hong Deng

    Abstract: Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, called parameter-efficient transfer learning (PETL). Current PETL methods have shown that by tuning only 0.5% of the parameters, ViT can be adapted to downstream tasks with even better performance than full fine-tuning. In this paper, we aim…

    Submitted 10 June, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: AAAI 2023 Oral. Code: https://github.com/JieShibo/PETL-ViT

  6. arXiv:2207.07039  [pdf, other]

    cs.CV

    Convolutional Bypasses Are Better Vision Transformer Adapters

    Authors: Shibo Jie, Zhi-Hong Deng

    Abstract: The pretrain-then-finetune paradigm has been widely adopted in computer vision. But as the size of Vision Transformers (ViTs) grows exponentially, full finetuning becomes prohibitive in view of the heavier storage overhead. Motivated by parameter-efficient transfer learning (PETL) on language transformers, recent studies attempt to insert lightweight adaptation modules (e.g., adapter layers or p…

    Submitted 9 August, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

  7. arXiv:2205.09347  [pdf, other]

    cs.LG cs.AI

    Bypassing Logits Bias in Online Class-Incremental Learning with a Generative Framework

    Authors: Gehui Shen, Shibo Jie, Ziheng Li, Zhi-Hong Deng

    Abstract: Continual learning requires the model to maintain the learned knowledge while learning from a non-i.i.d. data stream continually. Due to the single-pass training setting, online continual learning is very challenging, but it is closer to real-world scenarios where quick adaptation to new data is appealing. In this paper, we focus on the online class-incremental learning setting in which new classes…

    Submitted 19 May, 2022; originally announced May 2022.

  8. arXiv:2204.10535  [pdf, other]

    cs.CV cs.AI

    Alleviating Representational Shift for Continual Fine-tuning

    Authors: Shibo Jie, Zhi-Hong Deng, Ziheng Li

    Abstract: We study a practical setting of continual learning: fine-tuning on a pre-trained model continually. Previous work has found that, when training on new tasks, the features (penultimate layer representations) of previous data will change, called representational shift. Besides the shift of features, we reveal that the intermediate layers' representational shift (IRS) also matters since it disrupts b…

    Submitted 8 May, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

  9. arXiv:2012.06209  [pdf, other]

    cs.IR

    KOSMOS: Knowledge-graph Oriented Social media and Mainstream media Overview System

    Authors: Chua Hao Yang, Yong Shan Jie, Boon Kok Chin, Lander Chin, Lynnette Hui Xian Ng

    Abstract: We introduce KOSMOS, a knowledge retrieval system based on the constructed knowledge graph of social media and mainstream media documents. The system first identifies key events from the documents at each time frame through clustering, extracting a document to represent each cluster, then describing the document in terms of 5W1H (Who, What, When, Where, Why, How). The event-centric knowledge graph…

    Submitted 17 December, 2020; v1 submitted 11 December, 2020; originally announced December 2020.