Showing 1–5 of 5 results for author: Asokan, A R

  1. arXiv:2404.02900  [pdf, other]

    cs.CV cs.AI cs.LG

    DeiT-LT: Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets

    Authors: Harsh Rangwani, Pradipto Mondal, Mayank Mishra, Ashish Ramayee Asokan, R. Venkatesh Babu

    Abstract: Vision Transformer (ViT) has emerged as a prominent architecture for various computer vision tasks. In ViT, we divide the input image into patch tokens and process them through a stack of self-attention blocks. However, unlike Convolutional Neural Networks (CNNs), ViT's simple architecture has no informative inductive bias (e.g., locality). Due to this, ViT requires a large amount of data for…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project Page: https://rangwani-harsh.github.io/DeiT-LT

  2. arXiv:2311.16294  [pdf, other]

    cs.CV

    Aligning Non-Causal Factors for Transformer-Based Source-Free Domain Adaptation

    Authors: Sunandini Sanyal, Ashish Ramayee Asokan, Suvaansh Bhambri, Pradyumna YM, Akshay Kulkarni, Jogendra Nath Kundu, R. Venkatesh Babu

    Abstract: Conventional domain adaptation algorithms aim to achieve better generalization by aligning only the task-discriminative causal factors between a source and target domain. However, we find that retaining the spurious correlation between causal and non-causal factors plays a vital role in bridging the domain gap and improving target adaptation. Therefore, we propose to build a framework that disenta…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: WACV 2024. Project Page: https://val.cds.iisc.ac.in/C-SFTrans/

  3. arXiv:2310.08255  [pdf, other]

    cs.CV

    Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification

    Authors: Sravanti Addepalli, Ashish Ramayee Asokan, Lakshay Sharma, R. Venkatesh Babu

    Abstract: Vision-Language Models (VLMs) such as CLIP are trained on large amounts of image-text pairs, resulting in remarkable generalization across several data distributions. However, in several cases, their expensive training and data collection/curation costs do not justify the end application. This motivates a vendor-client paradigm, where a vendor trains a large-scale VLM and grants only input-output…

    Submitted 9 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Project page: http://val.cds.iisc.ac.in/VL2V-ADiP/

  4. arXiv:2308.14023  [pdf, other]

    cs.CV

    Domain-Specificity Inducing Transformers for Source-Free Domain Adaptation

    Authors: Sunandini Sanyal, Ashish Ramayee Asokan, Suvaansh Bhambri, Akshay Kulkarni, Jogendra Nath Kundu, R. Venkatesh Babu

    Abstract: Conventional Domain Adaptation (DA) methods aim to learn domain-invariant feature representations to improve the target adaptation performance. However, we motivate that domain-specificity is equally important since in-domain trained models hold crucial domain-specific properties that are beneficial for adaptation. Hence, we propose to build a framework that supports disentanglement and learning o…

    Submitted 27 August, 2023; originally announced August 2023.

    Comments: ICCV 2023. Project page: http://val.cds.iisc.ac.in/DSiT-SFDA

  5. arXiv:2202.01072  [pdf, other]

    cs.LG

    Interpretability for Multimodal Emotion Recognition using Concept Activation Vectors

    Authors: Ashish Ramayee Asokan, Nidarshan Kumar, Anirudh Venkata Ragam, Shylaja S Sharath

    Abstract: Multimodal Emotion Recognition refers to the classification of input video sequences into emotion labels based on multiple input modalities (usually video, audio and text). In recent years, deep neural networks have shown remarkable performance in recognizing human emotions, and are on par with human-level performance on this task. Despite the recent advancements in this field, emotion recognition…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible