Zum Hauptinhalt springen

Showing 1–3 of 3 results for author: Marjit, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2402.17412  [pdf, other

    cs.CV

    DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models

    Authors: Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen

    Abstract: In the realm of subject-driven text-to-image (T2I) generative models, recent developments like DreamBooth and BLIP-Diffusion have led to impressive results yet encounter limitations due to their intensive fine-tuning demands and substantial parameter requirements. While the low-rank adaptation (LoRA) module within DreamBooth offers a reduction in trainable parameters, it introduces a pronounced se… ▽ More

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Project Page: https://diffusekrona.github.io/

  2. arXiv:2312.02345  [pdf, other

    cs.CV

    CLIPDrawX: Primitive-based Explanations for Text Guided Sketch Synthesis

    Authors: Nityanand Mathur, Shyam Marjit, Abhra Chaudhuri, Anjan Dutta

    Abstract: With the goal of understanding the visual concepts that CLIP associates with text prompts, we show that the latent space of CLIP can be visualized solely in terms of linear transformations on simple geometric primitives like circles and straight lines. Although existing approaches achieve this by sketch-synthesis-through-optimization, they do so on the space of Bézier curves, which exhibit a waste… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.

  3. arXiv:2310.20704  [pdf, other

    cs.CV cs.AI

    Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders

    Authors: Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, Michael S. Ryoo

    Abstract: Vision Transformers (ViTs) have become ubiquitous in computer vision. Despite their success, ViTs lack inductive biases, which can make it difficult to train them with limited data. To address this challenge, prior studies suggest training ViTs with self-supervised learning (SSL) and fine-tuning sequentially. However, we observe that jointly optimizing ViTs for the primary task and a Self-Supervis… ▽ More

    Submitted 27 December, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted to WACV 2024