Showing 1–13 of 13 results for author: Bhanu, B

Searching in archive cs.
  1. arXiv:2407.19306

    cs.CV

    Symmetrical Joint Learning Support-query Prototypes for Few-shot Segmentation

    Authors: Qun Li, Baoquan Sun, Fu Xiao, Yonggang Qi, Bir Bhanu

    Abstract: We propose Sym-Net, a novel framework for Few-Shot Segmentation (FSS) that addresses the critical issue of intra-class variation by jointly learning both query and support prototypes in a symmetrical manner. Unlike previous methods that generate query prototypes solely by matching query features to support prototypes, which is a form of bias learning towards the few-shot support samples, Sym-Net l…

    Submitted 27 July, 2024; originally announced July 2024.

  2. arXiv:2309.03240

    cs.CV

    RepSGG: Novel Representations of Entities and Relationships for Scene Graph Generation

    Authors: Hengyue Liu, Bir Bhanu

    Abstract: Scene Graph Generation (SGG) has achieved significant progress recently. However, most previous works rely heavily on fixed-size entity representations based on bounding box proposals, anchors, or learnable queries. As each representation's cardinality has different trade-offs between performance and computation overhead, extracting highly representative features efficiently and dynamically is bot…

    Submitted 6 September, 2023; originally announced September 2023.

  3. arXiv:2304.06028

    cs.CV

    RECLIP: Resource-efficient CLIP by Training with Small Images

    Authors: Runze Li, Dahun Kim, Bir Bhanu, Weicheng Kuo

    Abstract: We present RECLIP (Resource-efficient CLIP), a simple method that minimizes computational resource footprint for CLIP (Contrastive Language Image Pretraining). Inspired by the notion of coarse-to-fine in computer vision, we leverage small images to learn from large-scale language supervision efficiently, and finetune the model with high-resolution data in the end. Since the complexity of the visio…

    Submitted 31 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Published at Transactions on Machine Learning Research

  4. arXiv:2207.08951

    cs.CV

    MonoIndoor++: Towards Better Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments

    Authors: Runze Li, Pan Ji, Yi Xu, Bir Bhanu

    Abstract: Self-supervised monocular depth estimation has seen significant progress in recent years, especially in outdoor environments. However, depth prediction results are not satisfying in indoor scenes where most of the existing data are captured with hand-held devices. As compared to outdoor environments, estimating depth of monocular videos for indoor environments, using self-supervised methods, resul…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: Journal version of "MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments" (ICCV-2021). arXiv admin note: substantial text overlap with arXiv:2107.12429

  5. Dite-HRNet: Dynamic Lightweight High-Resolution Network for Human Pose Estimation

    Authors: Qun Li, Ziyi Zhang, Fu Xiao, Feng Zhang, Bir Bhanu

    Abstract: A high-resolution network exhibits remarkable capability in extracting multi-scale features for human pose estimation, but fails to capture long-range interactions between joints and has high computational complexity. To address these problems, we present a Dynamic lightweight High-Resolution Network (Dite-HRNet), which can efficiently extract multi-scale contextual information and model long-rang…

    Submitted 24 May, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted by IJCAI-ECAI 2022

  6. arXiv:2108.07466

    cs.CV eess.IV

    Transferring Knowledge with Attention Distillation for Multi-Domain Image-to-Image Translation

    Authors: Runze Li, Tomaso Fontanini, Luca Donati, Andrea Prati, Bir Bhanu

    Abstract: Gradient-based attention modeling has been used widely as a way to visualize and understand convolutional neural networks. However, exploiting these visual explanations during the training of generative adversarial networks (GANs) is an unexplored area in computer vision research. Indeed, we argue that this kind of information can be used to influence GANs training in a positive way. For this reas…

    Submitted 17 August, 2021; originally announced August 2021.

    Comments: Preprint

  7. arXiv:2108.02832

    eess.IV cs.CV

    Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning

    Authors: Akash Gupta, Padmaja Jonnalagedda, Bir Bhanu, Amit K. Roy-Chowdhury

    Abstract: Most of the existing works in supervised spatio-temporal video super-resolution (STVSR) heavily rely on a large-scale external dataset consisting of paired low-resolution low-frame rate (LR-LFR) and high-resolution high-frame-rate (HR-HFR) videos. Despite their remarkable performance, these methods make a prior assumption that the low-resolution video is obtained by down-scaling the high-resolution…

    Submitted 5 August, 2021; originally announced August 2021.

  8. arXiv:2107.12847

    cs.CV cs.LG cs.RO

    Learning Local Recurrent Models for Human Mesh Recovery

    Authors: Runze Li, Srikrishna Karanam, Ren Li, Terrence Chen, Bir Bhanu, Ziyan Wu

    Abstract: We consider the problem of estimating frame-level full human body meshes given a video of a person with natural motion dynamics. While much progress in this field has been in single image-based mesh estimation, there has been a recent uptick in efforts to infer mesh dynamics from video given its role in alleviating issues such as depth ambiguity and occlusions. However, a key limitation of existin…

    Submitted 27 July, 2021; originally announced July 2021.

    Comments: 10 pages, 6 figures, 2 tables

  9. arXiv:2107.12429

    cs.CV

    MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments

    Authors: Pan Ji, Runze Li, Bir Bhanu, Yi Xu

    Abstract: Self-supervised depth estimation for indoor environments is more challenging than its outdoor counterpart in at least the following two aspects: (i) the depth range of indoor sequences varies a lot across different frames, making it difficult for the depth network to induce consistent depth cues, whereas the maximum distance in outdoor scenes mostly stays the same as the camera usually sees the sk…

    Submitted 27 July, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: ICCV 2021

  10. arXiv:2103.16083

    cs.CV

    Fully Convolutional Scene Graph Generation

    Authors: Hengyue Liu, Ning Yan, Masood S. Mortazavi, Bir Bhanu

    Abstract: This paper presents a fully convolutional scene graph generation (FCSGG) model that detects objects and relations simultaneously. Most of the scene graph generation frameworks use a pre-trained two-stage object detector, like Faster R-CNN, and build scene graphs using bounding box features. Such pipeline usually has a large number of parameters and low inference speed. Unlike these approaches, FCS…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: CVPR 2021 Oral

  11. arXiv:2011.02836

    cs.LG

    Dynamically Throttleable Neural Networks (TNN)

    Authors: Hengyue Liu, Samyak Parajuli, Jesse Hostetler, Sek Chai, Bir Bhanu

    Abstract: Conditional computation for Deep Neural Networks (DNNs) reduce overall computational load and improve model accuracy by running a subset of the network. In this work, we present a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources. We designed TNN with several properties that enable more flexibility for dynamic execution b…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: arXiv admin note: text overlap with arXiv:1905.13179

  12. arXiv:2005.07225

    eess.IV cs.CV

    SAGE: Sequential Attribute Generator for Analyzing Glioblastomas using Limited Dataset

    Authors: Padmaja Jonnalagedda, Brent Weinberg, Jason Allen, Taejin L. Min, Shiv Bhanu, Bir Bhanu

    Abstract: While deep learning approaches have shown remarkable performance in many imaging tasks, most of these methods rely on availability of large quantities of data. Medical image data, however, is scarce and fragmented. Generative Adversarial Networks (GANs) have recently been very effective in handling such datasets by generating more data. If the datasets are very small, however, GANs cannot learn th…

    Submitted 3 June, 2022; v1 submitted 14 May, 2020; originally announced May 2020.

  13. arXiv:1911.07389

    cs.CV cs.LG

    Towards Visually Explaining Variational Autoencoders

    Authors: Wenqian Liu, Runze Li, Meng Zheng, Srikrishna Karanam, Ziyan Wu, Bir Bhanu, Richard J. Radke, Octavia Camps

    Abstract: Recent advances in Convolutional Neural Network (CNN) model interpretability have led to impressive progress in visualizing and understanding model predictions. In particular, gradient-based visual attention methods have driven much recent effort in using visual attention maps as a means for visual explanations. A key problem, however, is these methods are designed for classification and categoriz…

    Submitted 14 April, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 10 pages, 9 figures, 2 tables, CVPR 2020