
Showing 1–13 of 13 results for author: Dave, I

  1. arXiv:2402.10478  [pdf, other]

    cs.CV cs.LG

    CodaMal: Contrastive Domain Adaptation for Malaria Detection in Low-Cost Microscopes

    Authors: Ishan Rajendrakumar Dave, Tristan de Blegiers, Chen Chen, Mubarak Shah

    Abstract: Malaria is a major health issue worldwide, and its diagnosis requires scalable solutions that can work effectively with low-cost microscopes (LCM). Deep learning-based methods have shown success in computer-aided diagnosis from microscopic images. However, these methods need annotated images that show cells affected by malaria parasites and their life stages. Annotating images from LCM significant…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Under Review. Project Page: https://daveishan.github.io/codamal-webpage/

  2. arXiv:2312.13008  [pdf, other]

    cs.CV cs.AI cs.LG

    No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

    Authors: Ishan Rajendrakumar Dave, Simon Jenni, Mubarak Shah

    Abstract: Self-supervised approaches for video have shown impressive results in video understanding tasks. However, unlike early works that leverage temporal self-supervision, current state-of-the-art methods primarily rely on tasks from the image domain (e.g., contrastive learning) that do not explicitly promote the learning of temporal features. We identify two factors that limit existing temporal self-su…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: AAAI 2024 (Main Technical Track)

  3. arXiv:2309.13962  [pdf, other]

    cs.CV eess.IV

    Egocentric RGB+Depth Action Recognition in Industry-Like Settings

    Authors: Jyoti Kini, Sarah Fleischer, Ishan Dave, Mubarak Shah

    Abstract: Action recognition from an egocentric viewpoint is a crucial perception task in robotics and enables a wide range of human-robot interactions. While most computer vision approaches prioritize the RGB camera, the Depth modality - which can further amplify the subtleties of actions from an egocentric perspective - remains underexplored. Our work focuses on recognizing actions from egocentric RGB and…

    Submitted 25 September, 2023; originally announced September 2023.

  4. arXiv:2308.13711  [pdf, other]

    cs.CV cs.RO

    EventTransAct: A video transformer-based framework for Event-camera based action recognition

    Authors: Tristan de Blegiers, Ishan Rajendrakumar Dave, Adeel Yousaf, Mubarak Shah

    Abstract: Recognizing and comprehending human actions and gestures is a crucial perception requirement for robots to interact with humans and carry out tasks in diverse domains, including service robotics, healthcare, and manufacturing. Event cameras, with their ability to capture fast-moving objects at a high temporal resolution, offer new opportunities compared to standard action recognition in RGB videos…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: IROS 2023; The first two authors contributed equally

  5. arXiv:2308.11072  [pdf, other]

    cs.CV cs.CR

    TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection

    Authors: Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah

    Abstract: Video anomaly detection (VAD) without human monitoring is a complex computer vision task that can have a positive impact on society if implemented successfully. While recent advances have made significant progress in solving this task, most existing approaches overlook a critical real-world concern: privacy. With the increasing popularity of artificial intelligence technologies, it becomes crucial…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  6. arXiv:2308.05430  [pdf, other]

    cs.CV

    Ensemble Modeling for Multimodal Visual Action Recognition

    Authors: Jyoti Kini, Sarah Fleischer, Ishan Dave, Mubarak Shah

    Abstract: In this work, we propose an ensemble modeling approach for multimodal action recognition. We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset. Based on the underlying principle of focal loss, which captures the relationship between tail (scarce) classes and their prediction difficulties, we prop…

    Submitted 25 September, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: 22nd International Conference on Image Analysis and Processing Workshops - Multimodal Action Recognition on the MECCANO Dataset, 2023

  7. arXiv:2303.16268  [pdf, other]

    cs.CV cs.LG

    TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition

    Authors: Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Chen Chen, Mubarak Shah

    Abstract: Semi-Supervised Learning can be more beneficial for the video domain compared to images because of its higher annotation cost and dimensionality. Besides, any video understanding task requires reasoning over both spatial and temporal dimensions. In order to learn both the static and motion related features for the semi-supervised action recognition task, existing methods rely on hard input inducti…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: CVPR-2023

  8. arXiv:2210.08423  [pdf, other]

    cs.CV cs.RO

    TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos

    Authors: Tushar Sangam, Ishan Rajendrakumar Dave, Waqas Sultani, Mubarak Shah

    Abstract: Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones. However, existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices. In this work, we propose a sim…

    Submitted 25 August, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: ICRA 2023

  9. arXiv:2203.15205  [pdf, other]

    cs.CV cs.CR cs.LG

    SPAct: Self-supervised Privacy Preservation for Action Recognition

    Authors: Ishan Rajendrakumar Dave, Chen Chen, Mubarak Shah

    Abstract: Visual private information leakage is an emerging key issue for the fast growing applications of video understanding like activity recognition. Existing approaches for mitigating privacy leakage in action recognition require privacy labels along with the action labels from the video dataset. However, annotating frames of video dataset for privacy labels is not feasible. Recent developments of self…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: CVPR-2022

  10. arXiv:2203.02035  [pdf, other]

    cs.HC

    Baba is Y'all 2.0: Design and Investigation of a Collaborative Mixed-Initiative System

    Authors: M Charity, Isha Dave, Ahmed Khalifa, Julian Togelius

    Abstract: This paper describes a new version of the mixed-initiative collaborative level designing system: Baba is Y'all, as well as the results of a user study on the system. Baba is Y'all is a prototype for AI-assisted game design in collaboration with others. The updated version includes a more user-friendly interface, a better level-evolver and recommendation system, and extended site features. The syst…

    Submitted 10 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: 15 pages

  11. arXiv:2110.07758  [pdf, other]

    cs.CV

    "Knights": First Place Submission for VIPriors21 Action Recognition Challenge at ICCV 2021

    Authors: Ishan Dave, Naman Biyani, Brandon Clark, Rohit Gupta, Yogesh Rawat, Mubarak Shah

    Abstract: This technical report presents our approach "Knights" to solve the action recognition task on a small subset of Kinetics-400 i.e. Kinetics400ViPriors without using any extra-data. Our approach has 3 main components: state-of-the-art Temporal Contrastive self-supervised pretraining, video transformer models, and optical flow modality. Along with the use of standard test-time augmentation, our propo…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Challenge results are available at https://vipriors.github.io/challenges/#action-recognition

  12. TCLR: Temporal Contrastive Learning for Video Representation

    Authors: Ishan Dave, Rohit Gupta, Mamshad Nayeem Rizve, Mubarak Shah

    Abstract: Contrastive learning has nearly closed the gap between supervised and self-supervised learning of image representations, and has also been explored for videos. However, prior work on contrastive learning for video data has not explored the effect of explicitly encouraging the features to be distinct across the temporal dimension. We develop a new temporal contrastive learning framework consisting…

    Submitted 30 March, 2022; v1 submitted 20 January, 2021; originally announced January 2021.

    Comments: Accepted to Computer Vision and Image Understanding (CVIU) Journal

  13. arXiv:2004.11475  [pdf, other]

    cs.CV eess.IV

    Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos

    Authors: Mamshad Nayeem Rizve, Ugur Demir, Praveen Tirupattur, Aayush Jung Rana, Kevin Duarte, Ishan Dave, Yogesh Singh Rawat, Mubarak Shah

    Abstract: Activity detection in security videos is a difficult problem due to multiple factors such as large field of view, presence of multiple activities, varying scales and viewpoints, and its untrimmed nature. The existing research in activity detection is mainly focused on datasets, such as UCF-101, JHMDB, THUMOS, and AVA, which partially address these issues. The requirement of processing the security…

    Submitted 19 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 9 pages