Search | arXiv e-print repository

SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models

Authors: Abhay Rawat, Shubham Dokania, Astitva Srivastava, Shuaib Ahmed, Haiwen Feng, Rahul Tallamraju

Abstract: Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under… ▽ More Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under-explored, mainly due to the lack of 3D consistent annotations. Moreover, controllable generative models are learned from massive data and their latent space is often too vast to obtain meaningful sample distributions for downstream task with limited generation. To overcome these challenges, we extract 3D consistent annotations from an existing controllable generative model, making the data useful for downstream tasks. Our experiments show competitive performance against state-of-the-art models using only generated synthetic data, demonstrating potential for solving downstream tasks. Project page: https://synth-forge.github.io △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 11 pages, 4 figures, 3 tables. Under Review

arXiv:2210.12878 [pdf, other]

IDD-3D: Indian Driving Dataset for 3D Unstructured Road Scenes

Authors: Shubham Dokania, A. H. Abdul Hafez, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

Abstract: Autonomous driving and assistance systems rely on annotated data from traffic and road scenarios to model and learn the various object relations in complex real-world scenarios. Preparation and training of deploy-able deep learning architectures require the models to be suited to different traffic scenarios and adapt to different situations. Currently, existing datasets, while large-scale, lack su… ▽ More Autonomous driving and assistance systems rely on annotated data from traffic and road scenarios to model and learn the various object relations in complex real-world scenarios. Preparation and training of deploy-able deep learning architectures require the models to be suited to different traffic scenarios and adapt to different situations. Currently, existing datasets, while large-scale, lack such diversities and are geographically biased towards mainly developed cities. An unstructured and complex driving layout found in several developing countries such as India poses a challenge to these models due to the sheer degree of variations in the object types, densities, and locations. To facilitate better research toward accommodating such scenarios, we build a new dataset, IDD-3D, which consists of multi-modal data from multiple cameras and LiDAR sensors with 12k annotated driving LiDAR frames across various traffic scenarios. We discuss the need for this dataset through statistical comparisons with existing datasets and highlight benchmarks on standard 3D object detection and tracking tasks in complex layouts. Code and data available at https://github.com/shubham1810/idd3d_kit.git △ Less

Submitted 23 October, 2022; originally announced October 2022.

Comments: 10 pages, 8 figures, 5 tables, Accepted in Winter Conference on Applications of Computer Vision (WACV 2023)

arXiv:2208.07943 [pdf, other]

TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments

Authors: Shubham Dokania, Anbumani Subramanian, Manmohan Chandraker, C. V. Jawahar

Abstract: High-quality structured data with rich annotations are critical components in intelligent vehicle systems dealing with road scenes. However, data curation and annotation require intensive investments and yield low-diversity scenarios. The recently growing interest in synthetic data raises questions about the scope of improvement in such systems and the amount of manual work still required to produ… ▽ More High-quality structured data with rich annotations are critical components in intelligent vehicle systems dealing with road scenes. However, data curation and annotation require intensive investments and yield low-diversity scenarios. The recently growing interest in synthetic data raises questions about the scope of improvement in such systems and the amount of manual work still required to produce high volumes and variations of simulated data. This work proposes a synthetic data generation pipeline that utilizes existing datasets, like nuScenes, to address the difficulties and domain-gaps present in simulated datasets. We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation, mimicking real scene properties with high-fidelity, along with mechanisms to diversify samples in a physically meaningful way. We demonstrate improvements in mIoU metrics by presenting qualitative and quantitative experiments with real and synthetic data for semantic segmentation on the Cityscapes and KITTI-STEP datasets. All relevant code and data is released on github (https://github.com/shubham1810/trove_toolkit). △ Less

Submitted 16 August, 2022; originally announced August 2022.

Comments: 18 pages, 5 figures, Accepted in European Conference on Computer Vision (ECCV 2022)

arXiv:1910.11117 [pdf, other]

Graph Representation learning for Audio & Music genre Classification

Authors: Shubham Dokania, Vasudev Singh

Abstract: Music genre is arguably one of the most important and discriminative information for music and audio content. Visual representation based approaches have been explored on spectrograms for music genre classification. However, lack of quality data and augmentation techniques makes it difficult to employ deep learning techniques successfully. We discuss the application of graph neural networks on suc… ▽ More Music genre is arguably one of the most important and discriminative information for music and audio content. Visual representation based approaches have been explored on spectrograms for music genre classification. However, lack of quality data and augmentation techniques makes it difficult to employ deep learning techniques successfully. We discuss the application of graph neural networks on such task due to their strong inductive bias, and show that combination of CNN and GNN is able to achieve state-of-the-art results on GTZAN, and AudioSet (Imbalanced Music) datasets. We also discuss the role of Siamese Neural Networks as an analogous to GNN for learning edge similarity weights. Furthermore, we also perform visual analysis to understand the field-of-view of our model into the spectrogram based on genre labels. △ Less

Submitted 23 October, 2019; originally announced October 2019.

arXiv:1805.08978 [pdf, ps, other]

Distributed Approximation Algorithms for the Combinatorial Motion Planning Problem

Authors: Simran Dokania, Aditya Paliwal, Shrisha Rao

Abstract: We present a new $4$-approximation algorithm for the Combinatorial Motion Planning problem which runs in $\mathcal{O}(n^2α(n^2,n))$ time, where $α$ is the functional inverse of the Ackermann function, and a fully distributed version for the same in asynchronous message passing systems, which runs in $\mathcal{O}(n\log_2n)$ time with a message complexity of $\mathcal{O}(n^2)$. This also includes th… ▽ More We present a new $4$-approximation algorithm for the Combinatorial Motion Planning problem which runs in $\mathcal{O}(n^2α(n^2,n))$ time, where $α$ is the functional inverse of the Ackermann function, and a fully distributed version for the same in asynchronous message passing systems, which runs in $\mathcal{O}(n\log_2n)$ time with a message complexity of $\mathcal{O}(n^2)$. This also includes the first fully distributed algorithm in asynchronous message passing systems to perform "shortcut" operations on paths, a procedure which is important in approximation algorithms for the vehicle routing problem and its variants. We also show that our algorithm gives feasible solutions to the $k$-TSP problem with an approximation factor of $2$ in both centralized and distributed environments. The broad idea of the algorithm is to distribute the set of vertices into two subsets and construct paths for each salesman over each of the two subsets. Finally we combine these pairwise disjoint paths for each salesman to obtain a set of paths that span the entire graph. This is similar to the algorithm by Yadlapalli et. al. \cite{3.66} but differs in respect to the fact that it does not require us to use minimum cost matching as a subroutine, and hence can be easily distributed. △ Less

Submitted 23 May, 2018; originally announced May 2018.

MSC Class: 68W15; 68W25 ACM Class: G.2.2; I.1.2; C.2.4

arXiv:1709.03793 [pdf, other]

doi 10.1109/CISS.2017.7926065

Opportunistic Self Organizing Migrating Algorithm for Real-Time Dynamic Traveling Salesman Problem

Authors: Shubham Dokania, Sunyam Bagga, Rohit Sharma

Abstract: Self Organizing Migrating Algorithm (SOMA) is a meta-heuristic algorithm based on the self-organizing behavior of individuals in a simulated social environment. SOMA performs iterative computations on a population of potential solutions in the given search space to obtain an optimal solution. In this paper, an Opportunistic Self Organizing Migrating Algorithm (OSOMA) has been proposed that introdu… ▽ More Self Organizing Migrating Algorithm (SOMA) is a meta-heuristic algorithm based on the self-organizing behavior of individuals in a simulated social environment. SOMA performs iterative computations on a population of potential solutions in the given search space to obtain an optimal solution. In this paper, an Opportunistic Self Organizing Migrating Algorithm (OSOMA) has been proposed that introduces a novel strategy to generate perturbations effectively. This strategy allows the individual to span across more possible solutions and thus, is able to produce better solutions. A comprehensive analysis of OSOMA on multi-dimensional unconstrained benchmark test functions is performed. OSOMA is then applied to solve real-time Dynamic Traveling Salesman Problem (DTSP). The problem of real-time DTSP has been stipulated and simulated using real-time data from Google Maps with a varying cost-metric between any two cities. Although DTSP is a very common and intuitive model in the real world, its presence in literature is still very limited. OSOMA performs exceptionally well on the problems mentioned above. To substantiate this claim, the performance of OSOMA is compared with SOMA, Differential Evolution and Particle Swarm Optimization. △ Less

Submitted 12 September, 2017; originally announced September 2017.

Comments: 6 pages, published in CISS 2017

arXiv:1702.05308 [pdf, other]

Hierarchy Influenced Differential Evolution: A Motor Operation Inspired Approach

Authors: Shubham Dokania, Ayush Chopra, Feroz Ahmad, Anil Singh Parihar

Abstract: Operational maturity of biological control systems have fuelled the inspiration for a large number of mathematical and logical models for control, automation and optimisation. The human brain represents the most sophisticated control architecture known to us and is a central motivation for several research attempts across various domains. In the present work, we introduce an algorithm for mathemat… ▽ More Operational maturity of biological control systems have fuelled the inspiration for a large number of mathematical and logical models for control, automation and optimisation. The human brain represents the most sophisticated control architecture known to us and is a central motivation for several research attempts across various domains. In the present work, we introduce an algorithm for mathematical optimisation that derives its intuition from the hierarchical and distributed operations of the human motor system. The system comprises global leaders, local leaders and an effector population that adapt dynamically to attain global optimisation via a feedback mechanism coupled with the structural hierarchy. The hierarchical system operation is distributed into local control for movement and global controllers that facilitate gross motion and decision making. We present our algorithm as a variant of the classical Differential Evolution algorithm, introducing a hierarchical crossover operation. The discussed approach is tested exhaustively on standard test functions as well as the CEC 2017 benchmark. Our algorithm significantly outperforms various standard algorithms as well as their popular variants as discussed in the results. △ Less

Submitted 12 September, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

Comments: 8 pages, accepted at IJCCI 2017

Showing 1–7 of 7 results for author: Dokania, S