
Bangladeshi Native Vehicle Detection in Wild

Authors: Bipin Saha, Md. Johirul Islam, Shaikh Khaled Mostaque, Aditya Bhowmik, Tapodhir Karmakar Taton, Md. Nakib Hayat Chowdhury, Mamun Bin Ibne Reaz

Abstract: The success of autonomous navigation relies on robust and precise vehicle recognition, hindered by the scarcity of region-specific vehicle detection datasets, impeding the development of context-aware systems. To advance terrestrial object detection research, this paper proposes a native vehicle detection dataset for the most commonly appeared vehicle classes in Bangladesh. 17 distinct vehicle classes have been taken into account, with fully annotated 81542 instances of 17326 images. Each image width is set to at least 1280px. The dataset's average vehicle bounding box-to-image ratio is 4.7036. This Bangladesh Native Vehicle Dataset (BNVD) has accounted for several geographical, illumination, variety of vehicle sizes, and orientations to be more robust on surprised scenarios. In the context of examining the BNVD dataset, this work provides a thorough assessment with four successive You Only Look Once (YOLO) models, namely YOLO v5, v6, v7, and v8. These dataset's effectiveness is methodically evaluated and contrasted with other vehicle datasets already in use. The BNVD dataset exhibits mean average precision(mAP) at 50% intersection over union (IoU) is 0.848 corresponding precision and recall values of 0.841 and 0.774. The research findings indicate a mAP of 0.643 at an IoU range of 0.5 to 0.95. The experiments show that the BNVD dataset serves as a reliable representation of vehicle distribution and presents considerable complexities.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 13 pages, 8 figures

arXiv:2311.18512 [pdf, other]

Revisiting Proposal-based Object Detection

Authors: Aritra Bhowmik, Martin R. Oswald, Pascal Mettes, Cees G. M. Snoek

Abstract: This paper revisits the pipeline for detecting objects in images with proposals. For any object detector, the obtained box proposals or queries need to be classified and regressed towards ground truth boxes. The common solution for the final predictions is to directly maximize the overlap between each proposal and the ground truth box, followed by a winner-takes-all ranking or non-maximum suppression. In this work, we propose a simple yet effective alternative. For proposal regression, we solve a simpler problem where we regress to the area of intersection between proposal and ground truth. In this way, each proposal only specifies which part contains the object, avoiding a blind inpainting problem where proposals need to be regressed beyond their visual scope. In turn, we replace the winner-takes-all strategy and obtain the final prediction by taking the union over the regressed intersections of a proposal group surrounding an object. Our revisited approach comes with minimal changes to the detection pipeline and can be plugged into any existing method. We show that our approach directly improves canonical object detection and instance segmentation architectures, highlighting the utility of intersection-based regression and grouping.

Submitted 30 November, 2023; originally announced November 2023.

Comments: 10 pages, 7 figures

arXiv:2311.15741 [pdf]

Machine Learning-Based Jamun Leaf Disease Detection: A Comprehensive Review

Authors: Auvick Chandra Bhowmik, Dr. Md. Taimur Ahad, Yousuf Rayhan Emon

Abstract: Jamun leaf diseases pose a significant threat to agricultural productivity, negatively impacting both yield and quality in the jamun industry. The advent of machine learning has opened up new avenues for tackling these diseases effectively. Early detection and diagnosis are essential for successful crop management. While no automated systems have yet been developed specifically for jamun leaf disease detection, various automated systems have been implemented for similar types of disease detection using image processing techniques. This paper presents a comprehensive review of machine learning methodologies employed for diagnosing plant leaf diseases through image classification, which can be adapted for jamun leaf disease detection. It meticulously assesses the strengths and limitations of various Vision Transformer models, including Transfer learning model and vision transformer (TLMViT), SLViT, SE-ViT, IterationViT, Tiny-LeViT, IEM-ViT, GreenViT, and PMViT. Additionally, the paper reviews models such as Dense Convolutional Network (DenseNet), Residual Neural Network (ResNet)-50V2, EfficientNet, Ensemble model, Convolutional Neural Network (CNN), and Locally Reversible Transformer. These machine-learning models have been evaluated on various datasets, demonstrating their real-world applicability. This review not only sheds light on current advancements in the field but also provides valuable insights for future research directions in machine learning-based jamun leaf disease detection and classification.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.10782 [pdf, other]

A BERT based Ensemble Approach for Sentiment Classification of Customer Reviews and its Application to Nudge Marketing in e-Commerce

Authors: Sayan Putatunda, Anwesha Bhowmik, Girish Thiruvenkadam, Rahul Ghosh

Abstract: According to the literature, Product reviews are an important source of information for customers to support their buying decision. Product reviews improve customer trust and loyalty. Reviews help customers in understanding what other customers think about a particular product and helps in driving purchase decisions. Therefore, for an e-commerce platform it is important to understand the sentiments in customer reviews to understand their products and services, and it also allows them to potentially create positive consumer interaction as well as long lasting relationships. Reviews also provide innovative ways to market the products for an ecommerce company. One such approach is Nudge Marketing. Nudge marketing is a subtle way for an ecommerce company to help their customers make better decisions without hesitation.

Submitted 16 November, 2023; originally announced November 2023.

Comments: Submitted to a Journal for review

arXiv:2303.02776 [pdf, other]

A Low-Cost Portable Apparatus to Analyze Oral Fluid Droplets and Quantify the Efficacy of Masks

Authors: Ava Tan Bhowmik

Abstract: Every year, about 4 million people die from upper respiratory infections. Mask-wearing is crucial in preventing the spread of pathogen-containing droplets, which is the primary cause of these illnesses. However, most techniques for mask efficacy evaluation are expensive to set up and complex to operate. In this work, a novel, low-cost, and quantitative metrology to visualize, track, and analyze orally-generated fluid droplets is developed. The project has four stages: setup optimization, data collection, data analysis, and application development. The metrology was initially developed in a dark closet as a proof of concept using common household materials and was subsequently implemented into a portable apparatus. Tonic water and UV darklight tube lights are selected to visualize fluorescent droplet and aerosol propagation with automated analysis developed using open-source software. The dependencies of oral fluid droplet generation and propagation on various factors are studied in detail and established using this metrology. Additionally, the smallest detectable droplet size was mathematically correlated to height and airborne time. The efficacy of different types of masks is evaluated and associated with fabric microstructures. It is found that masks with smaller-sized pores and thicker material are more effective. This technique can easily be constructed at home using materials that total to a cost of below $60, thereby enabling a low-cost and accurate metrology.

Submitted 5 March, 2023; originally announced March 2023.

Comments: 13 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2201.03993

arXiv:2212.12395 [pdf, other]

Detecting Objects with Context-Likelihood Graphs and Graph Refinement

Authors: Aritra Bhowmik, Yu Wang, Nora Baka, Martin R. Oswald, Cees G. M. Snoek

Abstract: The goal of this paper is to detect objects by exploiting their interrelationships. Contrary to existing methods, which learn objects and relations separately, our key idea is to learn the object-relation distribution jointly. We first propose a novel way of creating a graphical representation of an image from inter-object relation priors and initial class predictions, we call a context-likelihood graph. We then learn the joint distribution with an energy-based modeling technique which allows to sample and refine the context-likelihood graph iteratively for a given image. Our formulation of jointly learning the distribution enables us to generate a more accurate graph representation of an image which leads to a better object detection performance. We demonstrate the benefits of our context-likelihood graph formulation and the energy-based graph refinement via experiments on the Visual Genome and MS-COCO datasets where we achieve a consistent improvement over object detectors like DETR and Faster-RCNN, as well as alternative methods modeling object interrelationships separately. Our method is detector agnostic, end-to-end trainable, and especially beneficial for rare object classes.

Submitted 27 September, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

Comments: 13 pages, 8 figures. In Proceedings of International Conference on Computer Vision (ICCV) 2023

arXiv:2212.12300 [pdf, ps, other]

Matrix Based Adaptive Short Block Cipher

Authors: Awnon Bhowmik

Abstract: Every day, millions of credit cards are swiped and transactions are carried out across the world. Due to numerous forms of unethical digital activities, users are vulnerable to credit card fraud, phishing, identity theft, etc. This paper outlines a novel block encryption algorithm involving multiple private keys and a resilient trapdoor function that ensures data security while maintaining an optimal run time and space complexity. The proposed scheme consists of an irrepressible trapdoor based on a depressed cubic function and a unique key generation algorithm that uses Fibonacci sequences and invertible square matrices for improved security. The paper involves data obtained from comprehensive cryptanalysis exploiting the strengths and weaknesses of the system and comments on its potential large-scale industry applications.

Submitted 9 October, 2022; originally announced December 2022.

MSC Class: 94A60; 11Y40

arXiv:2208.06002 [pdf, other]

A review of cryptosystems based on multi layer chaotic mappings

Authors: Awnon Bhowmik, Emon Hossain, Mahmudul Hasan

Abstract: In recent years, a lot of research has gone into creating multi-layer chaotic mapping-based cryptosystems. Random-like behavior, a continuous broadband power spectrum, and a weak baseline condition dependency are all characteristics of chaotic systems. Chaos could be helpful in the three functional components of compression, encryption, and modulation in a digital communication system. To successfully use chaos theory in cryptography, chaotic maps must be built in such a way that the entropy they produce can provide the necessary confusion and diffusion. A chaotic map is used in the first layer of such cryptosystems to create confusion, and a second chaotic map is used in the second layer to create diffusion and create a ciphertext from a plaintext. A secret key generation mechanism and a key exchange method are frequently left out, and many researchers just assume that these essential components of any effective cryptosystem are always accessible. We review such cryptosystems by using a cryptosystem of our design, in which confusion in plaintext is created using Arnold's Cat Map, and logistic mapping is employed to create sufficient dispersion and ultimately get a matching ciphertext. We also address the development of key exchange protocols and secret key schemes for these cryptosystems, as well as the possible outcomes of using cryptanalysis techniques on such a system.

Submitted 17 July, 2022; originally announced August 2022.

Comments: 10 pages, 1 figure, 3 tables

arXiv:2207.12858 [pdf, other]

Transition1x -- a Dataset for Building Generalizable Reactive Machine Learning Potentials

Authors: Mathias Schreiner, Arghya Bhowmik, Tejs Vegge, Jonas Busk, Ole Winther

Abstract: Machine Learning (ML) models have, in contrast to their usefulness in molecular dynamics studies, had limited success as surrogate potentials for reaction barrier search. It is due to the scarcity of training data in relevant transition state regions of chemical space. Currently, available datasets for training ML models on small molecular systems almost exclusively contain configurations at or near equilibrium. In this work, we present the dataset Transition1x containing 9.6 million Density Functional Theory (DFT) calculations of forces and energies of molecular configurations on and around reaction pathways at the wB97x/6-31G(d) level of theory. The data was generated by running Nudged Elastic Band (NEB) calculations with DFT on 10k reactions while saving intermediate calculations. We train state-of-the-art equivariant graph message-passing neural network models on Transition1x and cross-validate on the popular ANI1x and QM9 datasets. We show that ML models cannot learn features in transition-state regions solely by training on hitherto popular benchmark datasets. Transition1x is a new challenging benchmark that will provide an important step towards developing next-generation ML force fields that also work far away from equilibrium configurations and reactive systems.

Submitted 1 September, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

arXiv:2207.09971 [pdf, other]

NeuralNEB -- Neural Networks can find Reaction Paths Fast

Authors: Mathias Schreiner, Arghya Bhowmik, Tejs Vegge, Peter Bjørn Jørgensen, Ole Winther

Abstract: Quantum mechanical methods like Density Functional Theory (DFT) are used with great success alongside efficient search algorithms for studying kinetics of reactive systems. However, DFT is prohibitively expensive for large scale exploration. Machine Learning (ML) models have turned out to be excellent emulators of small molecule DFT calculations and could possibly replace DFT in such tasks. For kinetics, success relies primarily on the models capability to accurately predict the Potential Energy Surface (PES) around transition-states and Minimal Energy Paths (MEPs). Previously this has not been possible due to scarcity of relevant data in the literature. In this paper we train state of the art equivariant Graph Neural Network (GNN)-based models on around 10.000 elementary reactions from the Transition1x dataset. We apply the models as potentials for the Nudged Elastic Band (NEB) algorithm and achieve a Mean Average Error (MAE) of 0.13+/-0.03 eV on barrier energies on unseen reactions. We compare the results against equivalent models trained on QM9 and ANI1x. We also compare with and outperform Density Functional based Tight Binding (DFTB) on both accuracy and computational resource. The implication is that ML models, given relevant data, are now at a level where they can be applied for downstream tasks in quantum chemistry transcending prediction of simple molecular features.

Submitted 1 September, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

arXiv:2201.03993 [pdf, other]

A Novel Home-Built Metrology to Analyze Oral Fluid Droplets and Quantify the Efficacy of Masks

Authors: Ava Tan Bhowmik

Abstract: Wearing masks is crucial to preventing the spread of potentially pathogen-containing droplets, especially amidst the COVID-19 pandemic. However, not all face coverings are equally effective and most experiments evaluating mask efficacy are very expensive and complex to operate. In this work, a novel, home-built, low-cost, and accurate metrology to visualize orally-generated fluid droplets has been developed. The project includes setup optimization, data collection, data analysis, and applications. The final materials chosen were quinine-containing tonic water, 397-402 nm wavelength UV tube lights, an iPhone and tripod, string, and a spray bottle. The experiment took place in a dark closet with a dark background. During data collection, the test subject first wets their mouth with an ingestible fluorescent liquid (tonic water) and speaks, sneezes, or coughs under UV darklight. The fluorescence from the tonic water droplets generated can be visualized, recorded by an iPhone 8+ camera in slo-mo (240 fps), and analyzed. The software VLC is used for frame separation and Fiji/ImageJ is used for image processing and analysis. The dependencies of oral fluid droplet generation and propagation on different phonics, the loudness of speech, and the type of expiratory event were studied in detail and established using the metrology developed. The efficacy of different types of masks was evaluated and correlated with fabric microstructures. All masks blocked droplets to varying extent. Masks with smaller-sized pores and thicker material were found to block the most droplets. This low-cost technique can be easily constructed at home using materials that total to a cost of less than $50. Despite the minimal cost, the method is very accurate and the data is quantifiable.

Submitted 3 January, 2022; originally announced January 2022.

Comments: 9 pages, 12 figures

arXiv:2107.06068 [pdf, ps, other]

Calibrated Uncertainty for Molecular Property Prediction using Ensembles of Message Passing Neural Networks

Authors: Jonas Busk, Peter Bjørn Jørgensen, Arghya Bhowmik, Mikkel N. Schmidt, Ole Winther, Tejs Vegge

Abstract: Data-driven methods based on machine learning have the potential to accelerate computational analysis of atomic structures. In this context, reliable uncertainty estimates are important for assessing confidence in predictions and enabling decision making. However, machine learning models can produce badly calibrated uncertainty estimates and it is therefore crucial to detect and handle uncertainty carefully. In this work we extend a message passing neural network designed specifically for predicting properties of molecules and materials with a calibrated probabilistic predictive distribution. The method presented in this paper differs from previous work by considering both aleatoric and epistemic uncertainty in a unified framework, and by recalibrating the predictive distribution on unseen data. Through computer experiments, we show that our approach results in accurate models for predicting molecular formation energies with well calibrated uncertainty in and out of the training data distribution on two public molecular benchmark datasets, QM9 and PC9. The proposed method provides a general framework for training and evaluating neural network ensemble models that are able to produce accurate predictions of properties of molecules with well calibrated uncertainty estimates.

Submitted 3 November, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

arXiv:2011.03346 [pdf, other]

DeepDFT: Neural Message Passing Network for Accurate Charge Density Prediction

Authors: Peter Bjørn Jørgensen, Arghya Bhowmik

Abstract: We introduce DeepDFT, a deep learning model for predicting the electronic charge density around atoms, the fundamental variable in electronic structure simulations from which all ground state properties can be calculated. The model is formulated as neural message passing on a graph, consisting of interacting atom vertices and special query point vertices for which the charge density is predicted. The accuracy and scalability of the model are demonstrated for molecules, solids and liquids. The trained model achieves lower average prediction errors than the observed variations in charge density obtained from density functional theory simulations using different exchange correlation functionals.

Submitted 4 November, 2020; originally announced November 2020.

Comments: Workshop paper presented at Machine Learning for Molecules Workshop at NeurIPS 2020. Implementation and pretrained model are available at https://github.com/peterbjorgensen/DeepDFT

arXiv:2008.12645 [pdf]

doi 10.5120/ijca2020920331

Dragon Crypto -- An Innovative Cryptosystem

Authors: Awnon Bhowmik, Unnikrishnan Menon

Abstract: In recent years cyber-attacks are continuously developing. This means that hackers can find their way around the traditional cryptosystems. This calls for new and more secure cryptosystems to take their place. This paper outlines a new cryptosystem based on the dragon curve fractal. The security level of this scheme is based on multiple private keys, that are crucial for effective encryption and decryption of data. This paper discusses, how core concepts emerging from fractal geometry can be used as a trapdoor function for this cryptosystem.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 5 pages, 6 figures, 1 table

Journal ref: International Journal of Computer Applications 176(29):37-41, June 2020

arXiv:2003.11774 [pdf, other]

Image Generation Via Minimizing Fréchet Distance in Discriminator Feature Space

Authors: Khoa D. Doan, Saurav Manchanda, Fengjiao Wang, Sathiya Keerthi, Avradeep Bhowmik, Chandan K. Reddy

Abstract: For a given image generation problem, the intrinsic image manifold is often low dimensional. We use the intuition that it is much better to train the GAN generator by minimizing the distributional distance between real and generated images in a small dimensional feature space representing such a manifold than on the original pixel-space. We use the feature space of the GAN discriminator for such a representation. For distributional distance, we employ one of two choices: the Fréchet distance or direct optimal transport (OT); these respectively lead us to two new GAN methods: Fréchet-GAN and OT-GAN. The idea of employing Fréchet distance comes from the success of Fréchet Inception Distance as a solid evaluation metric in image generation. Fréchet-GAN is attractive in several ways. We propose an efficient, numerically stable approach to calculate the Fréchet distance and its gradient. The Fréchet distance estimation requires a significantly less computation time than OT; this allows Fréchet-GAN to use much larger mini-batch size in training than OT. More importantly, we conduct experiments on a number of benchmark datasets and show that Fréchet-GAN (in particular) and OT-GAN have significantly better image generation capabilities than the existing representative primal and dual GAN approaches based on the Wasserstein distance.

Submitted 30 March, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

arXiv:2002.07971 [pdf, other]

Gradient Boosting Neural Networks: GrowNet

Authors: Sarkhan Badirli, Xuanqing Liu, Zhengming Xing, Avradeep Bhowmik, Khoa Doan, Sathiya S. Keerthi

Abstract: A novel gradient boosting framework is proposed where shallow neural networks are employed as "weak learners". General loss functions are considered under this unified framework with specific examples presented for classification, regression, and learning to rank. A fully corrective step is incorporated to remedy the pitfall of greedy function approximation of classic gradient boosting decision tree. The proposed model rendered outperforming results against state-of-the-art boosting methods in all three tasks on multiple datasets. An ablation study is performed to shed light on the effect of each model components and model hyperparameters.

Submitted 14 June, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

Comments: Supplementary material starts after references

arXiv:1912.00623 [pdf, other]

Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

Authors: Aritra Bhowmik, Stefan Gumhold, Carsten Rother, Eric Brachmann

Abstract: We address a core problem of computer vision: Detection and description of 2D feature points for image matching. For a long time, hand-crafted designs, like the seminal SIFT algorithm, were unsurpassed in accuracy and efficiency. Recently, learned feature detectors emerged that implement detection and description using neural networks. Training these networks usually resorts to optimizing low-level matching scores, often pre-defining sets of image patches which should or should not match, or which should or should not contain key points. Unfortunately, increased accuracy for these low-level matching scores does not necessarily translate to better performance in high-level vision tasks. We propose a new training methodology which embeds the feature detector in a complete vision pipeline, and where the learnable parameters are trained in an end-to-end fashion. We overcome the discrete nature of key point selection and descriptor matching using principles from reinforcement learning. As an example, we address the task of relative pose estimation between a pair of images. We demonstrate that the accuracy of a state-of-the-art learning-based feature detector can be increased when trained for the task it is supposed to solve at test time. Our training methodology poses little restrictions on the task to learn, and works for any architecture which predicts key point heat maps, and descriptors for key point locations.

Submitted 20 March, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

Comments: CVPR 2020 (oral)

arXiv:1805.08631 [pdf]

Analysis of the Veracities of Industry Used Software Development Life Cycle Methodologies

Authors: AZM Ehtesham Chowdhury, Abhijit Bhowmik, Hasibul Hasan, Md Shamsur Rahim

Abstract: Currently, software industries are using different SDLC (software development life cycle) models which are designed for specific purposes. The use of technology is booming in every perspective of life and the software behind the technology plays an enormous role. As the technical complexities are increasing, successful development of software solely depends on the proper management of development processes. So, it is inevitable to introduce improved methodologies in the industry so that modern human-centred software applications development can be managed and delivered to the user successfully. So, in this paper, we have explored the facts of different SDLC models and performed a comparative analysis of them.

Submitted 15 July, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

Comments: 7 pages

arXiv:1705.05548 [pdf, other]

Intel RealSense Stereoscopic Depth Cameras

Authors: Leonid Keselman, John Iselin Woodfill, Anders Grunnet-Jepsen, Achintya Bhowmik

Abstract: We present a comprehensive overview of the stereoscopic Intel RealSense RGBD imaging systems. We discuss these systems' mode-of-operation, functional behavior and include models of their expected performance, shortcomings, and limitations. We provide information about the systems' optical characteristics, their correlation algorithms, and how these properties can affect different applications, including 3D reconstruction and gesture recognition. Our discussion covers the Intel RealSense R200 and the Intel RealSense D400 (formerly RS400).

Submitted 29 October, 2017; v1 submitted 16 May, 2017; originally announced May 2017.

Comments: Accepted to CCD 2017, a CVPR 2017 Workshop

ACM Class: I.4.8

arXiv:1605.04764 [pdf, other]

Geometry Aware Mappings for High Dimensional Sparse Factors

Authors: Avradeep Bhowmik, Nathan Liu, Erheng Zhong, Badri Narayan Bhaskar, Suju Rajan

Abstract: While matrix factorisation models are ubiquitous in large scale recommendation and search, real time application of such models requires inner product computations over an intractably large set of item factors. In this manuscript we present a novel framework that uses the inverted index representation to exploit structural properties of sparse vectors to significantly reduce the run time computational cost of factorisation models. We develop techniques that use geometry aware permutation maps on a tessellated unit sphere to obtain high dimensional sparse embeddings for latent factors with sparsity patterns related to angular closeness of the original latent factors. We also design several efficient and deterministic realisations within this framework and demonstrate with experiments that our techniques lead to faster run time operation with minimal loss of accuracy.

Submitted 16 May, 2016; originally announced May 2016.

Comments: AISTATS 2016, 13 pages, 5 figures

arXiv:1605.04466 [pdf, other]

Generalized Linear Models for Aggregated Data

Authors: Avradeep Bhowmik, Joydeep Ghosh, Oluwasanmi Koyejo

Abstract: Databases in domains such as healthcare are routinely released to the public in aggregated form. Unfortunately, naive modeling with aggregated data may significantly diminish the accuracy of inferences at the individual level. This paper addresses the scenario where features are provided at the individual level, but the target variables are only available as histogram aggregates or order statistics. We consider a limiting case of generalized linear modeling when the target variables are only known up to permutation, and explore how this relates to permutation testing, a standard technique for assessing statistical dependency. Based on this relationship, we propose a simple algorithm to estimate the model parameters and individual level inferences via alternating imputation and standard generalized linear model fitting. Our results suggest the effectiveness of the proposed approach when, in the original data, permutation testing accurately ascertains the veracity of the linear relationship. The framework is extended to general histogram data with larger bins, with order statistics such as the median as a limiting case. Our experimental results on simulated data and aggregated healthcare data suggest a diminishing returns property with respect to the granularity of the histogram: when a linear relationship holds in the original data, the targets can be predicted accurately given relatively coarse histograms.

Submitted 14 May, 2016; originally announced May 2016.

Comments: AISTATS 2015, 9 pages, 6 figures

arXiv:1605.04465 [pdf, other]

Monotone Retargeting for Unsupervised Rank Aggregation with Object Features

Authors: Avradeep Bhowmik, Joydeep Ghosh

Abstract: Learning the true ordering between objects by aggregating a set of expert opinion rank order lists is an important and ubiquitous problem in many applications ranging from social choice theory to natural language processing and search aggregation. We study the problem of unsupervised rank aggregation where no ground truth ordering information is available, neither about the true preference ordering between any set of objects nor about the quality of individual rank lists. Aggregating the often inconsistent and poor quality rank lists in such an unsupervised manner is a highly challenging problem, and standard consensus-based methods are often ill-defined and difficult to solve. In this manuscript we propose a novel framework to bypass these issues by using object attributes to augment the standard rank aggregation framework. We design algorithms that learn joint models on both rank lists and object features to obtain an aggregated rank ordering that is more accurate and robust, and also helps weed out rank lists of dubious validity. We validate our techniques on synthetic datasets where our algorithm is able to estimate the true rank ordering even when the rank lists are corrupted. Experiments on three real datasets, MQ2008, MQ2008 and OHSUMED, show that using object features can result in significant improvement in performance over existing rank aggregation methods that do not use object information. Furthermore, when at least some of the rank lists are of high quality, our methods are able to effectively exploit their high expertise to output an aggregated rank ordering of great accuracy.

Submitted 14 May, 2016; originally announced May 2016.

Comments: 15 pages, 2 figures, 1 table
