Search | arXiv e-print repository

Physics Informed Kolmogorov-Arnold Neural Networks for Dynamical Analysis via Efficent-KAN and WAV-KAN

Authors: Subhajit Patra, Sonali Panda, Bikram Keshari Parida, Mahima Arya, Kurt Jacobs, Denys I. Bondar, Abhijit Sen

Abstract: Physics-informed neural networks have proven to be a powerful tool for solving differential equations, leveraging the principles of physics to inform the learning process. However, traditional deep neural networks often face challenges in achieving high accuracy without incurring significant computational costs. In this work, we implement the Physics-Informed Kolmogorov-Arnold Neural Networks (PIKAN) through efficient-KAN and WAV-KAN, which utilize the Kolmogorov-Arnold representation theorem. PIKAN demonstrates superior performance compared to conventional deep neural networks, achieving the same level of accuracy with fewer layers and reduced computational overhead. We explore both B-spline and wavelet-based implementations of PIKAN and benchmark their performance across various ordinary and partial differential equations using unsupervised (data-free) and supervised (data-driven) techniques. For certain differential equations, the data-free approach suffices to find accurate solutions, while in more complex scenarios, the data-driven method enhances the PIKAN's ability to converge to the correct solution. We validate our results against numerical solutions and achieve $99 \%$ accuracy in most scenarios.

Submitted 28 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

arXiv:2407.14224 [pdf, other]

Hierarchical Windowed Graph Attention Network and a Large Scale Dataset for Isolated Indian Sign Language Recognition

Authors: Suvajit Patra, Arkadip Maitra, Megha Tiwari, K. Kumaran, Swathy Prabhu, Swami Punyeshwarananda, Soumitra Samanta

Abstract: Automatic Sign Language (SL) recognition is an important task in the computer vision community. To build a robust SL recognition system, we need a considerable amount of data which is lacking particularly in Indian sign language (ISL). In this paper, we propose a large-scale isolated ISL dataset and a novel SL recognition model based on skeleton graph structure. The dataset covers 2,002 daily used common words in the deaf community recorded by 20 (10 male and 10 female) deaf adult signers (contains 40033 videos). We propose a SL recognition model namely Hierarchical Windowed Graph Attention Network (HWGAT) by utilizing the human upper body skeleton graph structure. The HWGAT tries to capture distinctive motions by giving attention to different body parts induced by the human skeleton graph structure. The utility of the proposed dataset and the usefulness of our model are evaluated through extensive experiments. We pre-trained the proposed model on the proposed dataset and fine-tuned it across different sign language datasets further boosting the performance of 1.10, 0.46, 0.78, and 6.84 percentage points on INCLUDE, LSA64, AUTSL and WLASL respectively compared to the existing state-of-the-art skeleton-based models.

Submitted 19 July, 2024; originally announced July 2024.

arXiv:2406.16077 [pdf, other]

doi 10.1145/3637528.3671623

Detecting Abnormal Operations in Concentrated Solar Power Plants from Irregular Sequences of Thermal Images

Authors: Sukanya Patra, Nicolas Sournac, Souhaib Ben Taieb

Abstract: Concentrated Solar Power (CSP) plants store energy by heating a storage medium with an array of mirrors that focus sunlight onto solar receivers atop a central tower. Operating at high temperatures these receivers face risks such as freezing, deformation, and corrosion, leading to operational failures, downtime, or costly equipment damage. We study the problem of anomaly detection (AD) in sequences of thermal images collected over a year from an operational CSP plant. These images are captured at irregular intervals ranging from one to five minutes throughout the day by infrared cameras mounted on solar receivers. Our goal is to develop a method to extract useful representations from high-dimensional thermal images for AD. It should be able to handle temporal features of the data, which include irregularity, temporal dependency between images and non-stationarity due to a strong daily seasonal pattern. The co-occurrence of low-temperature anomalies that resemble normal images from the start and the end of the operational cycle with high-temperature anomalies poses an additional challenge. We first evaluate state-of-the-art deep image-based AD methods, which have been shown to be effective in deriving meaningful image representations for the detection of anomalies. Then, we introduce a forecasting-based AD method that predicts future thermal images from past sequences and timestamps via a deep sequence model. This method effectively captures specific temporal data features and distinguishes between difficult-to-detect temperature-based anomalies.
Our experiments demonstrate the effectiveness of our approach compared to multiple SOTA baselines across multiple evaluation metrics. We have also successfully deployed our solution on five months of unseen data, providing critical insights for the maintenance of the CSP plant. Our code is available at: https://tinyurl.com/ForecastAD

Submitted 23 June, 2024; originally announced June 2024.

Comments: Accepted in KDD 2024

arXiv:2406.12908 [pdf, other]

Rating Multi-Modal Time-Series Forecasting Models (MM-TSFM) for Robustness Through a Causal Lens

Authors: Kausik Lakkaraju, Rachneet Kaur, Zhen Zeng, Parisa Zehtabi, Sunandita Patra, Biplav Srivastava, Marco Valtorta

Abstract: AI systems are notorious for their fragility; minor input changes can potentially cause major output swings. When such systems are deployed in critical areas like finance, the consequences of their uncertain behavior could be severe. In this paper, we focus on multi-modal time-series forecasting, where imprecision due to noisy or incorrect data can lead to erroneous predictions, impacting stakeholders such as analysts, investors, and traders. Recently, it has been shown that beyond numeric data, graphical transformations can be used with advanced visual models to achieve better performance. In this context, we introduce a rating methodology to assess the robustness of Multi-Modal Time-Series Forecasting Models (MM-TSFM) through causal analysis, which helps us understand and quantify the isolated impact of various attributes on the forecasting accuracy of MM-TSFM. We apply our novel rating method on a variety of numeric and multi-modal forecasting models in a large experimental setup (six input settings of control and perturbations, ten data distributions, time series from six leading stocks in three industries over a year of data, and five time-series forecasters) to draw insights on robust forecasting models and the context of their strengths. Within the scope of our study, our main result is that multi-modal (numeric + visual) forecasting, which was found to be more accurate than numeric forecasting in previous studies, can also be more robust in diverse settings.
Our work will help different stakeholders of time-series forecasting understand the models' behaviors along trust (robustness) and accuracy dimensions to select an appropriate model for forecasting using our rating method, leading to improved decision-making.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2403.13064 [pdf, other]

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model

Authors: Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Duncan Frost, Luke Holland, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas

Abstract: We introduce SceneScript, a method that directly produces full scene models as a sequence of structured language commands using an autoregressive, token-based approach. Our proposed scene representation is inspired by recent successes in transformers & LLMs, and departs from more traditional methods which commonly describe scenes as meshes, voxel grids, point clouds or radiance fields. Our method infers the set of structured language commands directly from encoded visual data using a scene language encoder-decoder architecture. To train SceneScript, we generate and release a large-scale synthetic dataset called Aria Synthetic Environments consisting of 100k high-quality indoor scenes, with photorealistic and ground-truth annotated renders of egocentric scene walkthroughs. Our method gives state-of-the-art results in architectural layout estimation, and competitive results in 3D object detection. Lastly, we explore an advantage for SceneScript, which is the ability to readily adapt to new commands via simple additions to the structured language, which we illustrate for tasks such as coarse 3D object part reconstruction.

Submitted 19 March, 2024; originally announced March 2024.

Comments: see project page, https://projectaria.com/scenescript

arXiv:2311.02480 [pdf, other]

doi 10.1109/SSP53291.2023.10207960

A Strictly Bounded Deep Network for Unpaired Cyclic Translation of Medical Images

Authors: Swati Rai, Jignesh S. Bhatt, Sarat Kumar Patra

Abstract: Medical image translation is an ill-posed problem. Unlike existing paired unbounded unidirectional translation networks, in this paper, we consider unpaired medical images and provide a strictly bounded network that yields a stable bidirectional translation. We propose a patch-level concatenated cyclic conditional generative adversarial network (pCCGAN) embedded with adaptive dictionary learning. It consists of two cyclically connected CGANs of 47 layers each; where both generators (each of 32 layers) are conditioned with concatenation of alternate unpaired patches from input and target modality images (not ground truth) of the same organ. The key idea is to exploit cross-neighborhood contextual feature information that bounds the translation space and boosts generalization. The generators are further equipped with adaptive dictionaries learned from the contextual patches to reduce possible degradation. Discriminators are 15-layer deep networks that employ minimax function to validate the translated imagery. A combined loss function is formulated with adversarial, non-adversarial, forward-backward cyclic, and identity losses that further minimize the variance of the proposed learning machine. Qualitative, quantitative, and ablation analysis show superior results on real CT and MRI.

Submitted 4 November, 2023; originally announced November 2023.

Journal ref: 2023 IEEE Statistical Signal Processing Workshop (SSP), Hanoi, Vietnam, 2023, pp. 61-65

arXiv:2309.15642 [pdf, other]

doi 10.1103/PhysRevResearch.6.013326

Efficient tensor network simulation of IBM's largest quantum processors

Authors: Siddhartha Patra, Saeed S. Jahromi, Sukhbinder Singh, Roman Orus

Abstract: We show how quantum-inspired 2d tensor networks can be used to efficiently and accurately simulate the largest quantum processors from IBM, namely Eagle (127 qubits), Osprey (433 qubits) and Condor (1121 qubits). We simulate the dynamics of a complex quantum many-body system -- specifically, the kicked Ising experiment considered recently by IBM in Nature 618, p. 500-505 (2023) -- using graph-based Projected Entangled Pair States (gPEPS), which was proposed by some of us in PRB 99, 195105 (2019). Our results show that simple tensor updates are already sufficient to achieve very large unprecedented accuracy with remarkably low computational resources for this model. Apart from simulating the original experiment for 127 qubits, we also extend our results to 433 and 1121 qubits, and for evolution times around 8 times longer, thus setting a benchmark for the newest IBM quantum machines. We also report accurate simulations for infinitely-many qubits. Our results show that gPEPS are a natural tool to efficiently simulate quantum computers with an underlying lattice-based qubit connectivity, such as all quantum processors based on superconducting qubits.

Submitted 2 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

Comments: 7 pages, 8 figures, revised version

Journal ref: Phys. Rev. Research 6, 013326 (2024)

arXiv:2309.00379 [pdf, other]

Anomaly detection with semi-supervised classification based on risk estimators

Authors: Le Thi Khanh Hien, Sukanya Patra, Souhaib Ben Taieb

Abstract: A significant limitation of one-class classification anomaly detection methods is their reliance on the assumption that unlabeled training data only contains normal instances. To overcome this impractical assumption, we propose two novel classification-based anomaly detection methods. Firstly, we introduce a semi-supervised shallow anomaly detection method based on an unbiased risk estimator. Secondly, we present a semi-supervised deep anomaly detection method utilizing a nonnegative (biased) risk estimator. We establish estimation error bounds and excess risk bounds for both risk minimizers. Additionally, we propose techniques to select appropriate regularization parameters that ensure the nonnegativity of the empirical risk in the shallow model under specific loss functions. Our extensive experiments provide strong evidence of the effectiveness of the risk-based anomaly detection methods.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.13561 [pdf, other]

Project Aria: A New Tool for Egocentric Multi-Modal AI Research

Authors: Jakob Engel, Kiran Somasundaram, Michael Goesele, Albert Sun, Alexander Gamino, Andrew Turner, Arjang Talattof, Arnie Yuan, Bilal Souti, Brighid Meredith, Cheng Peng, Chris Sweeney, Cole Wilson, Dan Barnes, Daniel DeTone, David Caruso, Derek Valleroy, Dinesh Ginjupalli, Duncan Frost, Edward Miller, Elias Mueggler, Evgeniy Oleinik, Fan Zhang, Guruprasad Somasundaram, Gustavo Solaira, et al. (49 additional authors not shown)

Abstract: Egocentric, multi-modal data as available on future augmented reality (AR) devices provides unique challenges and opportunities for machine perception. These future devices will need to be all-day wearable in a socially acceptable form-factor to support always available, context-aware and personalized AI applications. Our team at Meta Reality Labs Research built the Aria device, an egocentric, multi-modal data recording and streaming device with the goal to foster and accelerate research in this area. In this paper, we describe the Aria device hardware including its sensor configuration and the corresponding software tools that enable recording and processing of such data.

Submitted 1 October, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.12367 [pdf, other]

doi 10.1609/aaai.v38i14.29522

SafeAR: Safe Algorithmic Recourse by Risk-Aware Policies

Authors: Haochen Wu, Shubham Sharma, Sunandita Patra, Sriram Gopalakrishnan

Abstract: With the growing use of machine learning (ML) models in critical domains such as finance and healthcare, the need to offer recourse for those adversely affected by the decisions of ML models has become more important; individuals ought to be provided with recommendations on actions to take for improving their situation and thus receiving a favorable decision. Prior work on sequential algorithmic recourse -- which recommends a series of changes -- focuses on action feasibility and uses the proximity of feature changes to determine action costs. However, the uncertainties of feature changes and the risk of higher than average costs in recourse have not been considered. It is undesirable if a recourse could (with some probability) result in a worse situation from which recovery requires an extremely high cost. It is essential to incorporate risks when computing and evaluating recourse. We call the recourse computed with such risk considerations as Safe Algorithmic Recourse (SafeAR). The objective is to empower people to choose a recourse based on their risk tolerance. In this work, we discuss and show how existing recourse desiderata can fail to capture the risk of higher costs. We present a method to compute recourse policies that consider variability in cost and connect algorithmic recourse literature with risk-sensitive reinforcement learning. We also adopt measures "Value at Risk" and "Conditional Value at Risk" from the financial literature to summarize risk concisely.
We apply our method to two real-world datasets and compare policies with different risk-aversion levels using risk measures and recourse desiderata (sparsity and proximity).

Submitted 12 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted to AAAI 2024 main track with oral presentation; Supplemental material appended to main paper

Journal ref: AAAI 2024, 38(14), 15915-15923

arXiv:2305.08887 [pdf]

Covariate-distance Weighted Regression (CWR): A Case Study for Estimation of House Prices

Authors: Hone-Jay Chu, Po-Hung Chen, Sheng-Mao Chang, Muhammad Zeeshan Ali, Sumriti Ranjan Patra

Abstract: Geographically weighted regression (GWR) is a popular tool for modeling spatial heterogeneity in a regression model. However, the current weighting function used in GWR only considers the geographical distance, while the attribute similarity is totally ignored. In this study, we proposed a covariate weighting function that combines the geographical distance and attribute distance. The covariate-distance weighted regression (CWR) is the extension of GWR including geographical distance and attribute distance. House prices are affected by numerous factors, such as house age, floor area, and land use. Prediction model is used to help understand the characteristics of regional house prices. The CWR was used to understand the relationship between the house price and controlling factors. The CWR can consider the geological and attribute distances, and produce accurate estimates of house price that preserve the weight matrix for geological and attribute distance functions. Results show that the house attributes/conditions and the characteristics of the house, such as floor area and house age, might affect the house price. After factor selection, in which only house age and floor area of a building are considered, the RMSE of the CWR model can be improved by 2.9%-26.3% for skyscrapers when compared to the GWR. CWR can effectively reduce estimation errors from traditional spatial regression models and provide novel and feasible models for spatial estimation.

Submitted 14 May, 2023; originally announced May 2023.

arXiv:2303.13594 [pdf]

Digital Library Initiatives in India: A Comprehensive Study

Authors: Sankhayan Mukherjee, Swapan Kumar Patra

Abstract: This study is a survey of digital library initiatives in India collecting secondary information from about fifty digital libraries from their respective websites. The findings show that in most cases the actual conception of the digital library is still in a nascent stage. Online subscriptions and links to third-party websites are also considered digital libraries. However, many digital libraries do not have any proper search interface on their respective website due to improper arrangement of metadata. In some cases, they do not have their own digitized collection and provided other collections or referred their users to some third-party website. Moreover, there are many digital libraries that cannot be accessed outside (remote access) of the organization. Hence, regular website maintenance, remote access facility, and proper training of information professionals are required. Moreover, the so-called digital libraries in India have not developed their own standards or are not following any global standards. However, the usage statistics for government digital libraries are far better than the usage statistics of academic or public libraries. Users are perhaps more interested in government rules, laws, orders, etc. That is perhaps a positive sign of digital governance and reaching the public. There are several important observations and policy suggestions that may be helpful for students, scholars, library professionals, and the decision-makers in the government.

Submitted 23 March, 2023; originally announced March 2023.

Comments: 14 Pages, 6 Figures

arXiv:2210.00203 [pdf]

Digital Library Initiatives in North East India: A Survey

Authors: Sankhayan Mukherjee, Swapan Kumar Patra

Abstract: This is a survey of digital library initiatives in North East India. The recent initiative by the government of India towards digitization is reflected in various digitization programs. Secondary sources of data are used to map the 16 digital library initiatives in the eight north-eastern states of India. The study has observed that a digital library in the true sense is perhaps lacking. Many of the digital libraries are not accessible from the outside and lack regular maintenance. In this context, a national-level policy initiative is the need of the hour, including various stakeholders from academia, library professionals, etc. The study also comes up with various important observations and policy suggestions which may be helpful for scholars, librarians, and policy and decision makers in the government.

Submitted 1 October, 2022; originally announced October 2022.

Comments: 14 pages, 2 tables

arXiv:2207.07788 [pdf, other]

doi 10.1145/3534678.3539165

Greykite: Deploying Flexible Forecasting at Scale at LinkedIn

Authors: Reza Hosseini, Albert Chen, Kaixu Yang, Sayan Patra, Yi Su, Saad Eddin Al Orjany, Sishi Tang, Parvez Ahammad

Abstract: Forecasts help businesses allocate resources and achieve objectives. At LinkedIn, product owners use forecasts to set business targets, track outlook, and monitor health. Engineers use forecasts to efficiently provision hardware. Developing a forecasting solution to meet these needs requires accurate and interpretable forecasts on diverse time series with sub-hourly to quarterly frequencies. We present Greykite, an open-source Python library for forecasting that has been deployed on over twenty use cases at LinkedIn. Its flagship algorithm, Silverkite, provides interpretable, fast, and highly flexible univariate forecasts that capture effects such as time-varying growth and seasonality, autocorrelation, holidays, and regressors. The library enables self-serve accuracy and trust by facilitating data exploration, model configuration, execution, and interpretation. Our benchmark results show excellent out-of-the-box speed and accuracy on datasets from a variety of domains. Over the past two years, Greykite forecasts have been trusted by Finance, Engineering, and Product teams for resource planning and allocation, target setting and progress tracking, anomaly detection and root cause analysis. We expect Greykite to be useful to forecast practitioners with similar applications who need accurate, interpretable forecasts that capture complex dynamics common to time series related to human activity.

Submitted 15 July, 2022; originally announced July 2022.

Comments: In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '22), August 14-18, 2022, Washington, DC, USA. ACM, New York, NY, USA, 11 pages

ACM Class: G.3

arXiv:2112.03580 [pdf]

Disability and Library Services: Global Research Trend

Authors: Swapan Kumar Patra

Abstract: Research on differently abled persons and their use of libraries has been getting global attention in recent years. The field has shown modest, continuous but wide-scale growth. This research paper aimed at capturing the dynamics of the field using various bibliometrics and text mining tools. The bibliographic data of journal articles published in the field were collected from the Web of Science (WoS) database. The records were collected from the year 1991 to 2021 and analysed to observe the trends of literature growth, core journals, institutes from where most of the literature is being published, prominent keywords and so on. The results show that there is a significant growth of publications since the year 2000. The trends show that the research in these areas is mostly emerging from developed countries. The developing countries should also pay more attention to research in this area because the needs of differently abled people in developing countries may differ from those in developed countries.

Submitted 7 December, 2021; originally announced December 2021.

arXiv:2110.15279 [pdf, other]

SVM and ANN based Classification of EMG signals by using PCA and LDA

Authors: Hritam Basak, Alik Roy, Jeet Bandhu Lahiri, Sayantan Bose, Soumyadeep Patra

Abstract: In recent decades, biomedical signals have been used for communication in Human-Computer Interfaces (HCI) for medical applications; an instance of these signals are the myoelectric signals (MES), which are generated in the muscles of the human body as unidimensional patterns. Because of this, the methods and algorithms developed for pattern recognition in signals can be applied for their analyses once these signals have been sampled and turned into electromyographic (EMG) signals. Additionally, in recent years, many researchers have dedicated their efforts to studying prosthetic control utilizing EMG signal classification, that is, by logging a set of MES in a proper range of frequencies to classify the corresponding EMG signals. The feature classification can be carried out on the time domain or by using other domains such as the frequency domain (also known as the spectral domain), time scale, and time-frequency, amongst others. One of the main methods used for pattern recognition in myoelectric signals is the Support Vector Machines (SVM) technique whose primary function is to identify an n-dimensional hyperplane to separate a set of input feature points into different classes. This technique has the potential to recognize complex patterns and on several occasions, it has proven its worth when compared to other classifiers such as Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA).
The key concepts underlying the SVM are (a) the hyperplane separator; (b) the kernel function; (c) the optimal separation hyperplane; and (d) a soft margin (hyperplane tolerance).

Submitted 22 October, 2021; originally announced October 2021.

arXiv:2107.13238 [pdf]

Library and Information Science Research in Indian Universities: Growth, Core Journals, Keywords and Collaboration Patterns

Authors: Swapan Kumar Patra

Abstract: This article maps Library and Information Science (LIS) research in Indian universities. As the two prominent citation databases, Web of Science and Scopus have very limited coverage of Indian LIS journals, the publications generated by the library and science departments of about 114 selected Indian universities and the two national institutions of importance in LIS research were extracted from Library, Information Science & Technology Abstracts (LISTA). The relevant publication records were analyzed using scientometrics and Social Network Analysis (SNA) tools. The study traces the growth of publications, prominent keywords, leading journals where the articles are published and the institutional collaboration patterns of Indian university publications. The results show that there is a growth in scholarly publications from Indian universities in LIS. However, the numbers of publications are limited to only a few universities and national institutes of importance. The maximum LIS research outputs are published in Indian journals. Bibliometrics related investigations are the most important research areas. Located in major cities of India, the productive institutes show healthy collaboration. The study concludes with some observations which may be useful for formulating policies in LIS research in India.

Submitted 28 July, 2021; originally announced July 2021.

Comments: 4 Figures, 3 Tables

arXiv:2103.06575 [pdf, other]

An unsupervised deep learning framework for medical image denoising

Authors: Swati Rai, Jignesh S. Bhatt, S. K. Patra

Abstract: Medical image acquisition is often intervented by unwanted noise that corrupts the information content. This paper introduces an unsupervised medical image denoising technique that learns noise characteristics from the available images and constructs denoised images. It comprises of two blocks of data processing, viz., patch-based dictionaries that indirectly learn the noise and residual learning (RL) that directly learns the noise. The model is generalized to account for both 2D and 3D images considering different medical imaging instruments. The images are considered one-by-one from the stack of MRI/CT images as well as the entire stack is considered, and decomposed into overlapping image/volume patches. These patches are given to the patch-based dictionary learning to learn noise characteristics via sparse representation while given to the RL part to directly learn the noise properties. K-singular value decomposition (K-SVD) algorithm for sparse representation is used for training patch-based dictionaries. On the other hand, residue in the patches is trained using the proposed deep residue network. Iterating on these two parts, an optimum noise characterization for each image/volume patch is captured and in turn it is subtracted from the available respective image/volume patch. The obtained denoised image/volume patches are finally assembled to a denoised image or 3D stack. We provide an analysis of the proposed approach with other approaches. Experiments on MRI/CT datasets are run on a GPU-based supercomputer and the comparative results show that the proposed algorithm preserves the critical information in the images as well as improves the visual quality of the images.

Submitted 11 March, 2021; originally announced March 2021.

Comments: 22 pages, 7 figures, 4 tables

arXiv:2010.01909 [pdf, other]

doi 10.1016/j.artint.2021.103523

Deliberative Acting, Online Planning and Learning with Hierarchical Operational Models

Authors: Sunandita Patra, James Mason, Malik Ghallab, Dana Nau, Paolo Traverso

Abstract: In AI research, synthesizing a plan of action has typically used descriptive models of the actions that abstractly specify what might happen as a result of an action, and are tailored for efficiently computing state transitions. However, executing the planned actions has needed operational models, in which rich computational control structures and closed-loop online decision-making are used to specify how to perform an action in a nondeterministic execution context, react to events and adapt to an unfolding situation. Deliberative actors, which integrate acting and planning, have typically needed to use both of these models together -- which causes problems when attempting to develop the different models, verify their consistency, and smoothly interleave acting and planning. As an alternative, we define and implement an integrated acting and planning system in which both planning and acting use the same operational models. These rely on hierarchical task-oriented refinement methods offering rich control structures. The acting component, called Reactive Acting Engine (RAE), is inspired by the well-known PRS system. At each decision step, RAE can get advice from a planner for a near-optimal choice with respect to a utility function. The anytime planner uses a UCT-like Monte Carlo Tree Search procedure, called UPOM, whose rollouts are simulations of the actor's operational models. We also present learning strategies for use with RAE and UPOM that acquire, from online acting experiences and/or simulated planning results, a mapping from decision contexts to method instances as well as a heuristic function to guide UPOM. We demonstrate the asymptotic convergence of UPOM towards optimal methods in static domains, and show experimentally that UPOM and the learning strategies significantly improve the acting efficiency and robustness.

Submitted 15 November, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

Comments: Published in Artificial Intelligence (AIJ). Please cite as: Sunandita Patra, James Mason, Malik Ghallab, Dana Nau, Paolo Traverso. Deliberative Acting, Planning and Learning with Hierarchical Operational Models. Artificial Intelligence, Elsevier, 2021, 299, pp.103523. 10.1016/j.artint.2021.103523. arXiv admin note: text overlap with arXiv:2003.03932

Journal ref: Artificial Intelligence, Elsevier, 2021, 299, pp.103523

arXiv:2003.03932 [pdf, other]

Integrating Acting, Planning and Learning in Hierarchical Operational Models

Authors: Sunandita Patra, James Mason, Amit Kumar, Malik Ghallab, Paolo Traverso, Dana Nau

Abstract: We present new planning and learning algorithms for RAE, the Refinement Acting Engine. RAE uses hierarchical operational models to perform tasks in dynamically changing environments. Our planning procedure, UPOM, does a UCT-like search in the space of operational models in order to find a near-optimal method to use for the task and context at hand. Our learning strategies acquire, from online acting experiences and/or simulated planning results, a mapping from decision contexts to method instances as well as a heuristic function to guide UPOM. Our experimental results show that UPOM and our learning strategies significantly improve RAE's performance in four test domains using two different metrics: efficiency and success ratio.

Submitted 9 March, 2020; originally announced March 2020.

Comments: Accepted in ICAPS 2020 (30th International Conference on Automated Planning and Scheduling)

arXiv:1711.02144 [pdf, other]

doi 10.1109/WACV.2018.00076

A Joint 3D-2D based Method for Free Space Detection on Roads

Authors: Suvam Patra, Pranjal Maheshwari, Shashank Yadav, Chetan Arora, Subhashis Banerjee

Abstract: In this paper, we address the problem of road segmentation and free space detection in the context of autonomous driving. Traditional methods either use 3-dimensional (3D) cues such as point clouds obtained from LIDAR, RADAR or stereo cameras or 2-dimensional (2D) cues such as lane markings, road boundaries and object detection. Typical 3D point clouds do not have enough resolution to detect fine differences in heights such as between road and pavement. Image based 2D cues fail when encountering uneven road textures such as due to shadows, potholes, lane markings or road restoration. We propose a novel free road space detection technique combining both 2D and 3D cues. In particular, we use CNN based road segmentation from 2D images and plane/box fitting on sparse depth data obtained from SLAM as priors to formulate an energy minimization using conditional random field (CRF), for road pixels classification. While the CNN learns the road texture and is unaffected by depth boundaries, the 3D information helps in overcoming texture based classification failures. Finally, we use the obtained road segmentation with the 3D depth data from monocular SLAM to detect the free space for the navigation purposes. Our experiments on KITTI odometry dataset, Camvid dataset, as well as videos captured by us, validate the superiority of the proposed approach over the state of the art.

Submitted 15 January, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

Comments: Accepted for publication at IEEE WACV 2018

arXiv:1707.05564 [pdf, other]

Robust Monocular SLAM for Egocentric Videos

Authors: Suvam Patra, Kartikeya Gupta, Faran Ahmad, Chetan Arora, Subhashis Banerjee

Abstract: Regardless of the tremendous progress, a truly general purpose pipeline for Simultaneous Localization and Mapping (SLAM) remains a challenge. We investigate the reported failure of state of the art (SOTA) SLAM techniques on egocentric videos. We find that the dominant 3D rotations, low parallax between successive frames, and primarily forward motion in egocentric videos are the most common causes of failures. The incremental nature of SOTA SLAM, in the presence of unreliable pose and 3D estimates in egocentric videos, with no opportunities for global loop closures, generates drifts and leads to the eventual failures of such techniques. Taking inspiration from batch mode Structure from Motion (SFM) techniques, we propose to solve SLAM as an SFM problem over the sliding temporal windows. This makes the problem well constrained. Further, we propose to initialize the camera poses using 2D rotation averaging, followed by translation averaging before structure estimation using bundle adjustment. This helps in stabilizing the camera poses when 3D estimates are not reliable. We show that the proposed SLAM technique, incorporating the two key ideas works successfully for long, shaky egocentric videos where other SOTA techniques have been reported to fail. Qualitative and quantitative comparisons on publicly available egocentric video datasets validate our results.

Submitted 17 November, 2018; v1 submitted 18 July, 2017; originally announced July 2017.

Comments: Accepted for publication at IEEE WACV 2019

arXiv:1701.04743 [pdf, other]

doi 10.1109/WACV.2017.57

Computing Egomotion with Local Loop Closures for Egocentric Videos

Authors: Suvam Patra, Himanshu Aggarwal, Himani Arora, Chetan Arora, Subhashis Banerjee

Abstract: Finding the camera pose is an important step in many egocentric video applications. It has been widely reported that, state of the art SLAM algorithms fail on egocentric videos. In this paper, we propose a robust method for camera pose estimation, designed specifically for egocentric videos. In an egocentric video, the camera views the same scene point multiple times as the wearer's head sweeps back and forth. We use this specific motion profile to perform short loop closures aligned with wearer's footsteps. For egocentric videos, depth estimation is usually noisy. In an important departure, we use 2D computations for rotation averaging which do not rely upon depth estimates. The two modification results in much more stable algorithm as is evident from our experiments on various egocentric video datasets for different egocentric applications. The proposed algorithm resolves a long standing problem in egocentric vision and unlocks new usage scenarios for future applications.

Submitted 17 January, 2017; originally announced January 2017.

Comments: Accepted in WACV 2017

arXiv:1610.00001 [pdf]

doi 10.13140/RG.2.2.16849.33129

Bacterial Foraging Optimized STATCOM for Stability Assessment in Power System

Authors: Shiba R. Paital, Prakash K. Ray, Asit Mohanty, Sandipan Patra, Harishchandra Dubey

Abstract: This paper presents a study of improvement in stability in a single machine connected to infinite bus (SMIB) power system by using static compensator (STATCOM). The gains of Proportional-Integral-Derivative (PID) controller in STATCOM are being optimized by heuristic technique based on Particle swarm optimization (PSO). Further, Bacterial Foraging Optimization (BFO) as an alternative heuristic method is also applied to select optimal gains of PID controller. The performance of STATCOM with the above soft-computing techniques are studied and compared with the conventional PID controller under various scenarios. The simulation results are accompanied with performance indices based quantitative analysis. The analysis clearly signifies the robustness of the new scheme in terms of stability and voltage regulation when compared with conventional PID.

Submitted 1 October, 2016; originally announced October 2016.

Comments: 5 pages, 7 figures, 2016 IEEE Students' Technology Symposium (TechSym 2016), At IIT Kharagpur, India

Showing 1–24 of 24 results for author: Patra, S