Search | arXiv e-print repository

Learning to Learn Faster from Human Feedback with Language Model Predictive Control

Authors: Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, et al. (25 additional authors not shown)

Abstract: Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for only as long as it fits within the context size of the LLM, and can be forgotten over longer interactions. In this work, we investigate fine-tuning the robot code-writing LLMs, to remember their in-context interactions and improve their teachability i.e., how efficiently they adapt to human inputs (measured by average number of corrections before the user considers the task successful). Our key observation is that when human-robot interactions are viewed as a partially observable Markov decision process (in which human language inputs are observations, and robot code outputs are actions), then training an LLM to complete previous interactions is training a transition dynamics model -- that can be combined with classic robotics techniques such as model predictive control (MPC) to discover shorter paths to success. This gives rise to Language Model Predictive Control (LMPC), a framework that fine-tunes PaLM 2 to improve its teachability on 78 tasks across 5 robot embodiments -- improving non-expert teaching success rates of unseen tasks by 26.9% while reducing the average number of human corrections from 2.4 to 1.9. Experiments show that LMPC also produces strong meta-learners, improving the success rate of in-context learning new tasks on unseen robot embodiments and APIs by 31.5%. See videos, code, and demos at: https://robot-teaching.github.io/.

Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

arXiv:2312.04423 [pdf, other]

Scalable Knowledge Graph Construction and Inference on Human Genome Variants

Authors: Shivika Prasanna, Deepthi Rao, Eduardo Simoes, Praveen Rao

Abstract: Real-world knowledge can be represented as a graph consisting of entities and relationships between the entities. The need for efficient and scalable solutions arises when dealing with vast genomic data, like RNA-sequencing. Knowledge graphs offer a powerful approach for various tasks in such large-scale genomic data, such as analysis and inference. In this work, variant-level information extracted from the RNA-sequences of vaccine-naïve COVID-19 patients has been represented as a unified, large knowledge graph. Variant call format (VCF) files containing the variant-level information were annotated to include further information for each variant. The data records in the annotated files were then converted to Resource Description Framework (RDF) triples. Each VCF file obtained had an associated CADD scores file that contained the raw and Phred-scaled scores for each variant. An ontology was defined for the VCF and CADD scores files. Using this ontology and the extracted information, a large, scalable knowledge graph was created. Available graph storage was then leveraged to query and create datasets for further downstream tasks. We also present a case study using the knowledge graph and perform a classification task using graph machine learning. We also draw comparisons between different Graph Neural Networks (GNNs) for the case study.
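
To make the VCF-to-RDF conversion step concrete, here is a minimal sketch under stated assumptions; it is not the authors' pipeline. The namespace `http://example.org/variant/`, the predicate names, and the helper `vcf_line_to_triples` are hypothetical. Only the standard tab-separated VCF columns (CHROM, POS, ID, REF, ALT) are taken as given.

```python
# Hypothetical namespace and predicate names, chosen only for illustration.
BASE = "http://example.org/variant/"


def vcf_line_to_triples(line):
    """Convert one VCF data line into a list of N-Triples strings.

    Only the first five fixed VCF columns are read:
    CHROM, POS, ID, REF, ALT (the ID column is ignored here).
    """
    chrom, pos, _vid, ref, alt = line.rstrip("\n").split("\t")[:5]
    # One subject IRI per variant, built from its coordinates and alleles.
    subject = f"<{BASE}{chrom}_{pos}_{ref}_{alt}>"
    fields = {
        "chromosome": chrom,
        "position": pos,
        "reference": ref,
        "alternate": alt,
    }
    # One triple per field, with the value stored as a plain literal.
    return [f'{subject} <{BASE}{name}> "{value}" .' for name, value in fields.items()]


record = "1\t12345\t.\tA\tG\t50\tPASS\t."
for triple in vcf_line_to_triples(record):
    print(triple)
```

A real ontology would normally type the literals (for example, position as an integer) and add the CADD scores as further predicates on the same subject; both are left out of this sketch.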

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2311.06261 [pdf, other]

With ChatGPT, do we have to rewrite our learning objectives -- CASE study in Cybersecurity

Authors: Peter Jamieson, Suman Bhunia, Dhananjai M. Rao

Abstract: With the emergence of Artificial Intelligence chatbot tools such as ChatGPT and code writing AI tools such as GitHub Copilot, educators need to question what and how we should teach our courses and curricula in the future. In reality, automated tools may result in certain academic fields being deeply reduced in the number of employable people. In this work, we make a case study of cybersecurity undergrad education by using the lens of "Understanding by Design" (UbD). First, we provide a broad understanding of learning objectives (LOs) in cybersecurity from a computer science perspective. Next, we dig a little deeper into a curriculum with an undergraduate emphasis on cybersecurity and examine the major courses and their LOs for our cybersecurity program at Miami University. With these details, we perform a thought experiment on how attainable the LOs are with the above-described tools, asking the key question "what needs to be enduring concepts?" learned in this process. If an LO becomes something that the existence of automation tools might be able to do, we then ask "what level is attainable for the LO that is not a simple query to the tools?". With this exercise, we hope to establish an example of how to prompt ChatGPT to accelerate students in their achievements of LOs given the existence of these new AI tools, and our goal is to push all of us to leverage and teach these tools as powerful allies in our quest to improve human existence and knowledge.

Submitted 26 September, 2023; originally announced November 2023.

arXiv:2309.11512 [pdf, other]

Multidimensional well-being of US households at a fine spatial scale using fused household surveys: fusionACS

Authors: Kevin Ummel, Miguel Poblete-Cazenave, Karthik Akkiraju, Nick Graetz, Hero Ashman, Cora Kingdon, Steven Herrera Tenorio, Aaryaman "Sunny" Singhal, Daniel Aldana Cohen, Narasimha D. Rao

Abstract: Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting research questions to those that can be answered by a single survey. The fusionACS project seeks to integrate data from multiple U.S. household surveys by statistically "fusing" variables from "donor" surveys onto American Community Survey (ACS) microdata. This results in an integrated microdataset of household attributes and well-being dimensions that can be analyzed to address research questions in ways that are not currently possible. The presented data comprise the fusion onto the ACS of select donor variables from the Residential Energy Consumption Survey (RECS) of 2015, the National Household Transportation Survey (NHTS) of 2017, the American Housing Survey (AHS) of 2019, and the Consumer Expenditure Survey - Interview (CEI) for the years 2015-2019. The underlying statistical techniques are included in an open-source R package, fusionModel, that provides generic tools for the creation, analysis, and validation of fused microdata.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 35 pages, 6 figures

arXiv:2306.11706 [pdf, other]

RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, et al. (14 additional authors not shown)

Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned decision transformer capable of consuming action-labelled visual experience. This data spans a large repertoire of motor control skills from simulated and real robotic arms with varying sets of observations and actions. With RoboCat, we demonstrate the ability to generalise to new tasks and robots, both zero-shot as well as through adaptation using only 100-1000 examples for the target task. We also show how a trained model itself can be used to generate data for subsequent training iterations, thus providing a basic building block for an autonomous improvement loop. We investigate the agent's capabilities, with large-scale evaluations both in simulation and on three different real robot embodiments. We find that as we grow and diversify its training data, RoboCat not only shows signs of cross-task transfer, but also becomes more efficient at adapting to new tasks.

Submitted 22 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: Transactions on Machine Learning Research (12/2023)

arXiv:2305.12696 [pdf, other]

Learning Interpretable Style Embeddings via Prompting LLMs

Authors: Ajay Patel, Delip Rao, Ansh Kothary, Kathleen McKeown, Chris Callison-Burch

Abstract: Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors; however, these approaches result in uninterpretable representations, complicating their usage in downstream applications like authorship attribution where auditing and explainability are critical. In this work, we use prompting to perform stylometry on a large number of texts to create a synthetic dataset and train human-interpretable style representations we call LISA embeddings. We release our synthetic stylometry dataset and our interpretable style models as resources.

Submitted 9 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2304.13164 [pdf, other]

Towards Compute-Optimal Transfer Learning

Authors: Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu

Abstract: The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark that offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.
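
As a rough illustration of structured pruning of convolutional filters, the sketch below keeps the filters with the largest L1 norms and drops the rest. The ranking criterion is an assumption, since the abstract does not say how filters are selected, and the function name `prune_filters` is hypothetical.

```python
def prune_filters(filters, keep_ratio):
    """Return the sorted indices of the filters to keep.

    filters: list of filters, each given as a flat list of weights.
    keep_ratio: fraction of filters to keep, with 0 < keep_ratio <= 1.
    Assumption: filters are ranked by L1 norm (sum of absolute weights).
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    norms = [sum(abs(w) for w in f) for f in filters]
    # Rank filter indices from largest to smallest norm.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


filters = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-1.5, 0.5]]
print(prune_filters(filters, keep_ratio=0.5))
```

Removing whole filters, rather than individual weights, shrinks the layer's output channels, so the saving shows up directly in dense compute instead of requiring sparse kernels.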

Submitted 25 April, 2023; originally announced April 2023.

arXiv:2303.02043 [pdf, other]

An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Authors: D. M. K. K. Venkateswara Rao, Hamed Habibi, Jose Luis Sanchez-Lopez, Holger Voos

Abstract: This paper presents an integrated approach that combines trajectory optimization and the Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcribed into a nonlinear programming problem using the Chebyshev pseudospectral method. The state and control histories are approximated by using Lagrange polynomials and the collocation points are used to satisfy constraints. A novel sigmoid-type collision avoidance constraint is proposed to overcome the drawbacks of Lagrange polynomial approximation in pseudospectral methods, which guarantees inequality constraint satisfaction only at nodal points. Automatic differentiation of cost function and constraints is used to quickly determine their gradient and Jacobian, respectively. An APF method is used to update the optimal control inputs for guaranteeing collision avoidance. The trajectory optimization and APF method are implemented in a closed-loop fashion continuously, but in parallel at moderate and high frequencies, respectively. The initial guess for the optimization is provided based on the previous solution. The proposed approach is tested and validated through indoor experiments.
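
For readers unfamiliar with artificial potential fields, the sketch below shows the classic textbook form: an attractive pull toward the goal plus a repulsive push that acts only within an influence distance of an obstacle. It is not the paper's formulation. The function name `apf_force`, the gains `k_att` and `k_rep`, and the influence distance `d0` are all assumptions made for illustration.

```python
import math


def apf_force(pos, goal, obstacle, k_att=1.0, k_rep=1.0, d0=2.0):
    """Classic artificial potential field force in 2D.

    Attractive term: -k_att * (pos - goal).
    Repulsive term, active only when the obstacle distance d < d0:
        k_rep * (1/d - 1/d0) / d**2 * (unit vector away from the obstacle),
    which is the negative gradient of 0.5 * k_rep * (1/d - 1/d0)**2.
    """
    fx = -k_att * (pos[0] - goal[0])
    fy = -k_att * (pos[1] - goal[1])
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if 0.0 < d < d0:
        mag = k_rep * (1.0 / d - 1.0 / d0) / d**2
        fx += mag * dx / d
        fy += mag * dy / d
    return fx, fy


# Obstacle sits between the vehicle and the goal, so the pull is reduced.
print(apf_force(pos=(0.0, 0.0), goal=(5.0, 0.0), obstacle=(1.0, 0.0)))
```

Because the force is a closed-form expression, it is cheap enough to evaluate at a high rate, which is consistent with the abstract's description of running the APF update faster than the trajectory optimization.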

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2302.12617 [pdf, other]

Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains

Authors: Jingwei Zhang, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao, Nicolas Heess, Martin Riedmiller

Abstract: In this paper we study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience and their utility for fast inference of (high-level) plans in downstream tasks. In particular we propose to learn a jumpy model alongside a skill embedding space offline, from previously collected experience for which no labels or reward annotations are required. We then investigate several options of harnessing those learned components in combination with model-based planning or model-free reinforcement learning (RL) to speed up learning on downstream tasks. We conduct a set of experiments in the RGB-stacking environment, showing that planning with the learned skills and the associated model can enable zero-shot generalization to new tasks, and can further speed up training of policies via reinforcement learning. These experiments demonstrate that jumpy models which incorporate temporal abstraction can facilitate planning in long-horizon tasks in which standard dynamics models fail.

Submitted 24 February, 2023; originally announced February 2023.

arXiv:2302.10147 [pdf, ps, other]

A DNN based Normalized Time-frequency Weighted Criterion for Robust Wideband DoA Estimation

Authors: Kuan-Lin Chen, Ching-Hua Lee, Bhaskar D. Rao, Harinath Garudadri

Abstract: Deep neural networks (DNNs) have greatly benefited direction of arrival (DoA) estimation methods for speech source localization in noisy environments. However, their localization accuracy is still far from satisfactory due to the vulnerability to nonspeech interference. To improve the robustness against interference, we propose a DNN based normalized time-frequency (T-F) weighted criterion which minimizes the distance between the candidate steering vectors and the filtered snapshots in the T-F domain. Our method requires no eigendecomposition and uses a simple normalization to prevent the optimization objective from being misled by noisy filtered snapshots. We also study different designs of T-F weights guided by a DNN. We find that duplicating the Hadamard product of speech ratio masks is highly effective and better than other techniques such as direct masking and taking the mean in the proposed approach. However, the best-performing design of T-F weights is criterion-dependent in general. Experiments show that the proposed method outperforms popular DNN based DoA estimation methods including widely used subspace methods in noisy and reverberant environments.

Submitted 20 February, 2023; originally announced February 2023.

Comments: 5 pages. Accepted at ICASSP 2023

arXiv:2301.13379 [pdf, other]

Faithful Chain-of-Thought Reasoning

Authors: Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

Abstract: While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (a.k.a. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query → symbolic reasoning chain) and Problem Solving (reasoning chain → answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.
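
The two-stage split described in this abstract can be sketched as follows. This is a toy illustration, not the paper's implementation: `translate` is a hard-coded stand-in for the language model, and the chain format (a list of named arithmetic expressions) and the function `solve` are hypothetical.

```python
def translate(query):
    """Stand-in for the LM translation stage (hypothetical, hard-coded).

    A real system would prompt a language model to produce this chain.
    """
    return [
        ("apples_start", "5"),
        ("apples_bought", "3"),
        ("answer", "apples_start + apples_bought"),
    ]


def solve(chain):
    """Deterministic solver: evaluate each named step in order."""
    env = {}
    for name, expr in chain:
        # Builtins are disabled; each step may only use earlier results.
        env[name] = eval(expr, {"__builtins__": {}}, env)
    return env["answer"]


chain = translate("Ann has 5 apples and buys 3 more. How many does she have?")
print(solve(chain))
```

Since the answer is computed by executing the chain rather than generated separately, the chain is by construction the actual derivation of the answer, which is the sense of faithfulness the abstract describes.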

Submitted 20 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: IJCNLP-AACL 2023 camera-ready version

arXiv:2211.13743 [pdf, other]

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

Authors: Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin Riedmiller

Abstract: The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations. For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert behavior can lead to poor results when given sub-optimal experts. We compare several common approaches for skill transfer on multiple domains including changes in task and system dynamics. We identify how existing methods can fail and introduce an alternative approach to mitigate these problems. Our approach learns to sequence existing temporally-extended skills for exploration but learns the final policy directly from the raw experience. This conceptual split enables rapid adaptation and thus efficient data collection but without constraining the final solution. It significantly outperforms many classical methods across a suite of evaluation tasks and we use a broad set of ablations to highlight the importance of different components of our method.

Submitted 11 January, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

arXiv:2211.05351 [pdf]

Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models

Authors: Dattaraj J. Rao, Shraddha S. Mane, Mukta A. Paliwal

Abstract: Biomedical knowledge graphs (KG) are heterogeneous networks consisting of biological entities as nodes and relations between them as edges. These entities and relations are extracted from millions of research papers and unified in a single resource. The goal of biomedical multi-hop question-answering over knowledge graph (KGQA) is to help biologists and scientists get valuable insights by asking questions in natural language. Relevant answers can be found by first understanding the question and then querying the KG for the right set of nodes and relationships to arrive at an answer. To model the question, language models such as RoBERTa and BioBERT are used to understand context from the natural language question. One of the challenges in KGQA is missing links in the KG. Knowledge graph embeddings (KGE) help to overcome this problem by encoding nodes and edges in a dense and more efficient way. In this paper, we use a publicly available KG called Hetionet which is an integrative network of biomedical knowledge assembled from 29 different databases of genes, compounds, diseases, and more. We have enriched this KG dataset by creating a multi-hop biomedical question-answering dataset in natural language for testing the biomedical multi-hop question-answering system and this dataset will be made available to the research community. The major contribution of this research is an integrated system that combines language models with KG embeddings to give highly relevant answers to free-form questions asked by biologists in an intuitive interface. The biomedical multi-hop question-answering system is tested on this data and results are highly encouraging.

Submitted 10 November, 2022; originally announced November 2022.

ACM Class: I.2.4; I.2.7

arXiv:2210.12448 [pdf, other]

Probing Transfer in Deep Reinforcement Learning without Task Engineering

Authors: Andrei A. Rusu, Sebastian Flennerhag, Dushyant Rao, Razvan Pascanu, Raia Hadsell

Abstract: We evaluate the use of original game curricula supported by the Atari 2600 console as a heterogeneous transfer benchmark for deep reinforcement learning agents. Game designers created curricula using combinations of several discrete modifications to the basic versions of games such as Space Invaders, Breakout and Freeway, making them progressively more challenging for human players. By formally organising these modifications into several factors of variation, we are able to show that Analyses of Variance (ANOVA) are a potent tool for studying the effects of human-relevant domain changes on the learning and transfer performance of a deep reinforcement learning agent. Since no manual task engineering is needed on our part, leveraging the original multi-factorial design avoids the pitfalls of unintentionally biasing the experimental setup. We find that game design factors have a large and statistically significant impact on an agent's ability to learn, and so do their combinatorial interactions. Furthermore, we show that zero-shot transfer from the basic games to their respective variations is possible, but the variance in performance is also largely explained by interactions between factors. As such, we argue that Atari game curricula offer a challenging benchmark for transfer learning in RL, that can help the community better understand the generalisation capabilities of RL agents along dimensions which meaningfully impact human generalisation performance. As a start, we report that value-function finetuning of regularly trained agents achieves positive transfer in a majority of cases, but significant headroom for algorithmic innovation remains. We conclude with the observation that selective transfer from multiple variants could further improve performance.

Submitted 22 October, 2022; originally announced October 2022.

arXiv:2210.07236 [pdf, ps, other]

Improved Bounds on Neural Complexity for Representing Piecewise Linear Functions

Authors: Kuan-Lin Chen, Harinath Garudadri, Bhaskar D. Rao

Abstract: A deep neural network using rectified linear units represents a continuous piecewise linear (CPWL) function and vice versa. Recent results in the literature estimated that the number of neurons needed to exactly represent any CPWL function grows exponentially with the number of pieces or exponentially in terms of the factorial of the number of distinct linear components. Moreover, such growth is amplified linearly with the input dimension. These existing results seem to indicate that the cost of representing a CPWL function is expensive. In this paper, we propose much tighter bounds and establish a polynomial time algorithm to find a network satisfying these bounds for any given CPWL function. We prove that the number of hidden neurons required to exactly represent any CPWL function is at most a quadratic function of the number of pieces. In contrast to all previous results, this upper bound is invariant to the input dimension. Besides the number of pieces, we also study the number of distinct linear components in CPWL functions. When such a number is also given, we prove that the quadratic complexity turns into bilinear, which implies a lower neural complexity because the number of distinct linear components is always not greater than the minimum number of pieces in a CPWL function. When the number of pieces is unknown, we prove that, in terms of the number of distinct linear components, the neural complexities of any CPWL function are at most polynomial growth for low-dimensional inputs and factorial growth for the worst-case scenario, which are significantly better than existing results in the literature.
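
The correspondence between ReLU networks and CPWL functions stated at the start of this abstract can be seen in two elementary identities, sketched below. These are standard facts used only for illustration; they are not the paper's construction and say nothing about its bounds. The helper names are hypothetical.

```python
def relu(x):
    """Rectified linear unit."""
    return max(x, 0.0)


def abs_via_relu(x):
    # |x| = relu(x) + relu(-x): a CPWL function with two pieces,
    # represented exactly by two ReLU units.
    return relu(x) + relu(-x)


def max_via_relu(a, b):
    # max(a, b) = b + relu(a - b): one ReLU unit plus an affine term.
    return b + relu(a - b)


print(abs_via_relu(-3.0), max_via_relu(2.0, 5.0), max_via_relu(5.0, 2.0))
```

Nesting `max_via_relu` gives the maximum of several affine functions, the basic building block behind representing a general CPWL function with ReLU units.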

Submitted 15 January, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: 31 pages. Accepted at NeurIPS 2022

arXiv:2210.03266 [pdf, ps, other]

doi 10.1109/TSP.2023.3254919

Maximum Likelihood-based Gridless DoA Estimation Using Structured Covariance Matrix Recovery and SBL with Grid Refinement

Authors: Rohan R. Pote, Bhaskar D. Rao

Abstract: We consider the parametric data model employed in applications such as line spectral estimation and direction-of-arrival estimation. We focus on the stochastic maximum likelihood estimation (MLE) framework and offer approaches to estimate the parameter of interest in a gridless manner, overcoming the model complexities of the past. This progress is enabled by the modern trend of reparameterization of the objective and exploiting the sparse Bayesian learning (SBL) approach. The latter is shown to be a correlation-aware method, and for the underlying problem it is identified as a grid-based technique for recovering a structured covariance matrix of the measurements. For the case when the structured matrix is expressible as a sampled Toeplitz matrix, such as when measurements are sampled in time or space at regular intervals, additional constraints and reparameterization of the SBL objective lead to the proposed structured matrix recovery technique based on MLE. The proposed optimization problem is non-convex, and we propose a majorization-minimization based iterative procedure to estimate the structured matrix; each iteration solves a semidefinite program. We recover the parameter of interest in a gridless manner by appealing to the Caratheodory-Fejer result on decomposition of PSD Toeplitz matrices. For the general case of irregularly spaced time or spatial samples, we propose an iterative SBL procedure that refines grid points to increase resolution near potential source locations, while maintaining a low per iteration complexity. We provide numerical results to evaluate and compare the performance of the proposed techniques with other gridless techniques, and the CRB. The proposed correlation-aware approach is more robust to environmental/system effects such as low number of snapshots, correlated sources, small separation between source locations and improves source identifiability.

Submitted 6 October, 2022; originally announced October 2022.

Comments: Submitted to the IEEE Transactions on Signal Processing (Previous submission date: 29-Oct-2021)

arXiv:2209.01947 [pdf, other]

MO2: Model-Based Offline Options

Authors: Sasha Salter, Markus Wulfmeier, Dhruva Tirumala, Nicolas Heess, Martin Riedmiller, Raia Hadsell, Dushyant Rao

Abstract: The ability to discover useful behaviours from past experience and transfer them to new tasks is considered a core component of natural embodied intelligence. Inspired by neuroscience, discovering behaviours that switch at bottleneck states has long been sought after for inducing plans of minimum description length across tasks. Prior approaches have either only supported online, on-policy, bottleneck state discovery, limiting sample-efficiency, or discrete state-action domains, restricting applicability. To address this, we introduce Model-Based Offline Options (MO2), an offline hindsight framework supporting sample-efficient bottleneck option discovery over continuous state-action spaces. Once bottleneck options are learnt offline over source domains, they are transferred online to improve exploration and value estimation on the transfer domain. Our experiments show that on complex long-horizon continuous control tasks with sparse, delayed rewards, MO2's properties are essential and lead to performance exceeding recent option learning methods. Additional ablations further demonstrate the impact on option predictability and credit assignment.

Submitted 5 September, 2022; originally announced September 2022.

Comments: Accepted at 1st Conference on Lifelong Learning Agents (CoLLAs) Conference Track, 2022

arXiv:2208.05552 [pdf, other]

Towards Automating Retinoscopy for Refractive Error Diagnosis

Authors: Aditya Aggarwal, Siddhartha Gairola, Uddeshya Upadhyay, Akshay P Vasishta, Diwakar Rao, Aditya Goyal, Kaushik Murali, Nipun Kwatra, Mohit Jain

Abstract: Refractive error is the most common eye disorder and is the key cause behind correctable visual impairment, responsible for nearly 80% of the visual impairment in the US. Refractive error can be diagnosed using multiple methods, including subjective refraction, retinoscopy, and autorefractors. Although subjective refraction is the gold standard, it requires cooperation from the patient and hence is not suitable for infants, young children, and developmentally delayed adults. Retinoscopy is an objective refraction method that does not require any input from the patient. However, retinoscopy requires a lens kit and a trained examiner, which limits its use for mass screening. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos with the patient wearing a custom pair of paper frames. We develop a video processing pipeline that takes retinoscopic videos as input and estimates the net refractive error based on our proposed extension of the retinoscopy mathematical model. Our system alleviates the need for a lens kit and can be performed by an untrained examiner. In a clinical trial with 185 eyes, we achieved a sensitivity of 91.0% and specificity of 74.0% on refractive error diagnosis. Moreover, the mean absolute error of our approach was 0.75 ± 0.67 D on net refractive error estimation compared to subjective refraction measurements. Our results indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.

Submitted 10 August, 2022; originally announced August 2022.

Comments: This paper is accepted for publication in IMWUT 2022

arXiv:2204.05893 [pdf, other]

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

Authors: Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

Abstract: Robots will experience non-stationary environment dynamics throughout their lifetime: the robot dynamics can change due to wear and tear, or its surroundings may change over time. Eventually, the robot should perform well in all of the environment variations it has encountered. At the same time, it should still be able to learn fast in a new environment. We identify two challenges in Reinforcement Learning (RL) under such a lifelong learning setting with off-policy data: first, existing off-policy algorithms struggle with the trade-off between being conservative to maintain good performance in the old environment and learning efficiently in the new environment, despite keeping all the data in the replay buffer. We propose the Offline Distillation Pipeline to break this trade-off by separating the training procedure into an online interaction phase and an offline distillation phase. Second, we find that training with the imbalanced off-policy data from multiple environments across the lifetime creates a significant performance drop. We identify that this performance drop is caused by the combination of the imbalanced quality and size among the datasets, which exacerbates the extrapolation error of the Q-function. During the distillation phase, we apply a simple fix to the issue by keeping the policy closer to the behavior policy that generated the data. In the experiments, we demonstrate these two challenges and the proposed solutions with a simulated bipedal robot walking task across various environment changes.
We show that the Offline Distillation Pipeline achieves better performance across all the encountered environments without affecting data collection. We also provide a comprehensive empirical study to support our hypothesis on the data imbalance issue.

Submitted 18 August, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Published at 1st Conference on Lifelong Learning Agents, 2022

arXiv:2204.02799 [pdf]

doi 10.1002/aelm.202200975

Scandium Nitride as a Gateway III-Nitride Semiconductor for Optoelectronic Artificial Synaptic Devices

Authors: Dheemahi Rao, Bivas Saha

Abstract: Traditional computation based on von Neumann architecture is limited by the time and energy consumption due to data transfer between the storage and the processing units. The von Neumann architecture is also inefficient in solving unstructured, probabilistic, and real-time problems. To address these challenges, a new brain-inspired neuromorphic computational architecture is required. Because they offer high bandwidth and low power consumption without resistance-capacitance (RC) delay, optoelectronic artificial synaptic devices are highly attractive. Yet stable, scalable, and complementary-metal-oxide-semiconductor (CMOS)-compatible synapses have not been demonstrated. In this work, persistence in the photoconductivity of undoped and magnesium-doped scandium nitride (ScN) is equated to the inhibitory and excitatory synaptic plasticity of the biological synapses responsible for memory and learning. Primary functionalities of a biological synapse like short-term memory (STM), long-term memory (LTM), the transition from STM-to-LTM, learning and forgetting, frequency-selective optical filtering, frequency-dependent potentiation and depression, Hebbian learning, and logic gate operations are demonstrated.

Submitted 6 April, 2022; originally announced April 2022.

Comments: 14 pages, 5 figures. It is currently under review

Journal ref: Adv. Electron. Mater. 2022, 2200975

arXiv:2201.10152 [pdf, other]

Unsupervised Image Fusion Method based on Feature Mutual Mapping

Authors: Dongyu Rao, Xiao-Jun Wu, Tianyang Xu, Guoyang Chen

Abstract: Deep learning-based image fusion approaches have obtained wide attention in recent years, achieving promising performance in terms of visual perception. However, the fusion module in the current deep learning-based methods suffers from two limitations, i.e., manually designed fusion function, and input-independent network learning. In this paper, we propose an unsupervised adaptive image fusion method to address the above issues. We propose a feature mutual mapping fusion module and dual-branch multi-scale autoencoder. More specifically, we construct a global map to measure the connections of pixels between the input source images. The found mapping relationship guides the image fusion. Besides, we design a dual-branch multi-scale network through sampling transformation to extract discriminative image features. We further enrich feature representations of different scales through feature aggregation in the decoding process. Finally, we propose a modified loss function to train the network with efficient convergence property. Through sufficient training on infrared and visible image data sets, our method also shows excellent generalized performance in multi-focus and medical image fusion. Our method achieves superior performance in both visual perception and objective evaluation. Experiments prove that the performance of our proposed method on a variety of image fusion tasks surpasses other state-of-the-art methods, proving the effectiveness and versatility of our approach.

Submitted 29 January, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

arXiv:2201.10147 [pdf, other]

TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network

Authors: Dongyu Rao, Xiao-Jun Wu, Tianyang Xu

Abstract: The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating the multi-modal local appearance. However, long-range dependencies are directly neglected in existing CNN fusion approaches, impeding balancing the entire image-level perception for complex scenario fusion. In this paper, therefore, we propose an infrared and visible image fusion algorithm based on a lightweight transformer module and adversarial learning. Inspired by the global interaction power, we use the transformer technique to learn the effective global fusion relations. In particular, shallow features extracted by CNN are interacted in the proposed transformer fusion module to refine the fusion relationship within the spatial scope and across channels simultaneously. Besides, adversarial learning is designed in the training process to improve the output discrimination via imposing competitive consistency from the inputs, reflecting the specific characteristics in infrared and visible images. The experimental performance demonstrates the effectiveness of the proposed modules, with superior improvement against the state-of-the-art, generalising a novel paradigm via transformer and adversarial learning in the fusion task.

Submitted 3 February, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

arXiv:2112.05062 [pdf, other]

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

Authors: Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell

Abstract: For robots operating in the real world, it is desirable to learn reusable behaviours that can effectively be transferred and adapted to numerous tasks and scenarios. We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model. In contrast to existing work, our method exploits a three-level hierarchy of both discrete and continuous latent variables, to capture a set of high-level behaviours while allowing for variance in how they are executed. We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model. The resulting skills can be transferred and fine-tuned on new tasks, unseen objects, and from state to vision-based policies, yielding better sample efficiency and asymptotic performance compared to existing skill- and imitation-based methods. We further analyse how and when the skills are most beneficial: they encourage directed exploration to cover large regions of the state space relevant to the task, making them most effective in challenging sparse-reward settings.

Submitted 14 March, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

arXiv:2111.08952 [pdf, other]

doi 10.1109/IEEECONF44664.2019.9048906

A Generalized Proportionate-Type Normalized Subband Adaptive Filter

Authors: Kuan-Lin Chen, Ching-Hua Lee, Bhaskar D. Rao, Harinath Garudadri

Abstract: We show that a new design criterion, i.e., the least squares on subband errors regularized by a weighted norm, can be used to generalize the proportionate-type normalized subband adaptive filtering (PtNSAF) framework. The new criterion directly penalizes subband errors and includes a sparsity penalty term which is minimized using the damped regularized Newton's method. The impact of the proposed generalized PtNSAF (GPtNSAF) is studied for the system identification problem via computer simulations. Specifically, we study the effects of using different numbers of subbands and various sparsity penalty terms for quasi-sparse, sparse, and dispersive systems. The results show that the benefit of increasing the number of subbands is larger than promoting sparsity of the estimated filter coefficients when the target system is quasi-sparse or dispersive. On the other hand, for sparse target systems, promoting sparsity becomes more important. More importantly, the two aspects provide complementary and additive benefits to the GPtNSAF for speeding up convergence.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 5 pages. Presented at Asilomar Conference on Signals, Systems, and Computers (ACSSC) 2019

arXiv:2111.05496 [pdf, other]

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Authors: Kuan-Lin Chen, Ching-Hua Lee, Harinath Garudadri, Bhaskar D. Rao

Abstract: Models recently used in the literature proving residual networks (ResNets) are better than linear predictors are actually different from standard ResNets that have been widely used in computer vision. In addition to the assumptions such as scalar-valued output or single residual block, these models have no nonlinearities at the final residual representation that feeds into the final affine layer. To codify such a difference in nonlinearities and reveal a linear estimation property, we define ResNEsts, i.e., Residual Nonlinear Estimators, by simply dropping nonlinearities at the last residual representation from standard ResNets. We show that wide ResNEsts with bottleneck blocks can always guarantee a very desirable training property that standard ResNets aim to achieve, i.e., adding more blocks does not decrease performance given the same set of basis elements. To prove that, we first recognize ResNEsts are basis function models that are limited by a coupling problem in basis learning and linear prediction. Then, to decouple prediction weights from basis learning, we construct a special architecture termed augmented ResNEst (A-ResNEst) that always guarantees no worse performance with the addition of a block. As a result, such an A-ResNEst establishes empirical risk lower bounds for a ResNEst using corresponding bases. Our results demonstrate ResNEsts indeed have a problem of diminishing feature reuse; however, it can be avoided by sufficiently expanding or widening the input space, leading to the above-mentioned desirable property.
Inspired by the DenseNets that have been shown to outperform ResNets, we also propose a corresponding new model called Densely connected Nonlinear Estimator (DenseNEst). We show that any DenseNEst can be represented as a wide ResNEst with bottleneck blocks. Unlike ResNEsts, DenseNEsts exhibit the desirable property without any special architectural re-design.

Submitted 15 January, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: 24 pages. Accepted by NeurIPS 2021. Remark 1 clarified and typos corrected

arXiv:2108.06113 [pdf, other]

doi 10.1117/1.JEI.30.5.053013

UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Authors: D. Y. Rao, X. J. Wu, H. Li, J. Kittler, T. Y. Xu

Abstract: In this paper, we propose a photorealistic style transfer network to emphasize the natural effect of photorealistic image stylization. In general, distortion of the image content and lack of details are two typical issues in the style transfer field. To this end, we design a novel framework employing the U-Net structure to maintain the rich spatial clues, with a multi-layer feature aggregation (MFA) method to simultaneously provide the details obtained by the shallow layers in the stylization processing. In particular, an encoder based on the dense block and a decoder forming a symmetrical U-Net structure are jointly stacked to realize effective feature extraction and image reconstruction. Besides, a transfer module based on MFA and "adaptive instance normalization" (AdaIN) is inserted in the skip connection positions to achieve the stylization. Accordingly, the stylized image possesses the texture of a real photo and preserves rich content details without introducing any mask or post-processing steps. The experimental results on public datasets demonstrate that our method achieves a more faithful structural similarity with a lower style loss, reflecting the effectiveness and merit of our approach.

Submitted 13 August, 2021; originally announced August 2021.

arXiv:2106.14647 [pdf]

Zero-shot learning approach to adaptive Cybersecurity using Explainable AI

Authors: Dattaraj Rao, Shraddha Mane

Abstract: Cybersecurity is a domain where there is constant change in patterns of attack, and we need ways to make our Cybersecurity systems more adaptive to handle new attacks and categorize for appropriate action. We present a novel approach to handle the alarm flooding problem faced by Cybersecurity systems like security information and event management (SIEM) and intrusion detection (IDS). We apply a zero-shot learning method to machine learning (ML) by leveraging explanations for predictions of anomalies generated by an ML model. This approach has huge potential to auto-detect alarm labels generated in SIEM and associate them with specific attack types. In this approach, without any prior knowledge of the attack, we try to identify it, decipher the features that contribute to classification, and try to bucketize the attack in a specific category using explainable AI. Explanations give us measurable factors as to what features influence the prediction of a cyber-attack and to what degree. These explanations, generated based on game theory, are used to allocate credit to specific features based on their influence on a specific prediction. Using this allocation of credit, we propose a novel zero-shot approach to categorize novel attacks into specific new classes based on feature influence. The resulting system learns to separate attack traffic from normal flow and auto-generates a label for attacks based on features that contribute to the attack.
These auto-generated labels can be presented to a SIEM analyst and are intuitive enough to figure out the nature of the attack. We apply this approach to a network flow dataset and demonstrate results for specific attack types like ip sweep, denial of service, remote to local, etc. Paper was presented at the first Conference on Deployable AI at IIT-Madras in June 2021.

Submitted 21 June, 2021; originally announced June 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.07110
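
The "Zero-shot learning approach to adaptive Cybersecurity" abstract above describes game-theoretic explanations that allocate credit to features. A minimal sketch of one such allocation follows, assuming Shapley values are the credit rule (the abstract does not name the rule). The function name `shapley_values` and the toy `value` function are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def shapley_values(n_features, value):
    """Exact Shapley values: average each feature's marginal
    contribution over every ordering of the features.

    `value` maps a frozenset of feature indices to a model score.
    Enumeration is factorial in n_features, so this only suits toy cases.
    """
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        prev = value(frozenset(present))
        for f in order:
            present.add(f)
            cur = value(frozenset(present))
            totals[f] += cur - prev  # marginal contribution of f
            prev = cur
    return [t / len(orderings) for t in totals]


def toy_score(subset):
    """Hypothetical anomaly score: features 0 and 1 act alone,
    features 0 and 2 also interact."""
    score = 0.0
    if 0 in subset:
        score += 3.0
    if 1 in subset:
        score += 1.0
    if 0 in subset and 2 in subset:
        score += 2.0
    return score


print(shapley_values(3, toy_score))
```

The interaction term is split evenly between features 0 and 2, and the credits sum to the score of the full feature set minus the score of the empty set.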

arXiv:2106.12772 [pdf, other]

Task-agnostic Continual Learning with Hybrid Probabilistic Models

Authors: Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

Abstract: Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting, all leveraging the invertibility and exact likelihood which are uniquely enabled by the normalizing flow model. We use the generative capabilities of the flow to avoid catastrophic forgetting through generative replay and a novel functional regularization technique. For task identification, we use state-of-the-art anomaly detection techniques based on measuring the typicality of the model's statistics. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.

Submitted 24 June, 2021; originally announced June 2021.

arXiv:2103.07110 [pdf]

Explaining Network Intrusion Detection System Using Explainable AI Framework

Authors: Shraddha Mane, Dattaraj Rao

Abstract: Cybersecurity is a domain where the data distribution is constantly changing with attackers exploring newer patterns to attack cyber infrastructure. An intrusion detection system is one of the important layers in cyber safety in today's world. Machine learning based network intrusion detection systems started showing effective results in recent years. With deep learning models, detection rates of network intrusion detection systems are improved. The more accurate the model, the greater its complexity and hence the lower its interpretability. Deep neural networks are complex and hard to interpret, which makes them difficult to use in production as the reasons behind their decisions are unknown. In this paper, we have used a deep neural network for network intrusion detection and also proposed an explainable AI framework to add transparency at every stage of the machine learning pipeline. This is done by leveraging Explainable AI algorithms which focus on making ML models less of black boxes by providing explanations as to why a prediction is made. Explanations give us measurable factors as to what features influence the prediction of a cyberattack and to what degree. These explanations are generated from SHAP, LIME, Contrastive Explanations Method, ProtoDash and Boolean Decision Rules via Column Generation. We apply these approaches to the NSL KDD dataset for intrusion detection systems and demonstrate results.

Submitted 12 March, 2021; originally announced March 2021.

arXiv:2011.02591 [pdf, ps, other]

Modified Vector Quantization for Small-Cell Access Point Placement with Inter-Cell Interference

Authors: Govind R. Gopal, Elina Nayebi, Gabriel Porto Villardi, Bhaskar D. Rao

Abstract: In this paper, we explore the small-cell uplink access point (AP) placement problem in the context of throughput-optimality and provide solutions while taking into consideration inter-cell interference. First, we briefly review the vector quantization (VQ) approach and related single user throughput-optimal formulations for AP placement. Then, we investigate the small-cell case with multiple users and expose the limitations of mean squared error based VQ for solving this problem. While the Lloyd algorithm from the VQ approach is found not to strictly solve the small-cell case, based on the tractability and quality of resulting AP placement, we deem it suitable as a simple and appropriate framework to solve more complicated problems. Accordingly, to minimize ICI and consequently enhance achievable throughput, we design two Lloyd-type algorithms, namely, the Interference Lloyd algorithm and the Inter-AP Lloyd algorithm, both of which incorporate ICI in their distortion functions. Simulation results show that both of the proposed algorithms provide superior 95%-likely rate over the traditional Lloyd algorithm and the Inter-AP Lloyd algorithm yields a significant increase of up to 36.34% in achievable rate over the Lloyd algorithm.

Submitted 17 June, 2021; v1 submitted 4 November, 2020; originally announced November 2020.
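
For orientation, the traditional Lloyd algorithm that the "Modified Vector Quantization" abstract above uses as its baseline can be sketched as follows. This is only the standard mean-squared-error version; the distortion functions of the Interference Lloyd and Inter-AP Lloyd algorithms are not given in the abstract and are not reproduced here. The function name `lloyd` and the toy user positions are hypothetical.

```python
def lloyd(users, aps, iters=50):
    """Standard Lloyd iteration with squared-Euclidean distortion.

    users: list of (x, y) user positions.
    aps:   list of (x, y) initial access-point positions.
    Alternates nearest-AP assignment and centroid update until the
    AP positions stop changing or `iters` is reached.
    """
    aps = [tuple(a) for a in aps]
    for _ in range(iters):
        # Assignment step: each user joins the cell of its nearest AP.
        cells = [[] for _ in aps]
        for u in users:
            dists = [(u[0] - a[0]) ** 2 + (u[1] - a[1]) ** 2 for a in aps]
            cells[dists.index(min(dists))].append(u)
        # Update step: move each AP to the centroid of its cell.
        new_aps = []
        for a, cell in zip(aps, cells):
            if cell:
                cx = sum(p[0] for p in cell) / len(cell)
                cy = sum(p[1] for p in cell) / len(cell)
                new_aps.append((cx, cy))
            else:
                new_aps.append(a)  # keep an AP whose cell is empty
        if new_aps == aps:
            break
        aps = new_aps
    return aps


# Toy layout: two well-separated pairs of users.
users = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(lloyd(users, [(1, 1), (9, 1)]))
```

A Lloyd-type variant in the spirit of the abstract would replace the squared-distance distortion with one that also accounts for inter-cell interference.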

arXiv:2009.10073 [pdf]

Contextual Bandits for adapting to changing User preferences over time

Authors: Dattaraj Rao

Abstract: Contextual bandits provide an effective way to model the dynamic data problem in ML by leveraging online (incremental) learning to continuously adjust the predictions based on changing environment. We explore details on contextual bandits, an extension to the traditional reinforcement learning (RL) problem and build a novel algorithm to solve this problem using an array of action-based learners. We apply this approach to model an article recommendation system using an array of stochastic gradient descent (SGD) learners to make predictions on rewards based on actions taken. We then extend the approach to a publicly available MovieLens dataset and explore the findings. First, we make available a simplified simulated dataset showing varying user preferences over time and how this can be evaluated with static and dynamic learning algorithms. This dataset made available as part of this research is intentionally simulated with limited number of features and can be used to evaluate different problem-solving strategies. We will build a classifier using static dataset and evaluate its performance on this dataset. We show limitations of static learner due to fixed context at a point of time and how changing that context brings down the accuracy. Next we develop a novel algorithm for solving the contextual bandit problem. Similar to the linear bandits, this algorithm maps the reward as a function of context vector but uses an array of learners to capture variation between actions/arms. We develop a bandit algorithm using an array of stochastic gradient descent (SGD) learners, with separate learner per arm. Finally, we will apply this contextual bandit algorithm to predicting movie ratings over time by different users from the standard Movie Lens dataset and demonstrate the results.
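Illustrative sketch (not from the paper): the abstract above describes one SGD learner per arm, each mapping the context vector to a predicted reward. The Python below is a minimal reading of that idea under several assumptions of ours: a linear model without bias term, squared-error SGD updates, epsilon-greedy exploration, and a toy two-context environment whose preferred arm flips halfway through. The names `SGDLearner`, `PerArmBandit` and `simulate` are hypothetical.

```python
import random


class SGDLearner:
    """Linear reward model updated with one squared-error SGD step per observation."""

    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, reward):
        err = self.predict(x) - reward
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]


class PerArmBandit:
    """One learner per arm; epsilon-greedy arm selection (an assumption)."""

    def __init__(self, n_arms, n_features, epsilon=0.1, seed=0):
        self.learners = [SGDLearner(n_features) for _ in range(n_arms)]
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self, x):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.learners))
        preds = [learner.predict(x) for learner in self.learners]
        return max(range(len(preds)), key=preds.__getitem__)

    def learn(self, x, arm, reward):
        # Only the learner of the arm that was played sees the reward.
        self.learners[arm].update(x, reward)


def simulate(steps=4000, seed=0):
    """Toy environment: two one-hot contexts; the rewarded arm flips at steps // 2."""
    env_rng = random.Random(seed)
    bandit = PerArmBandit(n_arms=2, n_features=2, seed=seed + 1)
    hits = []
    for t in range(steps):
        x = [1.0, 0.0] if env_rng.random() < 0.5 else [0.0, 1.0]
        best = 0 if x[0] == 1.0 else 1
        if t >= steps // 2:  # user preference changes over time
            best = 1 - best
        arm = bandit.choose(x)
        reward = 1.0 if arm == best else 0.0
        bandit.learn(x, arm, reward)
        hits.append(reward)
    return hits


if __name__ == "__main__":
    hits = simulate()
    print("hit rate before the flip:", sum(hits[1000:2000]) / 1000)
    print("hit rate after the flip: ", sum(hits[3000:]) / 1000)
```

Because every played arm keeps receiving SGD updates, the learners can track the flipped preference, which a model fitted once on the first half of the data could not do.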

Submitted 23 September, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

arXiv:2007.15588 [pdf, other]

Data-efficient Hindsight Off-policy Option Learning

Authors: Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala, Noah Siegel, Nicolas Heess, Martin Riedmiller

Abstract: We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm. Given any trajectory, HO2 infers likely option choices and backpropagates through the dynamic programming inference procedure to robustly train all policy components off-policy and end-to-end. The approach outperforms existing option learning methods on common benchmarks. To better understand the option framework and disentangle benefits from both temporal and action abstraction, we evaluate ablations with flat policies and mixture policies with comparable optimization. The results highlight the importance of both types of abstraction as well as off-policy training and trust-region constraints, particularly in challenging, simulated 3D robot manipulation tasks from raw pixel inputs. Finally, we intuitively adapt the inference step to investigate the effect of increased temporal abstraction on training with pre-trained options and from scratch.

Submitted 15 June, 2021; v1 submitted 30 July, 2020; originally announced July 2020.

Comments: Published at ICML2021

arXiv:1911.08363 [pdf, other]

Attention-Privileged Reinforcement Learning

Authors: Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, Ingmar Posner

Abstract: Image-based Reinforcement Learning is known to suffer from poor sample efficiency and generalisation to unseen visuals such as distractors (task-independent aspects of the observation space). Visual domain randomisation encourages transfer by training over visual factors of variation that may be encountered in the target domain. This increases learning complexity, can negatively impact learning rate and performance, and requires knowledge of potential variations during deployment. In this paper, we introduce Attention-Privileged Reinforcement Learning (APRiL) which uses a self-supervised attention mechanism to significantly alleviate these drawbacks: by focusing on task-relevant aspects of the observations, attention provides robustness to distractors as well as significantly increased learning efficiency. APRiL trains two attention-augmented actor-critic agents: one purely based on image observations, available across training and transfer domains; and one with access to privileged information (such as environment states) available only during training. Experience is shared between both agents and their attention mechanisms are aligned. The image-based policy can then be deployed without access to privileged information. We experimentally demonstrate accelerated and more robust learning on a diverse set of domains, leading to improved final performance for environments both within and outside the training distribution.

Submitted 11 January, 2021; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: Published at Conference on Robot Learning (CoRL) 2020

arXiv:1910.14481 [pdf, other]

Continual Unsupervised Representation Learning

Authors: Dushyant Rao, Francesco Visin, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu, Raia Hadsell

Abstract: Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially. Prior art in the field has largely considered supervised or reinforcement learning tasks, and often assumes full knowledge of task labels and boundaries. In this work, we propose an approach (CURL) to tackle a more general problem that we will refer to as unsupervised continual learning. The focus is on learning representations without any knowledge about task identity, and we explore scenarios when there are abrupt changes between tasks, smooth transitions from one task to another, or even when the data is shuffled. The proposed approach performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised learning setting with MNIST and Omniglot, where the lack of labels ensures no information is leaked about the task. Further, we demonstrate strong performance compared to prior art in an i.i.d. setting, or when adapting the technique to supervised tasks such as incremental class learning.

Submitted 31 October, 2019; originally announced October 2019.

Comments: NeurIPS 2019

arXiv:1910.14409 [pdf, ps, other]

Quantifying (Hyper) Parameter Leakage in Machine Learning

Authors: Vasisht Duddu, D. Vijay Rao

Abstract: Machine Learning models, extensively used for various multimedia applications, are offered to users as a blackbox service on the Cloud on a pay-per-query basis. Such blackbox models are commercially valuable to adversaries, making them vulnerable to extraction attacks to reverse engineer the proprietary model thereby violating the model privacy and Intellectual Property. Here, the adversary first extracts the model architecture or hyperparameters through side channel leakage, followed by stealing the functionality of the target model by training the reconstructed architecture on a synthetic dataset. While the attacks proposed in literature are empirical, there is a need for a theoretical framework to measure the information leaked under such extraction attacks. To this extent, in this work, we propose a novel probabilistic framework, Airavata, to estimate the information leakage in such model extraction attacks. This framework captures the fact that extracting the exact target model is difficult due to experimental uncertainty while inferring model hyperparameters and stochastic nature of training to steal the target model functionality. Specifically, we use Bayesian Networks to capture uncertainty in estimating the target model under various extraction attacks based on the subjective notion of probability. We validate the proposed framework under different adversary assumptions commonly adopted in literature to reason about the attack efficacy. This provides a practical tool to infer actionable details about extracting blackbox models and help identify the best attack combination which maximises the knowledge extracted (or information leaked) from the target model.

Submitted 1 February, 2020; v1 submitted 31 October, 2019; originally announced October 2019.

arXiv:1910.13875 [pdf, ps, other]

doi 10.3233/JIFS-179677

Fault Tolerance of Neural Networks in Adversarial Settings

Authors: Vasisht Duddu, N. Rajesh Pillai, D. Vijay Rao, Valentina E. Balas

Abstract: Artificial Intelligence systems require a thorough assessment of different pillars of trust, namely, fairness, interpretability, data and model privacy, reliability (safety) and robustness against adversarial attacks. While these research problems have been extensively studied in isolation, an understanding of the trade-off between different pillars of trust is lacking. To this extent, the trade-off between fault tolerance, privacy and adversarial robustness is evaluated for the specific case of Deep Neural Networks, by considering two adversarial settings under a security and a privacy threat model. Specifically, this work studies the impact of the fault tolerance of the Neural Network on training the model by adding noise to the input (Adversarial Robustness) and noise to the gradients (Differential Privacy). While training models with noise to inputs, gradients or weights enhances fault tolerance, it is observed that adversarial robustness and fault tolerance are at odds with each other. On the other hand, ($ε,δ$)-Differentially Private models enhance the fault tolerance, which, measured using generalisation error, theoretically has an upper bound of $e^ε - 1 + δ$. This novel study of the trade-off between different elements of trust is pivotal for training a model which satisfies the requirements for different pillars of trust simultaneously.

Submitted 7 March, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

Journal ref: Journal of Intelligent and Fuzzy Systems (JIFS) 2020

arXiv:1910.13520 [pdf]

Digital Twin approach to Clinical DSS with Explainable AI

Authors: Dattaraj Jagdish Rao, Shraddha Mane

Abstract: We propose a digital twin approach to improve healthcare decision support systems with a combination of domain knowledge and data. Domain knowledge helps build decision thresholds that doctors can use to determine a risk or recommend a treatment or test based on the specific patient condition. However, these assessments tend to be highly subjective and differ from doctor to doctor and from patient to patient. We propose a system where we collate this subjective risk by compiling data from different doctors treating different patients and build a machine learning model that learns from this knowledge. Then using state-of-the-art explainability concepts we derive explanations from this model. These explanations give us a summary of different doctor domain knowledge applied in different cases to give a more generic perspective. Also these explanations are specific to a particular patient and are customized for their condition. This is a form of a digital twin for the patient that can now be used to enhance decision boundaries for earlier defined decision tables that help in diagnosis. We will show an example of running this analysis for a liver disease risk diagnosis.

Submitted 22 October, 2019; originally announced October 2019.

arXiv:1910.11241 [pdf]

Healthcare NER Models Using Language Model Pretraining

Authors: Amogh Kamat Tarcar, Aashis Tiwari, Vineet Naique Dhaimodker, Penjo Rebelo, Rahul Desai, Dattaraj Rao

Abstract: In this paper, we present our approach to extracting structured information from unstructured Electronic Health Records (EHR) [2] which can be used to, for example, study adverse drug reactions in patients due to chemicals in their products. Our solution uses a combination of Natural Language Processing (NLP) techniques and a web-based annotation tool to optimize the performance of a custom Named Entity Recognition (NER) [1] model trained on a limited amount of EHR training data. This work was presented at the first Health Search and Data Mining Workshop (HSDM 2020) [26]. We showcase a combination of tools and techniques leveraging the recent advancements in NLP aimed at targeting domain shifts by applying transfer learning and language model pre-training techniques [3]. We present a comparison of our technique to the current popular approaches and show the effective increase in performance of the NER model and the reduction in time to annotate data. A key observation of the results presented is that the F1 score of the model (0.734) trained with our approach with just 50% of available training data outperforms the F1 score of the blank spaCy model without language model component (0.704) trained with 100% of the available training data. We also demonstrate an annotation tool to minimize domain expert time and the manual effort required to generate such a training dataset. Further, we plan to release the annotated dataset as well as the pre-trained model to the community to further research in medical health records.

Submitted 29 January, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

Comments: This work was presented at the first Health Search and Data Mining Workshop (HSDM 2020) as part of WSDM 2020 conference

ACM Class: H.3.3

arXiv:1909.07116 [pdf]

Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem

Authors: Dattaraj Rao

Abstract: Traditional Reinforcement Learning (RL) problems depend on an exhaustive simulation environment that models real-world physics of the problem and trains the RL agent by observing this environment. In this paper, we present a novel approach to creating an environment by modeling the reward function based on empirical rules extracted from human domain knowledge of the system under study. Using this empirical rewards function, we will build an environment and train the agent. We will first create an environment that emulates the effect of setting cabin temperature through thermostat. This is typically done in RL problems by creating an exhaustive model of the system with detailed thermodynamic study. Instead, we propose an empirical approach to model the reward function based on human domain knowledge. We will document some rules of thumb that we usually exercise as humans while setting thermostat temperature and try and model these into our reward function. This modeling of empirical human domain rules into a reward function for RL is the unique aspect of this paper. This is a continuous action space problem and using deep deterministic policy gradient (DDPG) method, we will solve for maximizing the reward function. We will create a policy network that predicts optimal temperature setpoint given external temperature and humidity.

Submitted 16 September, 2019; originally announced September 2019.

Comments: 4 pages, 3 figures, code shared on Google colab

arXiv:1907.03103 [pdf, other]

Towards Enhancing Fault Tolerance in Neural Networks

Authors: Vasisht Duddu, D. Vijay Rao, Valentina E. Balas

Abstract: Deep Learning Accelerators are prone to faults which manifest in the form of errors in Neural Networks. Fault Tolerance in Neural Networks is crucial in real-time safety critical applications requiring computation for long durations. Neural Networks with high regularisation exhibit superior fault tolerance, however, at the cost of classification accuracy. In the view of difference in functionality, a Neural Network is modelled as two separate networks, i.e., the Feature Extractor with unsupervised learning objective and the Classifier with a supervised learning objective. Traditional approaches of training the entire network using a single supervised learning objective are insufficient to achieve the objectives of the individual components optimally. In this work, a novel multi-criteria objective function, combining unsupervised training of the Feature Extractor followed by supervised tuning with Classifier Network is proposed. The unsupervised training solves two games simultaneously in the presence of adversary neural networks with conflicting objectives to the Feature Extractor. The first game minimises the loss in reconstructing the input image for indistinguishability given the features from the Extractor, in the presence of a generative decoder. The second game solves a minimax constraint optimisation for distributional smoothening of feature space to match a prior distribution, in the presence of a Discriminator network. The resultant strongly regularised Feature Extractor is combined with the Classifier Network for supervised fine-tuning. The proposed Adversarial Fault Tolerant Neural Network Training is scalable to large networks and is independent of the architecture. The evaluation on benchmarking datasets: FashionMNIST and CIFAR10, indicates that the resultant networks have high accuracy with superior tolerance to stuck at "0" faults compared to widely used regularisers.

Submitted 29 May, 2021; v1 submitted 6 July, 2019; originally announced July 2019.

Comments: MobiQuitous 2020

arXiv:1812.11720 [pdf, ps, other]

Stealing Neural Networks via Timing Side Channels

Authors: Vasisht Duddu, Debasis Samanta, D Vijay Rao, Valentina E. Balas

Abstract: Deep learning is gaining importance in many applications. However, Neural Networks face several security and privacy threats. This is particularly significant in the scenario where Cloud infrastructures deploy a service with Neural Network model at the back end. Here, an adversary can extract the Neural Network parameters, infer the regularization hyperparameter, identify if a data point was part of the training data, and generate effective transferable adversarial examples to evade classifiers. This paper shows how a Neural Network model is susceptible to timing side channel attack. In this paper, a black box Neural Network extraction attack is proposed by exploiting the timing side channels to infer the depth of the network. Although constructing an equivalent architecture is a complex search problem, it is shown how Reinforcement Learning with knowledge distillation can effectively reduce the search space to infer a target model. The proposed approach has been tested with VGG architectures on CIFAR10 data set. It is observed that it is possible to reconstruct substitute models with test accuracy close to the target models and the proposed approach is scalable and independent of type of Neural Network architectures.

Submitted 8 July, 2019; v1 submitted 31 December, 2018; originally announced December 2018.

arXiv:1807.05960 [pdf, other]

Meta-Learning with Latent Embedding Optimization

Authors: Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell

Abstract: Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space.

Submitted 26 March, 2019; v1 submitted 16 July, 2018; originally announced July 2018.

arXiv:1805.04074 [pdf, other]

Hybrid CMOS-CNFET based NP dynamic Carry Look Ahead Adder

Authors: A. Nagalakshmi, Ch. Sirisha, Dr. D. N. Madhusudana Rao

Abstract: Advanced electronic device technologies require a faster operation and smaller average power consumption, which are the most important parameters in very large scale integrated circuit design. The conventional Complementary Metal-Oxide Semiconductor (CMOS) technology is limited by the threshold voltage and subthreshold leakage problems in scaling of devices. This leads to failure in adapting it to sub-micron and nanotechnologies. The carbon nanotube (CNT) technology overcomes the threshold voltage and subthreshold leakage problems despite reduction in size. The CNT based technology develops the most promising devices among emerging technologies because it has most of the desired features. Carbon Nanotube Field Effect Transistors (CNFETs) are the novel devices that are expected to sustain the transistor scalability while increasing its performance. Recently, there have been tremendous advances in CNT technology for nanoelectronics applications. CNFETs avoid most of the fundamental limitations and offer several advantages compared to silicon-based technology. Though CNT evolves as a better option to overcome some of the bulk CMOS problems, the CNT itself is still beset with setbacks. The fabrication of carbon nanotube at very large digital circuits on a single substrate is difficult to achieve. Therefore, a hybrid NP dynamic Carry Look Ahead Adder (CLA) is designed using p-CNFET and n-MOS transistors. Here, the performance of CLA is evaluated in 8-bit, 16-bit, 32-bit and 64-bit stages with the following four different implementations: silicon MOSFET (Si-MOSFET) domino logic, Si-MOSFET NP dynamic CMOS, carbon nanotube MOSFET (CN-MOSFET) domino logic, and CN-MOSFET NP dynamic CMOS. Finally, a Hybrid CMOS-CNFET based 64-bit NP dynamic CLA is evaluated based on HSPICE simulation in 32nm technology, which effectively suppresses power dissipation without an increase in propagation delay.

Submitted 10 May, 2018; originally announced May 2018.

Comments: 6 pages, 1 figure, 6 tables, Based on Master's thesis project (2014-16) carried by A. Nagalakshmi

arXiv:1804.03740 [pdf, other]

Multimodal Sparse Bayesian Dictionary Learning

Authors: Igor Fedorov, Bhaskar D. Rao

Abstract: This paper addresses the problem of learning dictionaries for multimodal datasets, i.e. datasets collected from multiple data sources. We present an algorithm called multimodal sparse Bayesian dictionary learning (MSBDL). MSBDL leverages information from all available data modalities through a joint sparsity constraint. The underlying framework offers a considerable amount of flexibility to practitioners and addresses many of the shortcomings of existing multimodal dictionary learning approaches. In particular, the procedure includes the automatic tuning of hyperparameters and is unique in that it allows the dictionaries for each data modality to have different cardinality, a significant feature in cases when the dimensionality of data differs across modalities. MSBDL is scalable and can be used in supervised learning settings. Theoretical results relating to the convergence of MSBDL are presented and the numerical results provide evidence of the superior performance of MSBDL on synthetic and real datasets compared to existing methods.

Submitted 28 May, 2019; v1 submitted 10 April, 2018; originally announced April 2018.

arXiv:1804.00492 [pdf]

Regional Priority Based Anomaly Detection using Autoencoders

Authors: Shruti Mittal, Dattaraj Rao

Abstract: In recent times, autoencoders, besides being used for compression, have proven quite useful for regenerating similar images and for image denoising. They have also been explored for anomaly detection in a few cases. However, due to the location invariance property of convolutional neural networks, autoencoders tend to learn from or search for learned features in the complete image. This creates issues when the items in the image are not all equally important and their location matters. For such cases, a semi supervised solution - regional priority based autoencoder (RPAE) has been proposed. In this model, similar to object detection models, a region proposal network identifies the relevant areas in the images as belonging to one of the predefined categories and then those bounding boxes are fed into the appropriate decoder based on the category they belong to. Finally, the error scores from all the decoders are combined based on their importance to provide the total reconstruction error.
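Illustrative sketch (not from the paper): the abstract says the per-decoder error scores are "combined based on their importance" but does not give the combination rule. The Python below assumes one plausible choice, an importance-weighted average; the function name, the category names and the weights are all hypothetical.

```python
def total_reconstruction_error(region_errors, importance):
    """Importance-weighted average of per-category decoder errors.

    region_errors: category -> reconstruction error of that category's decoder
    importance:    category -> non-negative priority weight (assumed, hand-set)
    """
    total_weight = sum(importance[c] for c in region_errors)
    if total_weight <= 0:
        raise ValueError("at least one region needs a positive importance")
    weighted = sum(importance[c] * err for c, err in region_errors.items())
    return weighted / total_weight


if __name__ == "__main__":
    errors = {"track": 0.2, "vegetation": 0.6}   # hypothetical decoder errors
    weights = {"track": 3.0, "vegetation": 1.0}  # hypothetical priorities
    print(total_reconstruction_error(errors, weights))
```

With these example numbers the high-priority region dominates the score: (3.0 * 0.2 + 1.0 * 0.6) / 4.0 = 0.3.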

Submitted 2 April, 2018; originally announced April 2018.

Comments: 5 pages, 5 figures

Report number: 2018TDS0001

arXiv:1803.11377 [pdf, other]

Fuzzy Graph Modelling of Anonymous Networks

Authors: Vasisht Duddu, Debasis Samanta, D Vijay Rao

Abstract: Anonymous networks have enabled secure and anonymous communication between the users and service providers while maintaining their anonymity and privacy. The hidden services in the networks are dynamic and continuously change their domains and service features to maintain anonymity and prevent fingerprinting. This makes modelling of such networks a challenging task. Further, modelling with crisp graphs is not suitable as they cannot capture the dynamic nature of the anonymous networks. In this work, we model the anonymous networks using fuzzy graphs and provide a methodology to simulate and analyze an anonymous network. We consider the case studies of two popular anonymous communication networks: Tor and Freenet, and show how the two networks can be analyzed using our proposed fuzzy representation.

Submitted 17 September, 2018; v1 submitted 30 March, 2018; originally announced March 2018.

arXiv:1802.01616 [pdf, ps, other]

Re-Weighted Learning for Sparsifying Deep Neural Networks

Authors: Igor Fedorov, Bhaskar D. Rao

Abstract: This paper addresses the topic of sparsifying deep neural networks (DNNs). While DNNs are powerful models that achieve state-of-the-art performance on a large number of tasks, the large number of model parameters poses serious storage and computational challenges. To combat these difficulties, a growing line of work focuses on pruning network weights without sacrificing performance. We propose a general affine scaling transformation (AST) algorithm to sparsify DNNs. Our approach follows in the footsteps of popular sparse recovery techniques, which have yet to be explored in the context of DNNs. We describe a principled framework for transforming densely connected DNNs into sparsely connected ones without sacrificing network performance. Unlike existing methods, our approach is able to learn sparse connections at each layer simultaneously, and achieves comparable pruning results on the architecture tested.

Submitted 5 February, 2018; originally announced February 2018.

arXiv:1802.01286 [pdf]

Data Augmentation of Railway Images for Track Inspection

Authors: S Ritika, Dattaraj Rao

Abstract: Regular maintenance of all the assets is pivotal for proper functioning of railway. Manual maintenance can be very cumbersome and leave room for errors. Track anomalies like vegetation overgrowth, sun kinks affect the track construct and result in unequal load transfer, imbalanced lateral forces on tracks which causes further deterioration of tracks and can ultimately result in derailment of locomotive. Hence there is a need to continuously monitor rail track health. Track anomalies are rare with the skew as high as one anomaly in millions of good images. We propose a method to build training data that will make our algorithms more robust and help us detect real world track issues. The data augmentation will have a direct effect in making us detect better anomalies and hence improve time for railroads that is spent in manual inspection. This paper talks about a real world use case of detecting railway track defects from a camera mounted on a moving locomotive and tracking their locations. The camera is engineered to withstand the environment factors on a moving train and provide a consistent steady image at around 30 frames per second. An image simulation pipeline of track detection, region of interest selection, augmenting image for anomalies is implemented. Training images are simulated for sun kink and vegetation overgrowth. Inception V3 model pretrained on Imagenet dataset is finetuned for a 2 class classification. For the case of vegetation overgrowth, the model generalizes well on actual vegetation images, though it was trained and validated solely on simulated images which might have different distribution than the actual vegetation. Sun kink classifier can classify professionally simulated sun kink videos with a precision of 97.5%.

Submitted 5 February, 2018; originally announced February 2018.

arXiv:1802.01273 [pdf]

Face recognition for monitoring operator shift in railways

Authors: S Ritika, Dattaraj Rao

Abstract: Train Pilot is a very tedious and stressful job. Pilots must be vigilant at all times and it's easy for them to lose track of time of shift. In countries like USA the pilots are mandated by law to adhere to 8 hour shifts. If they exceed 8 hours of shift the railroads may be penalized for over-tiring their drivers. The problem happens when the 8 hour shift may end in middle of a journey. In such case, the new drivers must be moved to the location locomotive is operating for shift change. Hence accurate monitoring of drivers during their shift and making sure the shifts are scheduled correctly is very important for railroads. Here we propose an automated camera system that uses camera mounted inside Locomotive cabs to continuously record video feeds. These feeds are analyzed in real time to detect the face of driver and recognize the driver using state of the art deep Learning techniques. The outcome is an increased safety of train pilots. Cameras continuously capture video from inside the cab which is stored on an on board data acquisition device. Using advanced computer vision and deep learning techniques the videos are analyzed at regular intervals to detect presence of the pilot and identify the pilot. Using a time based analysis, it is identified for how long that shift has been active. If this time exceeds allocated shift time an alert is sent to the dispatch to adjust shift hours.

Submitted 21 May, 2018; v1 submitted 5 February, 2018; originally announced February 2018.

arXiv:1712.08036 [pdf]

Siamese Neural Networks for One-shot detection of Railway Track Switches

Authors: Dattaraj J Rao, Shruti Mittal, S. Ritika

Abstract: Deep Learning methods have been extensively used to analyze video data to extract valuable information by classifying image frames and detecting objects. We describe a unique approach for using video feed from a moving Locomotive to continuously monitor the Railway Track and detect significant assets like Switches on the Track. The technique used here is called Siamese Networks, which uses 2 identical networks to learn the similarity between 2 images. Here we will use a Siamese network to continuously compare Track images and detect any significant difference in the Track. Switch will be one of those images that will be different and we will find a mapping that clearly distinguishes the Switch from other possible Track anomalies. The same method will then be extended to detect any abnormalities on the Railway Track. Railway Transportation is unique in the sense that it has wheeled vehicles, Trains pulled by Locomotives, running on guided Rails at very high speeds nearing 200 mph. Multiple Tracks on the Rail network are connected to each other using an equipment called Switch or a Turnout. Switch is either operated manually or automatically through command from a Control center and it governs the movement of Trains on different Tracks of the network. Accurate location of these Switches is very important for the railroad and getting a true picture of their state in field is important. Modern trains use high definition video cameras facing the Track that continuously record video from track. Using a Siamese network and comparing to benchmark images, we describe a method to monitor the Track and highlight anomalies.

Submitted 21 December, 2017; originally announced December 2017.

Comments: 6 pages and 7 figures

Showing 1–50 of 94 results for author: Rao, D