Search | arXiv e-print repository

arXiv:2408.16605 [pdf, ps, other]

Subspace Representation Learning for Sparse Linear Arrays to Localize More Sources than Sensors: A Deep Learning Methodology

Authors: Kuan-Lin Chen, Bhaskar D. Rao

Abstract: Localizing more sources than sensors with a sparse linear array (SLA) has long relied on minimizing a distance between two covariance matrices and recent algorithms often utilize semidefinite programming (SDP). Although deep neural network (DNN)-based methods offer new alternatives, they still depend on covariance matrix fitting. In this paper, we develop a novel methodology that estimates the co-array subspaces from a sample covariance for SLAs. Our methodology trains a DNN to learn signal and noise subspace representations that are invariant to the selection of bases. To learn such representations, we propose loss functions that gauge the separation between the desired and the estimated subspace. In particular, we propose losses that measure the length of the shortest path between subspaces viewed on a union of Grassmannians, and prove that it is possible for a DNN to approximate signal subspaces. The computation of learning subspaces of different dimensions is accelerated by a new batch sampling strategy called consistent rank sampling. The methodology is robust to array imperfections due to its geometry-agnostic and data-driven nature. In addition, we propose a fully end-to-end gridless approach that directly learns angles to study the possibility of bypassing subspace methods. Numerical results show that learning such subspace representations is more beneficial than learning covariances or angles. It outperforms conventional SDP-based methods such as the sparse and parametric approach (SPA) and existing DNN-based covariance reconstruction methods for a wide range of signal-to-noise ratios (SNRs), snapshots, and source numbers for both perfect and imperfect arrays.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 13 pages. Submitted to the IEEE Transactions on Signal Processing

arXiv:2408.15780 [pdf, other]

Evaporation of water-in-oil microemulsion droplet

Authors: Bal Krishan, Preetika Rastogi, D. Chaitanya Kumar Rao, Niket S. Kaisare, Madivala G. Basavaraj, Saptarshi Basu

Abstract: Emulsion fuels have the potential to reduce both particulate matter and NOx emissions and can potentially improve the efficiency of combustion engines. However, their limited stability remains a critical barrier to practical use as an alternative fuel. In this study, we explore the evaporation behavior of thermodynamically stable water-in-oil microemulsions. The water-in-oil microemulsion droplets prepared from different types of oil were acoustically levitated and heated using a continuous laser at different irradiation intensities. We show that the evaporation characteristics of these microemulsions can be controlled by varying water-to-surfactant molar ratio (ω) and volume fraction of the dispersed phase (φ). The emulsion droplets undergo three distinct stages of evaporation, namely pre-heating, steady evaporation, and unsteady evaporation. During the steady evaporation phase, increasing φ reduces the evaporation rate for a fixed ω. It is observed that the evaporation of microemulsion is governed by the complex interplay between its constituents and their properties. We propose a parameter (η) denoting the volume fraction ratio between volatile and non-volatile components, which indicates the cumulative influence of various factors affecting the evaporation process. The evaporation of microemulsions eventually leads to the formation of solid spherical shells, which may undergo buckling. The distinction in the morphology of these shells is explored in detail using SEM imaging.

Submitted 28 August, 2024; originally announced August 2024.

Comments: 42 pages, 11 figures

arXiv:2408.08901 [pdf]

Bayesian inference to improve quality of Retrieval Augmented Generation

Authors: Dattaraj Rao

Abstract: Retrieval Augmented Generation or RAG is the most popular pattern for modern Large Language Model or LLM applications. RAG involves taking a user query and finding relevant paragraphs of context in a large corpus typically captured in a vector database. Once the first level of search happens over a vector database, the top n chunks of relevant text are included directly in the context and sent as prompt to the LLM. Problem with this approach is that quality of text chunks depends on effectiveness of search. There is no strong post processing after search to determine if the chunk does hold enough information to include in prompt. Also many times there may be chunks that have conflicting information on the same subject and the model has no prior experience which chunk to prioritize to make a decision. Often times, this leads to the model providing a statement that there are conflicting statements, and it cannot produce an answer. In this research we propose a Bayesian approach to verify the quality of text chunks from the search results. Bayes theorem tries to relate conditional probabilities of the hypothesis with evidence and prior probabilities. We propose that, finding likelihood of text chunks to give a quality answer and using prior probability of quality of text chunks can help us improve overall quality of the responses from RAG systems. We can use the LLM itself to get a likelihood of relevance of a context paragraph. For prior probability of the text chunk, we use the page number in the documents parsed. Assumption is that paragraphs in earlier pages have a better probability of being findings and more relevant to generalizing an answer.

Submitted 12 August, 2024; originally announced August 2024.

arXiv:2407.20879 [pdf, other]

A Scalable Tool For Analyzing Genomic Variants Of Humans Using Knowledge Graphs and Machine Learning

Authors: Shivika Prasanna, Ajay Kumar, Deepthi Rao, Eduardo Simoes, Praveen Rao

Abstract: The integration of knowledge graphs and graph machine learning (GML) in genomic data analysis offers several opportunities for understanding complex genetic relationships, especially at the RNA level. We present a comprehensive approach for leveraging these technologies to analyze genomic variants, specifically in the context of RNA sequencing (RNA-seq) data from COVID-19 patient samples. The proposed method involves extracting variant-level genetic information, annotating the data with additional metadata using SnpEff, and converting the enriched Variant Call Format (VCF) files into Resource Description Framework (RDF) triples. The resulting knowledge graph is further enhanced with patient metadata and stored in a graph database, facilitating efficient querying and indexing. We utilize the Deep Graph Library (DGL) to perform graph machine learning tasks, including node classification with GraphSAGE and Graph Convolutional Networks (GCNs). Our approach demonstrates significant utility using our proposed tool, VariantKG, in three key scenarios: enriching graphs with new VCF data, creating subgraphs based on user-defined features, and conducting graph machine learning for node classification.

Submitted 30 July, 2024; originally announced July 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2312.04423

arXiv:2406.03226 [pdf]

Electron Confinement-Induced Plasmonic Breakdown in Metals

Authors: Prasanna Das, Sourav Rudra, Dheemahi Rao, Souvik Banerjee, Ashalatha Indiradevi Kamalasanan Pillai, Magnus Garbrecht, Alexandra Boltasseva, Igor V. Bondarev, Vladimir M. Shalaev, Bivas Saha

Abstract: Plasmon resonance in metals represents the collective oscillation of the free electron gas density and enables enhanced light-matter interactions in nanoscale dimensions. Traditionally, the classical Drude model describes the plasmonic excitation, wherein the plasma frequency exhibits no spatial dispersion. Here, we show conclusive experimental evidence of the breakdown of the plasmon resonance and a consequent photonic metal-insulator transition in an ultrathin archetypal refractory plasmonic material, hafnium nitride (HfN). Epitaxial HfN thick films exhibit a low-loss and high-quality Drude-like plasmon resonance in the visible spectral range. However, as the film thickness is reduced to nanoscale dimensions, the Coulomb interaction among electrons increases due to the electron confinement, leading to the spatial dispersion of the plasma frequency. Importantly, with the further decrease in thickness, electrons lose their ability to shield the incident electric field, turning the medium into a dielectric. The breakdown of the plasmon resonance in epitaxial ultrathin metals could be useful for fundamental physics studies in transdimensional regimes and novel photonic device applications.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2404.07604 [pdf, other]

Novel Active Sensing and Inference for mmWave Beam Alignment Using Single RF Chain Systems

Authors: Rohan R. Pote, Bhaskar D. Rao

Abstract: We propose a novel sensing approach for the beam alignment problem in millimeter wave systems using a single Radio Frequency (RF) chain. Conventionally, beam alignment using a single phased array involves comparing beamformer output power across different spatial regions. This incurs large training overhead due to the need to perform the beam scan operation. The proposed Synthesis of Virtual Array Manifold (SVAM) sensing methodology is inspired from synthetic aperture radar systems and realizes a virtual array geometry over temporal measurements. We demonstrate the benefits of SVAM using Cramér-Rao bound (CRB) analysis over schemes that repeat beam pattern to boost signal-to-noise (SNR) ratio. We also showcase versatile applicability of the proposed SVAM sensing by incorporating it within existing beam alignment procedures that assume perfect knowledge of the small-scale fading coefficient. We further consider the practical scenario wherein we estimate the fading coefficient and propose a novel beam alignment procedure based on efficient computation of an approximate posterior density on dominant path angle. We provide numerical experiments to study the impact of parameters involved in the procedure. The performance of the proposed sensing and beam alignment algorithm is empirically observed to approach the fading coefficient-perfectly known performance, even at low SNR.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2402.11450 [pdf, other]

Learning to Learn Faster from Human Feedback with Language Model Predictive Control

Authors: Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore, et al. (25 additional authors not shown)

Abstract: Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for only as long as it fits within the context size of the LLM, and can be forgotten over longer interactions. In this work, we investigate fine-tuning the robot code-writing LLMs, to remember their in-context interactions and improve their teachability i.e., how efficiently they adapt to human inputs (measured by average number of corrections before the user considers the task successful). Our key observation is that when human-robot interactions are viewed as a partially observable Markov decision process (in which human language inputs are observations, and robot code outputs are actions), then training an LLM to complete previous interactions is training a transition dynamics model -- that can be combined with classic robotics techniques such as model predictive control (MPC) to discover shorter paths to success. This gives rise to Language Model Predictive Control (LMPC), a framework that fine-tunes PaLM 2 to improve its teachability on 78 tasks across 5 robot embodiments -- improving non-expert teaching success rates of unseen tasks by 26.9% while reducing the average number of human corrections from 2.4 to 1.9. Experiments show that LMPC also produces strong meta-learners, improving the success rate of in-context learning new tasks on unseen robot embodiments and APIs by 31.5%. See videos, code, and demos at: https://robot-teaching.github.io/.

Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

arXiv:2312.04423 [pdf, other]

Scalable Knowledge Graph Construction and Inference on Human Genome Variants

Authors: Shivika Prasanna, Deepthi Rao, Eduardo Simoes, Praveen Rao

Abstract: Real-world knowledge can be represented as a graph consisting of entities and relationships between the entities. The need for efficient and scalable solutions arises when dealing with vast genomic data, like RNA-sequencing. Knowledge graphs offer a powerful approach for various tasks in such large-scale genomic data, such as analysis and inference. In this work, variant-level information extracted from the RNA-sequences of vaccine-naïve COVID-19 patients have been represented as a unified, large knowledge graph. Variant call format (VCF) files containing the variant-level information were annotated to include further information for each variant. The data records in the annotated files were then converted to Resource Description Framework (RDF) triples. Each VCF file obtained had an associated CADD scores file that contained the raw and Phred-scaled scores for each variant. An ontology was defined for the VCF and CADD scores files. Using this ontology and the extracted information, a large, scalable knowledge graph was created. Available graph storage was then leveraged to query and create datasets for further downstream tasks. We also present a case study using the knowledge graph and perform a classification task using graph machine learning. We also draw comparisons between different Graph Neural Networks (GNNs) for the case study.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2311.06261 [pdf, other]

With ChatGPT, do we have to rewrite our learning objectives -- CASE study in Cybersecurity

Authors: Peter Jamieson, Suman Bhunia, Dhananjai M. Rao

Abstract: With the emergence of Artificial Intelligent chatbot tools such as ChatGPT and code writing AI tools such as GitHub Copilot, educators need to question what and how we should teach our courses and curricula in the future. In reality, automated tools may result in certain academic fields being deeply reduced in the number of employable people. In this work, we make a case study of cybersecurity undergrad education by using the lens of ``Understanding by Design'' (UbD). First, we provide a broad understanding of learning objectives (LOs) in cybersecurity from a computer science perspective. Next, we dig a little deeper into a curriculum with an undergraduate emphasis on cybersecurity and examine the major courses and their LOs for our cybersecurity program at Miami University. With these details, we perform a thought experiment on how attainable the LOs are with the above-described tools, asking the key question ``what needs to be enduring concepts?'' learned in this process. If an LO becomes something that the existence of automation tools might be able to do, we then ask ``what level is attainable for the LO that is not a simple query to the tools?''. With this exercise, we hope to establish an example of how to prompt ChatGPT to accelerate students in their achievements of LOs given the existence of these new AI tools, and our goal is to push all of us to leverage and teach these tools as powerful allies in our quest to improve human existence and knowledge.

Submitted 26 September, 2023; originally announced November 2023.

arXiv:2310.15546 [pdf, other]

doi 10.1103/PhysRevLett.133.050602

Robust and Deterministic Preparation of Bosonic Logical States in a Trapped Ion

Authors: V. G. Matsos, C. H. Valahu, T. Navickas, A. D. Rao, M. J. Millican, X. C. Kolesnikow, M. J. Biercuk, T. R. Tan

Abstract: Encoding logical qubits in bosonic modes provides a potentially hardware-efficient implementation of fault-tolerant quantum information processing. Here, we demonstrate high-fidelity and deterministic preparation of highly non-classical bosonic states in the mechanical motion of a trapped ion. Our approach implements error-suppressing pulses through optimized dynamical modulation of laser-driven spin-motion interactions to generate the target state in a single step. We demonstrate logical fidelities for the Gottesman-Kitaev-Preskill (GKP) state as high as $\bar{\mathcal{F}}=0.940(8)$, a distance-3 binomial state with an average fidelity of $\mathcal{F}=0.807(7)$, and a 12.91(5) dB squeezed vacuum state.

Submitted 14 August, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

Comments: 14 pages, 9 figures

Journal ref: Phys. Rev. Lett. 133, 050602 (2024)

arXiv:2309.15605 [pdf]

Insights into bubble droplet interactions in evaporating polymeric droplets

Authors: Gannena K S Raghuram, Durbar Roy, D Chaitanya Kumar Rao, Aloke Kumar, Saptarshi Basu

Abstract: Polymer droplets subjected to a heated environment have significance in several fields ranging from spray drying and powder formation to surface coating. In the present work, we investigate the evaporation of a high viscoelastic modulus aqueous polymeric droplet in an acoustically levitated environment. Depending on the laser irradiation intensity, we observe nucleation of a bubble in the dilute regime of polymer concentration, contrary to the previously observed bubble nucleation in a semi-dilute entangled regime for low viscoelastic modulus polymer droplets. After the bubble nucleation, a quasi steady bubble growth occurs depending on the laser irradiation intensity and concentrations. Our scaling analysis reveals that bubble growth follows Plesset-Zwick criteria independent of the viscoelastic properties of the polymer solution. Further, we establish that the onset of bubble growth has an inverse nonlinear dependence on the laser irradiation intensity. At high concentrations and laser irradiation intensities, we report the expansion and collapse of polymer membrane without rupture, indicating the formation of an interfacial skin with significant strength. The droplet oscillations are primarily driven by the presence of multiple bubbles and, to some extent, by the rotational motion of the droplet. Finally, depending on the nature of bubble growth, different types of precipitate form contrary to the different modes of atomization observed in low viscoelastic modulus polymer droplets.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.11512 [pdf, other]

Multidimensional well-being of US households at a fine spatial scale using fused household surveys: fusionACS

Authors: Kevin Ummel, Miguel Poblete-Cazenave, Karthik Akkiraju, Nick Graetz, Hero Ashman, Cora Kingdon, Steven Herrera Tenorio, Aaryaman "Sunny" Singhal, Daniel Aldana Cohen, Narasimha D. Rao

Abstract: Social science often relies on surveys of households and individuals. Dozens of such surveys are regularly administered by the U.S. government. However, they field independent, unconnected samples with specialized questions, limiting research questions to those that can be answered by a single survey. The fusionACS project seeks to integrate data from multiple U.S. household surveys by statistically "fusing" variables from "donor" surveys onto American Community Survey (ACS) microdata. This results in an integrated microdataset of household attributes and well-being dimensions that can be analyzed to address research questions in ways that are not currently possible. The presented data comprise the fusion onto the ACS of select donor variables from the Residential Energy Consumption Survey (RECS) of 2015, the National Household Transportation Survey (NHTS) of 2017, the American Housing Survey (AHS) of 2019, and the Consumer Expenditure Survey - Interview (CEI) for the years 2015-2019. The underlying statistical techniques are included in an open-source $R$ package, fusionModel, that provides generic tools for the creation, analysis, and validation of fused microdata.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 35 pages, 6 figures

arXiv:2306.11706 [pdf, other]

RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, et al. (14 additional authors not shown)

Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned decision transformer capable of consuming action-labelled visual experience. This data spans a large repertoire of motor control skills from simulated and real robotic arms with varying sets of observations and actions. With RoboCat, we demonstrate the ability to generalise to new tasks and robots, both zero-shot as well as through adaptation using only 100-1000 examples for the target task. We also show how a trained model itself can be used to generate data for subsequent training iterations, thus providing a basic building block for an autonomous improvement loop. We investigate the agent's capabilities, with large-scale evaluations both in simulation and on three different real robot embodiments. We find that as we grow and diversify its training data, RoboCat not only shows signs of cross-task transfer, but also becomes more efficient at adapting to new tasks.

Submitted 22 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: Transactions on Machine Learning Research (12/2023)

arXiv:2305.15543 [pdf, other]

Regularized Neural Detection for One-Bit Massive MIMO Communication Systems

Authors: Aditya Sant, Bhaskar D. Rao

Abstract: Detection for one-bit massive MIMO systems presents several challenges especially for higher order constellations. Recent advances in both model-based analysis and deep learning frameworks have resulted in several robust one-bit detector designs. Our work builds on the current state-of-the-art gradient descent (GD)-based detector. We introduce two novel contributions in our detector design: (i) We augment each GD iteration with a deep learning-aided regularization step, and (ii) We introduce a novel constellation-based loss function for our regularized DNN detector. This one-bit detection strategy is applied to two different DNN architectures based on algorithm unrolling, namely, a deep unfolded neural network and a deep recurrent neural network. Being trained on multiple randomly sampled channel matrices, these networks are developed as general one-bit detectors. The numerical results show that the combination of the DNN-augmented regularized GD and constellation-based loss function improve the quality of our one-bit detector, especially for higher order M-QAM constellations.

Submitted 26 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: Initially submitted to IEEE TMLCN in October 2022

arXiv:2305.12696 [pdf, other]

Learning Interpretable Style Embeddings via Prompting LLMs

Authors: Ajay Patel, Delip Rao, Ansh Kothary, Kathleen McKeown, Chris Callison-Burch

Abstract: Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors, however, these approaches result in uninterpretable representations, complicating their usage in downstream applications like authorship attribution where auditing and explainability is critical. In this work, we use prompting to perform stylometry on a large number of texts to create a synthetic dataset and train human-interpretable style representations we call LISA embeddings. We release our synthetic stylometry dataset and our interpretable style models as resources.

Submitted 9 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2304.13164 [pdf, other]

Towards Compute-Optimal Transfer Learning

Authors: Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu

Abstract: The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet effective way to trade computational efficiency for asymptotic performance which we define as the performance a learning algorithm achieves as compute tends to infinity. Specifically, we argue that zero-shot structured pruning of pretrained models allows them to increase compute efficiency with minimal reduction in performance. We evaluate our method on the Nevis'22 continual learning benchmark that offers a diverse set of transfer scenarios. Our results show that pruning convolutional filters of pretrained models can lead to more than 20% performance improvement in low computational regimes.

Submitted 25 April, 2023; originally announced April 2023.

arXiv:2303.11770

Adaptive Super-Twisting Controller Design for Accurate Trajectory Tracking Performance of Unmanned Aerial Vehicles

Authors: D. M. K. K. Venkateswara Rao, Hamed Habibi, Jose Luis Sanchez-Lopez, Prathyush P. Menon, Christopher Edwards, Holger Voos

Abstract: In this paper, an adaptive super-twisting controller is designed for an agile maneuvering quadrotor unmanned aerial vehicle to achieve accurate trajectory tracking in the presence of external disturbances. A cascaded control architecture is designed to determine the desired accelerations using the proposed controller and subsequently used to compute the desired orientation and angular rates. The finite-time convergence of sliding functions and closed-loop system stability are analytically proven. Furthermore, the restrictive assumption on the maximum variation of the disturbance is relaxed by designing a gain adaptation law and low-pass filtering of the estimated equivalent control. The proper selection of design parameters is discussed in detail. Finally, the effectiveness of the proposed method is evaluated by high-fidelity software-in-the-loop simulations and validated by experimental studies.

Submitted 14 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: We are working on a new version of this paper and revising some technical parts. We will replace the new version as soon as it is carefully revised

arXiv:2303.02053 [pdf, ps, other]

On SORA for High-Risk UAV Operations under New EU Regulations: Perspectives for Automated Approach

Authors: Hamed Habibi, D. M. K. K. Venkateswara Rao, Jose Luis Sanchez-Lopez, Holger Voos

Abstract: In this paper, we investigate requirements to prepare an application for Specific Operations Risk Assessment (SORA), regulated by European Union Aviation Safety Agency (EASA) to obtain flight authorization for Unmanned Aerial Vehicles (UAVs) operations and propose some perspectives to automate the approach based on our successful application. Preparation of SORA requires expert knowledge as it contains technicalities. Also, the whole process is an iterative and time-consuming one. It is even more challenging for higher-risk operations, such as those in urban environments, near airports, and multi- and customized models for research activities. SORA process limits the potential socio-economic impacts of innovative UAV capabilities. Therefore, in this paper, we present a SORA example, review the steps and highlight challenges. Accordingly, we propose an alternative workflow, considering the same steps, while addressing the challenges and pitfalls, to shorten the whole process. Furthermore, we present a comprehensive list of preliminary technical procedures, including the pre/during/post-flight checklists, design and installation appraisal, flight logbook, operational manual, training manual, and General Data Protection Regulation (GDPR), which are not explicitly instructed in SORA manual. Moreover, we propose the initial idea to create an automated SORA workflow to facilitate obtaining authorization, which is significantly helpful for operators, especially the scientific community, to conduct experimental operations.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2303.02043 [pdf, other]

An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Authors: D. M. K. K. Venkateswara Rao, Hamed Habibi, Jose Luis Sanchez-Lopez, Holger Voos

Abstract: This paper presents an integrated approach that combines trajectory optimization and Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcribed into a nonlinear programming problem using Chebyshev pseudospectral method. The state and control histories are approximated by using Lagrange polynomials and the collocation points are used to satisfy constraints. A novel sigmoid-type collision avoidance constraint is proposed to overcome the drawbacks of Lagrange polynomial approximation in pseudospectral methods that guarantees inequality constraint satisfaction only at nodal points. Automatic differentiation of cost function and constraints is used to quickly determine their gradient and Jacobian, respectively. An APF method is used to update the optimal control inputs for guaranteeing collision avoidance. The trajectory optimization and APF method are implemented in a closed-loop fashion continuously, but in parallel at moderate and high frequencies, respectively. The initial guess for the optimization is provided based on the previous solution. The proposed approach is tested and validated through indoor experiments.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2302.12617 [pdf, other]

Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains

Authors: Jingwei Zhang, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Abbas Abdolmaleki, Dushyant Rao, Nicolas Heess, Martin Riedmiller

Abstract: In this paper we study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience and their utility for fast inference of (high-level) plans in downstream tasks. In particular we propose to learn a jumpy model alongside a skill embedding space offline, from previously collected experience for which no labels or reward annotations are required. We then investigate several options of harnessing those learned components in combination with model-based planning or model-free reinforcement learning (RL) to speed up learning on downstream tasks. We conduct a set of experiments in the RGB-stacking environment, showing that planning with the learned skills and the associated model can enable zero-shot generalization to new tasks, and can further speed up training of policies via reinforcement learning. These experiments demonstrate that jumpy models which incorporate temporal abstraction can facilitate planning in long-horizon tasks in which standard dynamics models fail.

Submitted 24 February, 2023; originally announced February 2023.

arXiv:2302.10147 [pdf, ps, other]

A DNN based Normalized Time-frequency Weighted Criterion for Robust Wideband DoA Estimation

Authors: Kuan-Lin Chen, Ching-Hua Lee, Bhaskar D. Rao, Harinath Garudadri

Abstract: Deep neural networks (DNNs) have greatly benefited direction of arrival (DoA) estimation methods for speech source localization in noisy environments. However, their localization accuracy is still far from satisfactory due to the vulnerability to nonspeech interference. To improve the robustness against interference, we propose a DNN based normalized time-frequency (T-F) weighted criterion which minimizes the distance between the candidate steering vectors and the filtered snapshots in the T-F domain. Our method requires no eigendecomposition and uses a simple normalization to prevent the optimization objective from being misled by noisy filtered snapshots. We also study different designs of T-F weights guided by a DNN. We find that duplicating the Hadamard product of speech ratio masks is highly effective and better than other techniques such as direct masking and taking the mean in the proposed approach. However, the best-performing design of T-F weights is criterion-dependent in general. Experiments show that the proposed method outperforms popular DNN based DoA estimation methods including widely used subspace methods in noisy and reverberant environments.

Submitted 20 February, 2023; originally announced February 2023.

Comments: 5 pages. Accepted at ICASSP 2023

arXiv:2301.13379 [pdf, other]

Faithful Chain-of-Thought Reasoning

Authors: Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

Abstract: While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (aka. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving (reasoning chain $\rightarrow$ answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.

Submitted 20 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: IJCNLP-AACL 2023 camera-ready version

arXiv:2211.13743 [pdf, other]

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

Authors: Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin Riedmiller

Abstract: The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations. For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert behavior can lead to poor results when given sub-optimal experts. We compare several common approaches for skill transfer on multiple domains including changes in task and system dynamics. We identify how existing methods can fail and introduce an alternative approach to mitigate these problems. Our approach learns to sequence existing temporally-extended skills for exploration but learns the final policy directly from the raw experience. This conceptual split enables rapid adaptation and thus efficient data collection but without constraining the final solution. It significantly outperforms many classical methods across a suite of evaluation tasks and we use a broad set of ablations to highlight the importance of different components of our method.

Submitted 11 January, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

arXiv:2211.13356 [pdf, ps, other]

Vector Quantization Methods for Access Point Placement in Cell-Free Massive MIMO Systems

Authors: Govind R. Gopal, Bhaskar D. Rao

Abstract: We examine the problem of uplink cell-free access point (AP) placement in the context of optimal throughput. In this regard, we formulate two main placement problems, namely the sum rate and minimum rate maximization problems, and discuss the challenges associated with solving the underlying optimization problem with the help of some simple scenarios. As a practical solution to the AP placement problem, we suggest a vector quantization (VQ) approach. The suitability of the VQ approach to cell-free AP placement is investigated by examining three VQ-based solutions. First, the standard VQ approach, that is the Lloyd algorithm (using the squared error distortion function) is described. Second, the tree-structured VQ (TSVQ), which performs successive partitioning of the distribution space is applied. Third, a probability density function optimized VQ (PDFVQ) procedure is outlined, which enables efficient, low complexity, and scalable placement, and is aimed at a massive distributed multiple-input-multiple-output (MIMO) scenario. While the VQ-based solutions do not solve the cell-free AP placement problems explicitly, numerical experiments show that their sum and minimum rate performances are good enough, and offer a good starting point for gradient-based optimization methods. Among the VQ solutions, PDFVQ, with advantages over the other VQ methods, offers a good trade-off between sum and minimum rates.

Submitted 23 November, 2022; originally announced November 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
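
Note on the entry above: the abstract names the Lloyd algorithm with a squared error distortion function as the standard VQ baseline. The following is a minimal sketch of that generic algorithm only, not the authors' placement method. It assumes 2-D points standing in for user locations and cluster centers standing in for candidate AP positions; the function name `lloyd`, its parameters, and the sample data are hypothetical and chosen for illustration.

```python
def lloyd(points, centers, iters=100):
    """Generic Lloyd iteration with squared-error distortion on 2-D points.

    points  : list of (x, y) tuples (hypothetical stand-in for user locations)
    centers : list of (x, y) initial centers (hypothetical stand-in for APs)
    """
    centers = list(centers)
    for _ in range(iters):
        # Assignment step: each point joins the cell of its nearest center.
        cells = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: (p[0] - centers[i][0]) ** 2
                + (p[1] - centers[i][1]) ** 2,
            )
            cells[nearest].append(p)
        # Update step: move each center to the centroid of its cell.
        # An empty cell keeps its previous center.
        new_centers = []
        for old, cell in zip(centers, cells):
            if cell:
                cx = sum(x for x, _ in cell) / len(cell)
                cy = sum(y for _, y in cell) / len(cell)
                new_centers.append((cx, cy))
            else:
                new_centers.append(old)
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers


# Illustrative data: two well-separated groups of four points each.
users = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
aps = lloyd(users, [(0.0, 0.0), (10.0, 10.0)])
```

Lloyd iterations converge only to a local optimum that depends on the initial centers, which is why the sketch takes them as an explicit argument rather than drawing them at random.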

arXiv:2211.07320 [pdf, other]

doi 10.1038/s41557-023-01300-3

Direct observation of geometric phase in dynamics around a conical intersection

Authors: Christophe H. Valahu, Vanessa C. Olaya-Agudelo, Ryan J. MacDonell, Tomas Navickas, Arjun D. Rao, Maverick J. Millican, Juan B. Pérez-Sánchez, Joel Yuen-Zhou, Michael J. Biercuk, Cornelius Hempel, Ting Rei Tan, Ivan Kassal

Abstract: Conical intersections are ubiquitous in chemistry and physics, often governing processes such as light harvesting, vision, photocatalysis, and chemical reactivity. They act as funnels between electronic states of molecules, allowing rapid and efficient relaxation during chemical dynamics. In addition, when a reaction path encircles a conical intersection, the molecular wavefunction experiences a geometric phase, which can affect the outcome of the reaction through quantum-mechanical interference. Past experiments have measured indirect signatures of geometric phases in scattering patterns and spectroscopic observables, but there has been no direct observation of the underlying wavepacket interference. Here, we experimentally observe geometric-phase interference in the dynamics of a wavepacket travelling around an engineered conical intersection in a programmable trapped-ion quantum simulator. To achieve this, we develop a technique to reconstruct the two-dimensional wavepacket densities of a trapped ion. Experiments agree with the theoretical model, demonstrating the ability of analog quantum simulators -- such as those realised using trapped ions -- to accurately describe nuclear quantum effects.

Submitted 11 August, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

Comments: 10 pages, 5 figures

arXiv:2211.05351 [pdf]

Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models

Authors: Dattaraj J. Rao, Shraddha S. Mane, Mukta A. Paliwal

Abstract: Biomedical knowledge graphs (KG) are heterogeneous networks consisting of biological entities as nodes and relations between them as edges. These entities and relations are extracted from millions of research papers and unified in a single resource. The goal of biomedical multi-hop question-answering over knowledge graph (KGQA) is to help biologists and scientists get valuable insights by asking questions in natural language. Relevant answers can be found by first understanding the question and then querying the KG for the right set of nodes and relationships to arrive at an answer. To model the question, language models such as RoBERTa and BioBERT are used to understand context from the natural language question. One of the challenges in KGQA is missing links in the KG. Knowledge graph embeddings (KGE) help to overcome this problem by encoding nodes and edges in a dense and more efficient way. In this paper, we use a publicly available KG called Hetionet which is an integrative network of biomedical knowledge assembled from 29 different databases of genes, compounds, diseases, and more. We have enriched this KG dataset by creating a multi-hop biomedical question-answering dataset in natural language for testing the biomedical multi-hop question-answering system and this dataset will be made available to the research community. The major contribution of this research is an integrated system that combines language models with KG embeddings to give highly relevant answers to free-form questions asked by biologists in an intuitive interface.
The biomedical multi-hop question-answering system is tested on this data and results are highly encouraging.

Submitted 10 November, 2022; originally announced November 2022.

ACM Class: I.2.4; I.2.7

arXiv:2211.05095 [pdf]

doi 10.1016/j.optlastec.2022.108928

Tamm plasmon polariton in planar structures: A brief overview and applications

Authors: Chinmaya Kar, Shuvendu Jena, Dinesh V. Udupa, K. Divakar Rao

Abstract: Tamm plasmon provides a new avenue in plasmonics of interface states in planar multilayer structures due to its strong light matter interaction. This article reviews the research and development in Tamm plasmon polariton excited at the interface of a metal and a distributed Bragg reflector. Tamm plasmon offers an easy planar solution compared to patterned surface plasmon devices with huge field enhancement at the interface and does not require any phase matching method for its excitation. The ease of depositing multilayer thin film stacks, direct optical excitation, and high-Q modes make Tamm plasmons an attractive field of research with potential practical applications. The basic properties of the Tamm plasmon modes including its dispersion, effect of different plasmon active metals, coupling with other resonant modes and their polarization splitting, and tunability of Tamm plasmon coupled hybrid modes under externally applied stimuli have been discussed. The application of Tamm plasmon modes in lasers, hot electron photodetectors, perfect absorbers, thermal emitters, light emitting devices, and sensors has also been discussed in detail. This review covers all the major advancements in this field over the last fifteen years with special emphasis on the application part.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2210.12448 [pdf, other]

Probing Transfer in Deep Reinforcement Learning without Task Engineering

Authors: Andrei A. Rusu, Sebastian Flennerhag, Dushyant Rao, Razvan Pascanu, Raia Hadsell

Abstract: We evaluate the use of original game curricula supported by the Atari 2600 console as a heterogeneous transfer benchmark for deep reinforcement learning agents. Game designers created curricula using combinations of several discrete modifications to the basic versions of games such as Space Invaders, Breakout and Freeway, making them progressively more challenging for human players. By formally organising these modifications into several factors of variation, we are able to show that Analyses of Variance (ANOVA) are a potent tool for studying the effects of human-relevant domain changes on the learning and transfer performance of a deep reinforcement learning agent. Since no manual task engineering is needed on our part, leveraging the original multi-factorial design avoids the pitfalls of unintentionally biasing the experimental setup. We find that game design factors have a large and statistically significant impact on an agent's ability to learn, and so do their combinatorial interactions. Furthermore, we show that zero-shot transfer from the basic games to their respective variations is possible, but the variance in performance is also largely explained by interactions between factors. As such, we argue that Atari game curricula offer a challenging benchmark for transfer learning in RL, that can help the community better understand the generalisation capabilities of RL agents along dimensions which meaningfully impact human generalisation performance.
As a start, we report that value-function finetuning of regularly trained agents achieves positive transfer in a majority of cases, but significant headroom for algorithmic innovation remains. We conclude with the observation that selective transfer from multiple variants could further improve performance.

Submitted 22 October, 2022; originally announced October 2022.

arXiv:2210.07236 [pdf, ps, other]

Improved Bounds on Neural Complexity for Representing Piecewise Linear Functions

Authors: Kuan-Lin Chen, Harinath Garudadri, Bhaskar D. Rao

Abstract: A deep neural network using rectified linear units represents a continuous piecewise linear (CPWL) function and vice versa. Recent results in the literature estimated that the number of neurons needed to exactly represent any CPWL function grows exponentially with the number of pieces or exponentially in terms of the factorial of the number of distinct linear components. Moreover, such growth is amplified linearly with the input dimension. These existing results seem to indicate that the cost of representing a CPWL function is expensive. In this paper, we propose much tighter bounds and establish a polynomial time algorithm to find a network satisfying these bounds for any given CPWL function. We prove that the number of hidden neurons required to exactly represent any CPWL function is at most a quadratic function of the number of pieces. In contrast to all previous results, this upper bound is invariant to the input dimension. Besides the number of pieces, we also study the number of distinct linear components in CPWL functions. When such a number is also given, we prove that the quadratic complexity turns into bilinear, which implies a lower neural complexity because the number of distinct linear components is always not greater than the minimum number of pieces in a CPWL function.
When the number of pieces is unknown, we prove that, in terms of the number of distinct linear components, the neural complexities of any CPWL function are at most polynomial growth for low-dimensional inputs and factorial growth for the worst-case scenario, which are significantly better than existing results in the literature.

Submitted 15 January, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: 31 pages. Accepted at NeurIPS 2022

arXiv:2210.03266 [pdf, ps, other]

doi 10.1109/TSP.2023.3254919

Maximum Likelihood-based Gridless DoA Estimation Using Structured Covariance Matrix Recovery and SBL with Grid Refinement

Authors: Rohan R. Pote, Bhaskar D. Rao

Abstract: We consider the parametric data model employed in applications such as line spectral estimation and direction-of-arrival estimation. We focus on the stochastic maximum likelihood estimation (MLE) framework and offer approaches to estimate the parameter of interest in a gridless manner, overcoming the model complexities of the past. This progress is enabled by the modern trend of reparameterization of the objective and exploiting the sparse Bayesian learning (SBL) approach. The latter is shown to be a correlation-aware method, and for the underlying problem it is identified as a grid-based technique for recovering a structured covariance matrix of the measurements. For the case when the structured matrix is expressible as a sampled Toeplitz matrix, such as when measurements are sampled in time or space at regular intervals, additional constraints and reparameterization of the SBL objective lead to the proposed structured matrix recovery technique based on MLE. The proposed optimization problem is non-convex, and we propose a majorization-minimization based iterative procedure to estimate the structured matrix; each iteration solves a semidefinite program. We recover the parameter of interest in a gridless manner by appealing to the Caratheodory-Fejer result on decomposition of PSD Toeplitz matrices. For the general case of irregularly spaced time or spatial samples, we propose an iterative SBL procedure that refines grid points to increase resolution near potential source locations, while maintaining a low per iteration complexity.
We provide numerical results to evaluate and compare the performance of the proposed techniques with other gridless techniques, and the CRB. The proposed correlation-aware approach is more robust to environmental/system effects such as low number of snapshots, correlated sources, small separation between source locations and improves source identifiability.

Submitted 6 October, 2022; originally announced October 2022.

Comments: Submitted to the IEEE Transactions on Signal Processing (Previous submission date: 29-Oct-2021)

arXiv:2209.06558 [pdf, other]

doi 10.1039/D3SC02453A

Predicting molecular vibronic spectra using time-domain analog quantum simulation

Authors: Ryan J. MacDonell, Tomas Navickas, Tim F. Wohlers-Reichel, Christophe H. Valahu, Arjun D. Rao, Maverick J. Millican, Michael A. Currington, Michael J. Biercuk, Ting Rei Tan, Cornelius Hempel, Ivan Kassal

Abstract: Spectroscopy is one of the most accurate probes of the molecular world. However, predicting molecular spectra accurately is computationally difficult because of the presence of entanglement between electronic and nuclear degrees of freedom. Although quantum computers promise to reduce this computational cost, existing quantum approaches rely on combining signals from individual eigenstates, an approach that is difficult to scale because the number of eigenstates grows exponentially with molecule size. Here, we introduce a method for scalable analog quantum simulation of molecular spectroscopy, by performing simulations in the time domain. Our approach can treat more complicated molecular models than previous ones, requires fewer approximations, and can be extended to open quantum systems with minimal overhead. We present a direct mapping of the underlying problem of time-domain simulation of molecular spectra to the degrees of freedom and control fields available in a trapped-ion quantum simulator. We experimentally demonstrate our algorithm on a trapped-ion device, exploiting both intrinsic electronic and motional degrees of freedom, showing excellent quantitative agreement for a single-mode vibronic photoelectron spectrum of SO$_2$.

Submitted 10 August, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: 13 pages, 8 figures

arXiv:2209.01947 [pdf, other]

MO2: Model-Based Offline Options

Authors: Sasha Salter, Markus Wulfmeier, Dhruva Tirumala, Nicolas Heess, Martin Riedmiller, Raia Hadsell, Dushyant Rao

Abstract: The ability to discover useful behaviours from past experience and transfer them to new tasks is considered a core component of natural embodied intelligence. Inspired by neuroscience, discovering behaviours that switch at bottleneck states has long been sought after for inducing plans of minimum description length across tasks. Prior approaches have either only supported online, on-policy, bottleneck state discovery, limiting sample-efficiency, or discrete state-action domains, restricting applicability. To address this, we introduce Model-Based Offline Options (MO2), an offline hindsight framework supporting sample-efficient bottleneck option discovery over continuous state-action spaces. Once bottleneck options are learnt offline over source domains, they are transferred online to improve exploration and value estimation on the transfer domain. Our experiments show that on complex long-horizon continuous control tasks with sparse, delayed rewards, MO2's properties are essential and lead to performance exceeding recent option learning methods. Additional ablations further demonstrate the impact on option predictability and credit assignment.

Submitted 5 September, 2022; originally announced September 2022.

Comments: Accepted at 1st Conference on Lifelong Learning Agents (CoLLAs) Conference Track, 2022

arXiv:2208.05552 [pdf, other]

Towards Automating Retinoscopy for Refractive Error Diagnosis

Authors: Aditya Aggarwal, Siddhartha Gairola, Uddeshya Upadhyay, Akshay P Vasishta, Diwakar Rao, Aditya Goyal, Kaushik Murali, Nipun Kwatra, Mohit Jain

Abstract: Refractive error is the most common eye disorder and is the key cause behind correctable visual impairment, responsible for nearly 80% of the visual impairment in the US. Refractive error can be diagnosed using multiple methods, including subjective refraction, retinoscopy, and autorefractors. Although subjective refraction is the gold standard, it requires cooperation from the patient and hence is not suitable for infants, young children, and developmentally delayed adults. Retinoscopy is an objective refraction method that does not require any input from the patient. However, retinoscopy requires a lens kit and a trained examiner, which limits its use for mass screening. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos with the patient wearing a custom pair of paper frames. We develop a video processing pipeline that takes retinoscopic videos as input and estimates the net refractive error based on our proposed extension of the retinoscopy mathematical model. Our system alleviates the need for a lens kit and can be performed by an untrained examiner. In a clinical trial with 185 eyes, we achieved a sensitivity of 91.0% and specificity of 74.0% on refractive error diagnosis. Moreover, the mean absolute error of our approach was 0.75$\pm$0.67D on net refractive error estimation compared to subjective refraction measurements. Our results indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.

Submitted 10 August, 2022; originally announced August 2022.

Comments: This paper is accepted for publication in IMWUT 2022

arXiv:2204.05893 [pdf, other]

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

Authors: Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

Abstract: Robots will experience non-stationary environment dynamics throughout their lifetime: the robot dynamics can change due to wear and tear, or its surroundings may change over time. Eventually, the robot should perform well in all of the environment variations it has encountered. At the same time, it should still be able to learn fast in a new environment. We identify two challenges in Reinforcement Learning (RL) under such a lifelong learning setting with off-policy data: first, existing off-policy algorithms struggle with the trade-off between being conservative to maintain good performance in the old environment and learning efficiently in the new environment, despite keeping all the data in the replay buffer. We propose the Offline Distillation Pipeline to break this trade-off by separating the training procedure into an online interaction phase and an offline distillation phase. Second, we find that training with the imbalanced off-policy data from multiple environments across the lifetime creates a significant performance drop. We identify that this performance drop is caused by the combination of the imbalanced quality and size among the datasets, which exacerbates the extrapolation error of the Q-function. During the distillation phase, we apply a simple fix to the issue by keeping the policy closer to the behavior policy that generated the data. In the experiments, we demonstrate these two challenges and the proposed solutions with a simulated bipedal robot walking task across various environment changes. We show that the Offline Distillation Pipeline achieves better performance across all the encountered environments without affecting data collection. We also provide a comprehensive empirical study to support our hypothesis on the data imbalance issue.

Submitted 18 August, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Published at 1st Conference on Lifelong Learning Agents, 2022

arXiv:2204.02799 [pdf]

doi 10.1002/aelm.202200975

Scandium Nitride as a Gateway III-Nitride Semiconductor for Optoelectronic Artificial Synaptic Devices

Authors: Dheemahi Rao, Bivas Saha

Abstract: Traditional computation based on von Neumann architecture is limited by the time and energy consumption due to data transfer between the storage and the processing units. The von Neumann architecture is also inefficient in solving unstructured, probabilistic, and real-time problems. To address these challenges, a new brain-inspired neuromorphic computational architecture is required. Due to absence of resistance-capacitance (RC) delay, high bandwidth and low power consumption, optoelectronic artificial synaptic devices are highly attractive. Yet stable, scalable, and complementary-metal-oxide-semiconductor (CMOS)-compatible synapses have not been demonstrated. In this work, persistence in the photoconductivity of undoped and magnesium-doped scandium nitride (ScN) is equated to the inhibitory and excitatory synaptic plasticity of the biological synapses responsible for memory and learning. Primary functionalities of a biological synapse like short-term memory (STM), long-term memory (LTM), the transition from STM-to-LTM, learning and forgetting, frequency-selective optical filtering, frequency-dependent potentiation and depression, Hebbian learning, and logic gate operations are demonstrated.

Submitted 6 April, 2022; originally announced April 2022.

Comments: 14 pages, 5 figures. It is currently under review

Journal ref: Adv. Electron. Mater. 2022, 2200975

arXiv:2201.11889 [pdf]

doi 10.1073/pnas.1920036117

Squeezed metallic droplet with tunable Kubo gap and charge injection in transition metal dichalcogenides

Authors: Jiaren Yuan, Yuanping Chen, Yuee Xie, Xiaoyu Zhang, Dewei Rao, Yandong Guo, Xiaohong Yan, Yuanping Feng, Yongqing Cai

Abstract: Shrinking the size of a bulk metal into nanoscale leads to the discreteness of electronic energy levels, the so-called Kubo gap. Renormalization of the electronic properties with a tunable and size-dependent Kubo gap renders fascinating photon emission and electron tunneling. In contrast with usual three-dimensional (3D) metal clusters, here we demonstrate that a Kubo gap can be achieved with a two-dimensional (2D) metallic transition metal dichalcogenide (i.e., 1T'-phase MoTe2) nanocluster embedded in a semiconducting polymorph (i.e., 1H-phase MoTe2). Such a 1T'-1H MoTe2 nanodomain resembles a 3D metallic droplet squeezed in a 2D space, which shows a strong polarization catastrophe while simultaneously maintaining its bond integrity, which is absent in traditional delta-gapped 3D clusters. The weak screening of the host 2D MoTe2 leads to photon emission of such pseudo-metallic systems and a ballistic injection of carriers in the 1T'-1H-1T' homojunctions, which may find applications in sensors and 2D reconfigurable devices.

Submitted 27 January, 2022; originally announced January 2022.

Journal ref: Proc. Natl. Acad. Sci. U S A 117, 6362-6369 (2020)

arXiv:2201.10152 [pdf, other]

Unsupervised Image Fusion Method based on Feature Mutual Mapping

Authors: Dongyu Rao, Xiao-Jun Wu, Tianyang Xu, Guoyang Chen

Abstract: Deep learning-based image fusion approaches have obtained wide attention in recent years, achieving promising performance in terms of visual perception. However, the fusion module in the current deep learning-based methods suffers from two limitations, i.e., manually designed fusion function, and input-independent network learning. In this paper, we propose an unsupervised adaptive image fusion method to address the above issues. We propose a feature mutual mapping fusion module and dual-branch multi-scale autoencoder. More specifically, we construct a global map to measure the connections of pixels between the input source images. The found mapping relationship guides the image fusion. Besides, we design a dual-branch multi-scale network through sampling transformation to extract discriminative image features. We further enrich feature representations of different scales through feature aggregation in the decoding process. Finally, we propose a modified loss function to train the network with efficient convergence property. Through sufficient training on infrared and visible image data sets, our method also shows excellent generalized performance in multi-focus and medical image fusion. Our method achieves superior performance in both visual perception and objective evaluation. Experiments prove that the performance of our proposed method on a variety of image fusion tasks surpasses other state-of-the-art methods, proving the effectiveness and versatility of our approach.

Submitted 29 January, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

arXiv:2201.10147 [pdf, other]

TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network

Authors: Dongyu Rao, Xiao-Jun Wu, Tianyang Xu

Abstract: The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating the multi-modal local appearance. However, long-range dependencies are directly neglected in existing CNN fusion approaches, impeding balancing the entire image-level perception for complex scenario fusion. In this paper, therefore, we propose an infrared and visible image fusion algorithm based on a lightweight transformer module and adversarial learning. Inspired by the global interaction power, we use the transformer technique to learn the effective global fusion relations. In particular, shallow features extracted by CNN are interacted in the proposed transformer fusion module to refine the fusion relationship within the spatial scope and across channels simultaneously. Besides, adversarial learning is designed in the training process to improve the output discrimination via imposing competitive consistency from the inputs, reflecting the specific characteristics in infrared and visible images. The experimental performance demonstrates the effectiveness of the proposed modules, with superior improvement against the state-of-the-art, generalising a novel paradigm via transformer and adversarial learning in the fusion task.

Submitted 3 February, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

arXiv:2112.05062 [pdf, other]

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

Authors: Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell

Abstract: For robots operating in the real world, it is desirable to learn reusable behaviours that can effectively be transferred and adapted to numerous tasks and scenarios. We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model. In contrast to existing work, our method exploits a three-level hierarchy of both discrete and continuous latent variables, to capture a set of high-level behaviours while allowing for variance in how they are executed. We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model. The resulting skills can be transferred and fine-tuned on new tasks, unseen objects, and from state to vision-based policies, yielding better sample efficiency and asymptotic performance compared to existing skill- and imitation-based methods. We further analyse how and when the skills are most beneficial: they encourage directed exploration to cover large regions of the state space relevant to the task, making them most effective in challenging sparse-reward settings.

Submitted 14 March, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

arXiv:2112.01207 [pdf]

Photonic and electronic state interactions in BaTiO3 based Optical Microcavity

Authors: Jitendra Nath Acharyya, R. B. Gangineni, D. Narayana Rao, G. Vijaya Prakash

Abstract: The photonic-mode-mediated absorption dynamics at femtosecond time scales, along with the control of spontaneous emission tunability, are investigated in an all-dielectric optical microcavity having BaTiO3 (BTO) as the defect layer. The cavity-enhanced transient absorption reveals the dominant excited state absorption (ESA) of both photonic and electronic modes due to strong third-order optical nonlinearity influence. Photoluminescence of BTO is found to be guided and tuned by the photonic cavity mode. Such active photonic structures can be envisaged as a potential candidate in nonlinear optics and photonic device applications.

Submitted 2 December, 2021; originally announced December 2021.

Comments: 7 pages, 4 figures, iNEEBA 2021 Conference, Paper ID: OP-ESC-07

arXiv:2111.08952 [pdf, other]

doi 10.1109/IEEECONF44664.2019.9048906

A Generalized Proportionate-Type Normalized Subband Adaptive Filter

Authors: Kuan-Lin Chen, Ching-Hua Lee, Bhaskar D. Rao, Harinath Garudadri

Abstract: We show that a new design criterion, i.e., the least squares on subband errors regularized by a weighted norm, can be used to generalize the proportionate-type normalized subband adaptive filtering (PtNSAF) framework. The new criterion directly penalizes subband errors and includes a sparsity penalty term which is minimized using the damped regularized Newton's method. The impact of the proposed generalized PtNSAF (GPtNSAF) is studied for the system identification problem via computer simulations. Specifically, we study the effects of using different numbers of subbands and various sparsity penalty terms for quasi-sparse, sparse, and dispersive systems. The results show that the benefit of increasing the number of subbands is larger than promoting sparsity of the estimated filter coefficients when the target system is quasi-sparse or dispersive. On the other hand, for sparse target systems, promoting sparsity becomes more important. More importantly, the two aspects provide complementary and additive benefits to the GPtNSAF for speeding up convergence.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 5 pages. Presented at Asilomar Conference on Signals, Systems, and Computers (ACSSC) 2019

arXiv:2111.05496 [pdf, other]

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Authors: Kuan-Lin Chen, Ching-Hua Lee, Harinath Garudadri, Bhaskar D. Rao

Abstract: Models recently used in the literature proving residual networks (ResNets) are better than linear predictors are actually different from standard ResNets that have been widely used in computer vision. In addition to the assumptions such as scalar-valued output or single residual block, these models have no nonlinearities at the final residual representation that feeds into the final affine layer. To codify such a difference in nonlinearities and reveal a linear estimation property, we define ResNEsts, i.e., Residual Nonlinear Estimators, by simply dropping nonlinearities at the last residual representation from standard ResNets. We show that wide ResNEsts with bottleneck blocks can always guarantee a very desirable training property that standard ResNets aim to achieve, i.e., adding more blocks does not decrease performance given the same set of basis elements. To prove that, we first recognize ResNEsts are basis function models that are limited by a coupling problem in basis learning and linear prediction. Then, to decouple prediction weights from basis learning, we construct a special architecture termed augmented ResNEst (A-ResNEst) that always guarantees no worse performance with the addition of a block. As a result, such an A-ResNEst establishes empirical risk lower bounds for a ResNEst using corresponding bases. Our results demonstrate ResNEsts indeed have a problem of diminishing feature reuse; however, it can be avoided by sufficiently expanding or widening the input space, leading to the above-mentioned desirable property. Inspired by the DenseNets that have been shown to outperform ResNets, we also propose a corresponding new model called Densely connected Nonlinear Estimator (DenseNEst). We show that any DenseNEst can be represented as a wide ResNEst with bottleneck blocks. Unlike ResNEsts, DenseNEsts exhibit the desirable property without any special architectural re-design.

Submitted 15 January, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

Comments: 24 pages. Accepted by NeurIPS 2021. Remark 1 clarified and typos corrected

arXiv:2110.04008 [pdf, ps, other]

Synthesis and study of ScN thin films

Authors: Susmita Chowdhury, Rachana Gupta, Parasmani Rajput, Akhil Tayal, Dheemahi Rao, Reddy Sekhar, Shashi Prakash, Ramaseshan Rajagopalan, S. N. Jha, Bivas Saha, Mukul Gupta

Abstract: To contemplate an alternative approach for the minimization of diffusion at high temperature depositions, present findings impart viability of room-temperature deposited reactively sputtered ScN thin film samples. The adopted room temperature route endows precise control over the $R_{N_2}$ flow for a methodical structural phase evolution from Sc$\to$ScN and probe the correlated physical aspects of the highly textured ScN samples. In the nitrided regime, i.e., at $R_{N_2}$ = 2.5-100% flow, incorporation of unintentional oxygen defects was evidenced from surface sensitive soft x-ray absorption spectroscopy study, though less compared to their metal ($R_{N_2} = 0\%$) and interstitial ($R_{N_2} = 1.6\%$) counterparts, due to higher Gibbs free energy for Sc-O-N formation with no trace of ligand field splitting around the O K-edge spectra. To eradicate the scepticism of appearance of N K-edge (401.6 eV) and Sc L-edge (402.2 eV) absorption spectra adjacent to each other, the nascent Sc K-edge study has been adopted for the first time to validate complementary insight on the metrical parameters of the Sc-N system taken into consideration. Optical bandgaps of the polycrystalline ScN thin film samples were found to vary between 2.25-2.62 eV as obtained from the UV-Vis spectroscopy, whereas the nano-indentation hardness and modulus of the as-deposited samples lie between 15-34 GPa and 152-476 GPa, respectively, following a linearly increasing trend of resistance to plastic deformations. Besides, contrary to other early 3d transition metal nitrides (TiN, VN, CrN), a comprehensive comparison of noticeably large homogeneity range in Sc-N has been outlined to apprehend the minuscule lattice expansion over the large $R_{N_2}$ realm.

Submitted 8 October, 2021; originally announced October 2021.

Comments: 13 pages, 6 figures

arXiv:2108.06113 [pdf, other]

doi 10.1117/1.JEI.30.5.053013

UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Authors: D. Y. Rao, X. J. Wu, H. Li, J. Kittler, T. Y. Xu

Abstract: In this paper, we propose a photorealistic style transfer network to emphasize the natural effect of photorealistic image stylization. In general, distortion of the image content and lack of details are two typical issues in the style transfer field. To this end, we design a novel framework employing the U-Net structure to maintain the rich spatial clues, with a multi-layer feature aggregation (MFA) method to simultaneously provide the details obtained by the shallow layers in the stylization processing. In particular, an encoder based on the dense block and a decoder forming a symmetrical U-Net structure are jointly stacked to realize an effective feature extraction and image reconstruction. Besides, a transfer module based on MFA and "adaptive instance normalization" (AdaIN) is inserted in the skip connection positions to achieve the stylization. Accordingly, the stylized image possesses the texture of a real photo and preserves rich content details without introducing any mask or post-processing steps. The experimental results on public datasets demonstrate that our method achieves a more faithful structural similarity with a lower style loss, reflecting the effectiveness and merit of our approach.

Submitted 13 August, 2021; originally announced August 2021.

arXiv:2106.14690 [pdf, ps, other]

doi 10.1016/j.ijmultiphaseflow.2021.103925

Laser pulse-droplet interaction enables the deformation and fragmentation of droplet array

Authors: D. Chaitanya Kumar Rao, Awanish Pratap Singh, Saptarshi Basu

Abstract: Droplet-droplet interactions are ubiquitous in various applications ranging from medical diagnostics to enhancing and optimizing liquid jet propulsion. We employ an experimental technique where the laser pulse interacts with a micron-sized droplet and causes optical breakdown. The synergy of a nanosecond laser pulse and an isolated spherical droplet is accurately controlled and manipulated to influence the deformation and fragmentation of an array of droplets. We elucidate how the fluid dynamic response (such as drop-drop and shock-drop interactions) of an arrangement of droplets can be regulated and optimally shaped by laser pulse energy and its interplay with the optical density of the liquid target. A new butterfly-type breakup is revealed, which is found to result in controlled and efficient fragmentation of the outer droplets in an array. The spatio-temporal characteristics of a laser-induced breakdown dictate how shock wave and central droplet fragments can influence outer droplets. The incident laser energy and pulse width employed in this work are representative of diverse industrial applications such as surface cleaning, nano-lithography, microelectronics, and medical procedures such as intraocular microsurgery.

Submitted 28 June, 2021; originally announced June 2021.

Journal ref: International Journal of Multiphase Flow, 148, 103925 (2022)

arXiv:2106.14647 [pdf]

Zero-shot learning approach to adaptive Cybersecurity using Explainable AI

Authors: Dattaraj Rao, Shraddha Mane

Abstract: Cybersecurity is a domain where there is constant change in patterns of attack, and we need ways to make our Cybersecurity systems more adaptive to handle new attacks and categorize for appropriate action. We present a novel approach to handle the alarm flooding problem faced by Cybersecurity systems like security information and event management (SIEM) and intrusion detection (IDS). We apply a zero-shot learning method to machine learning (ML) by leveraging explanations for predictions of anomalies generated by an ML model. This approach has huge potential to auto detect alarm labels generated in SIEM and associate them with specific attack types. In this approach, without any prior knowledge of attack, we try to identify it, decipher the features that contribute to classification and try to bucketize the attack in a specific category - using explainable AI. Explanations give us measurable factors as to what features influence the prediction of a cyber-attack and to what degree. These explanations generated based on game-theory are used to allocate credit to specific features based on their influence on a specific prediction. Using this allocation of credit, we propose a novel zero-shot approach to categorize novel attacks into specific new classes based on feature influence. The resulting system demonstrated will get good at separating attack traffic from normal flow and auto-generate a label for attacks based on features that contribute to the attack. These auto-generated labels can be presented to a SIEM analyst and are intuitive enough to figure out the nature of attack. We apply this approach to a network flow dataset and demonstrate results for specific attack types like ip sweep, denial of service, remote to local, etc. Paper was presented at the first Conference on Deployable AI at IIT-Madras in June 2021.

Submitted 21 June, 2021; originally announced June 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.07110

arXiv:2106.12772 [pdf, other]

Task-agnostic Continual Learning with Hybrid Probabilistic Models

Authors: Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

Abstract: Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting, all leveraging the invertibility and exact likelihood which are uniquely enabled by the normalizing flow model. We use the generative capabilities of the flow to avoid catastrophic forgetting through generative replay and a novel functional regularization technique. For task identification, we use state-of-the-art anomaly detection techniques based on measuring the typicality of the model's statistics. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.

Submitted 24 June, 2021; originally announced June 2021.

arXiv:2103.07110 [pdf]

Explaining Network Intrusion Detection System Using Explainable AI Framework

Authors: Shraddha Mane, Dattaraj Rao

Abstract: Cybersecurity is a domain where the data distribution is constantly changing with attackers exploring newer patterns to attack cyber infrastructure. Intrusion detection system is one of the important layers in cyber safety in today's world. Machine learning based network intrusion detection systems started showing effective results in recent years. With deep learning models, detection rates of network intrusion detection system are improved. More accurate the model, more the complexity and hence less the interpretability. Deep neural networks are complex and hard to interpret which makes difficult to use them in production as reasons behind their decisions are unknown. In this paper, we have used deep neural network for network intrusion detection and also proposed explainable AI framework to add transparency at every stage of machine learning pipeline. This is done by leveraging Explainable AI algorithms which focus on making ML models less of black boxes by providing explanations as to why a prediction is made. Explanations give us measurable factors as to what features influence the prediction of a cyberattack and to what degree. These explanations are generated from SHAP, LIME, Contrastive Explanations Method, ProtoDash and Boolean Decision Rules via Column Generation. We apply these approaches to NSL KDD dataset for intrusion detection system and demonstrate results.

Submitted 12 March, 2021; originally announced March 2021.

arXiv:2102.08740 [pdf]

doi 10.1002/admi.202100757

Ultrafast nonlinear pulse propagation dynamics in metal-dielectric periodic photonic architectures

Authors: Jitendra Nath Acharyya, Akhilesh Kumar Mishra, D. Narayana Rao, Ajit Kumar, G. Vijaya Prakash

Abstract: One-dimensional (1D) metal-dielectric (MD) periodic structures take advantage of large refractive index contrast between metal and dielectrics to invoke extremely high nonlinear ultrafast responses of metal. These structures are also special due to their extremely high laser damage threshold. The Bragg like 1D MD structure (Ag/SiO2)4 enables strong optical field confinement with much enhanced nonlinear features as compared to simple metal or single (Ag/SiO2)1 structure. In the present work, the ultrafast nonlinear optical responses of the above structures are investigated via femtosecond broadband optical pump-probe technique. The enhanced nonlinear optical absorption is of reverse saturation of absorption (RSA) nature, resulted due to free-carrier absorption (FCA) and excited-state absorption (ESA) processes. The spectral nonlinearities are closely related to the pump-induced modification of the metal's dielectric functions, which are qualitatively visualized by transfer matrix and two-temperature models. The ultrafast temporal evolution of nonlinear absorption clearly demonstrated enhanced optical nonlinearity, disentangled by the electron-electron and electron-phonon dynamic interactions at picosecond time scales. A phenomenological pulse propagation model is employed that incorporates the experimentally obtained nonlinear absorption coefficients and different nonlinear effects exhibited by the system. Nonlinearity plays a crucial role in controlling the ultrafast pulse propagation and could open a new window for many nonlinear device applications. The findings of these new optical materials could possibly pave the way for promising applications in ultrafast photonics.

Submitted 15 September, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

Comments: (23 main text + 9 supplementary) pages, (7 main text + 8 supplementary) figures

Journal ref: Advanced Materials Interfaces 2021

arXiv:2102.08515 [pdf, ps, other]

A Novel Bayesian Approach for the Two-Dimensional Harmonic Retrieval Problem

Authors: Rohan R. Pote, Bhaskar D. Rao

Abstract: Sparse signal recovery algorithms like sparse Bayesian learning work well but the complexity quickly grows when tackling higher dimensional parametric dictionaries. In this work we propose a novel Bayesian strategy to address the two dimensional harmonic retrieval problem, through remodeling and reparameterization of the standard data model. This new model allows us to introduce a block sparsity structure in a manner that enables a natural pairing of the parameters in the two dimensions. The numerical simulations demonstrate that the inference algorithm developed (H-MSBL) does not suffer from source identifiability issues and is capable of estimating the harmonic components in challenging scenarios, while maintaining a low computational complexity.

Submitted 16 February, 2021; originally announced February 2021.

Comments: To appear in the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Showing 1–50 of 186 results for author: Rao, D