-
Capture Point Control in Thruster-Assisted Bipedal Locomotion
Authors:
Shreyansh Pitroda,
Aditya Bondada,
Kaushik Venkatesh Krishnamurthy,
Adarsh Salagame,
Chenghao Wang,
Taoran Liu,
Bibek Gupta,
Eric Sihite,
Reza Nemovi,
Alireza Ramezani,
Morteza Gharib
Abstract:
Despite major advancements in control design that are robust to unplanned disturbances, bipedal robots are still susceptible to falling over and struggle to negotiate rough terrains. By utilizing thrusters in our bipedal robot, we can perform additional posture manipulation and expand the modes of locomotion to enhance the robot's stability and ability to negotiate rough and difficult-to-navigate terrains. In this paper, we present our efforts in designing a controller based on capture point control for our thruster-assisted walking model named Harpy and explore its control design possibilities. While capture point control based on centroidal models for bipedal systems has been extensively studied, the incorporation of external forces that can influence the dynamics of linear inverted pendulum models, often used in capture point-based works, has not been explored before. The inclusion of these external forces can lead to interesting interpretations of locomotion, such as virtual buoyancy studied in aquatic-legged locomotion. This paper outlines the dynamical model of our robot, the capture point method we use to assist the upper body stabilization, and the simulation work done to show the controller's feasibility.
Submitted 20 June, 2024;
originally announced June 2024.
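As a rough illustration of the capture point idea named in the abstract above, the sketch below computes the instantaneous capture point of a one-dimensional linear inverted pendulum, x + ẋ/ω with ω = sqrt(g/z0). The `thrust_accel` parameter is a hypothetical addition and not taken from the paper: it assumes a constant upward thruster force (divided by mass) that lowers the effective gravity seen by the pendulum. The function name and all parameters are illustrative assumptions, not the authors' controller.

```python
import math


def capture_point(x, xdot, z0, g=9.81, thrust_accel=0.0):
    """Instantaneous capture point of a 1-D linear inverted pendulum.

    x, xdot      : horizontal centre-of-mass position [m] and velocity [m/s]
    z0           : constant centre-of-mass height [m]
    thrust_accel : ASSUMPTION, constant upward thruster force / mass [m/s^2]
    """
    # Assumed effect of a constant vertical thrust: reduced effective gravity.
    g_eff = g - thrust_accel
    if g_eff <= 0.0 or z0 <= 0.0:
        raise ValueError("need g > thrust_accel and z0 > 0 for a pendulum model")
    omega = math.sqrt(g_eff / z0)  # natural frequency of the pendulum
    return x + xdot / omega        # point to step on to come to rest


if __name__ == "__main__":
    print(capture_point(0.0, 1.0, 0.981))
    print(capture_point(0.0, 1.0, 0.981, thrust_accel=5.0))
```

Under this assumption, a larger upward thrust lowers ω, so the capture point moves farther from the centre of mass for the same velocity.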
-
Thruster-Assisted Incline Walking
Authors:
Kaushik Venkatesh Krishnamurthy,
Chenghao Wang,
Shreyansh Pitroda,
Adarsh Salagame,
Eric Sihite,
Reza Nemovi,
Alireza Ramezani,
Morteza Gharib
Abstract:
In this study, our aim is to evaluate the effectiveness of thruster-assisted steep slope walking for the Husky Carbon, a quadrupedal robot equipped with custom-designed actuators and multiple electric ducted fans, through simulation prior to conducting experimental trials. Thruster-assisted steep slope walking draws inspiration from wing-assisted incline running (WAIR) observed in birds, and intriguingly incorporates posture manipulation and thrust vectoring, a locomotion technique not previously explored in the animal kingdom. Our approach involves developing a reduced-order model of the Husky robot, followed by the application of an optimization-based controller utilizing collocation methods and dynamics interpolation to determine control actions. Through simulation testing, we demonstrate the feasibility of hardware implementation of our controller.
Submitted 18 June, 2024;
originally announced June 2024.
-
Identifying Hate Speech Peddlers in Online Platforms. A Bayesian Social Learning Approach for Large Language Model Driven Decision-Makers
Authors:
Adit Jain,
Vikram Krishnamurthy
Abstract:
This paper studies the problem of autonomous agents performing Bayesian social learning for sequential detection when the observations of the state belong to a high-dimensional space and are expensive to analyze. Specifically, when the observations are textual, the Bayesian agent can use a large language model (LLM) as a map to get a low-dimensional private observation. The agent performs Bayesian learning and takes an action that minimizes the expected cost and is visible to subsequent agents. We prove that a sequence of such Bayesian agents herd in finite time to the public belief and take the same action disregarding the private observations. We propose a stopping time formulation for quickest time herding in social learning and optimally balance privacy and herding. Structural results are shown on the threshold nature of the optimal policy to the stopping time problem. We illustrate the application of our framework when autonomous Bayesian detectors aim to sequentially identify if a user is a hate speech peddler on an online platform by parsing text observations using an LLM. We numerically validate our results on real-world hate speech datasets. We show that autonomous Bayesian agents designed to flag hate speech peddlers in online platforms herd and misclassify the users when the public prior is strong. We also numerically show the effect of a threshold policy in delaying herding.
Submitted 12 May, 2024;
originally announced May 2024.
-
Structured Reinforcement Learning for Incentivized Stochastic Covert Optimization
Authors:
Adit Jain,
Vikram Krishnamurthy
Abstract:
This paper studies how a stochastic gradient algorithm (SG) can be controlled to hide the estimate of the local stationary point from an eavesdropper. Such problems are of significant interest in distributed optimization settings like federated learning and inventory management. A learner queries a stochastic oracle and incentivizes the oracle to obtain noisy gradient measurements and perform SG. The oracle probabilistically returns either a noisy gradient of the function or a non-informative measurement, depending on the oracle state and incentive. The learner's query and incentive are visible to an eavesdropper who wishes to estimate the stationary point. This paper formulates the problem of the learner performing covert optimization by dynamically incentivizing the stochastic oracle and obfuscating the eavesdropper as a finite-horizon Markov decision process (MDP). Using conditions for interval-dominance on the cost and transition probability structure, we show that the optimal policy for the MDP has a monotone threshold structure. We propose searching for the optimal stationary policy with the threshold structure using a stochastic approximation algorithm and a multi-armed bandit approach. The effectiveness of our methods is numerically demonstrated on a covert federated learning hate-speech classification task.
Submitted 12 May, 2024;
originally announced May 2024.
-
Narrow-Path, Dynamic Walking Using Integrated Posture Manipulation and Thrust Vectoring
Authors:
Kaushik Venkatesh Krishnamurthy,
Chenghao Wang,
Shreyansh Pitroda,
Adarsh Salagame,
Eric Sihite,
Reza Nemovi,
Alireza Ramezani,
Morteza Gharib
Abstract:
This research concentrates on enhancing the navigational capabilities of Northeastern University's Husky, a multi-modal quadrupedal robot that can integrate posture manipulation and thrust vectoring, to traverse narrow pathways such as walking over pipes and slacklining. The Husky is outfitted with thrusters designed to stabilize its body during dynamic walking over these narrow paths. The project involves modeling the robot using the HROM (Husky Reduced Order Model) and developing an optimal control framework. This framework is based on polynomial approximation of the HROM and a collocation approach to derive optimal thruster commands necessary for achieving dynamic walking on narrow paths. The effectiveness of the modeling and control design approach is validated through simulations conducted using MATLAB.
Submitted 9 May, 2024;
originally announced May 2024.
-
Adaptive Mechanism Design using Multi-Agent Revealed Preferences
Authors:
Luke Snow,
Vikram Krishnamurthy
Abstract:
This paper constructs an algorithmic framework for adaptively achieving the mechanism design objective, finding a mechanism inducing socially optimal Nash equilibria, without knowledge of the utility functions of the agents. We consider a probing scheme where the designer can iteratively enact mechanisms and observe Nash equilibria responses. We first derive necessary and sufficient conditions, taking the form of linear program feasibility, for the existence of utility functions under which the empirical Nash equilibria responses are socially optimal. Then, we utilize this to construct a loss function with respect to the mechanism, and show that its global minimization occurs at mechanisms under which Nash equilibria system responses are also socially optimal. We develop a simulated annealing-based gradient algorithm, and prove that it converges in probability to this set of global minima, thus achieving adaptive mechanism design.
Submitted 23 April, 2024;
originally announced April 2024.
-
Eclipse Attack Detection on a Blockchain Network as a Non-Parametric Change Detection Problem
Authors:
Anurag Gupta,
Vikram Krishnamurthy,
Brian M. Sadler
Abstract:
This paper introduces a novel non-parametric change detection algorithm to identify eclipse attacks on a blockchain network; the non-parametric algorithm relies only on the empirical mean and variance of the dataset, making it highly adaptable. An eclipse attack occurs when malicious actors isolate blockchain users, disrupting their ability to reach consensus with the broader network, thereby distorting their local copy of the ledger. To detect an eclipse attack, we monitor changes in the Fréchet mean and variance of the evolving blockchain communication network connecting blockchain users. First, we leverage the Johnson-Lindenstrauss lemma to project large-dimensional networks into a lower-dimensional space, preserving essential statistical properties. Subsequently, we employ a non-parametric change detection procedure, leading to a test statistic that converges weakly to a Brownian bridge process in the absence of an eclipse attack. This enables us to quantify the false alarm rate of the detector. Our detector can be implemented as a smart contract on the blockchain, offering a tamper-proof and reliable solution. Finally, we use numerical examples to compare the proposed eclipse attack detector with a detector based on the random forest model.
Submitted 30 May, 2024; v1 submitted 30 March, 2024;
originally announced April 2024.
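The Johnson-Lindenstrauss projection step mentioned in the abstract above can be illustrated with a generic Gaussian random projection. This is a minimal sketch of the standard technique only, not the authors' detector: the function names `jl_project` and `dist`, the dimensions, and the seed are all assumptions made for illustration.

```python
import math
import random


def jl_project(vectors, k, seed=0):
    """Project d-dimensional vectors to k dimensions with a Gaussian matrix.

    Entries are drawn i.i.d. from N(0, 1/k), so squared distances are
    preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(vectors[0])
    scale = 1.0 / math.sqrt(k)
    # k x d random projection matrix, shared by every input vector.
    proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(r * x for r, x in zip(row, v)) for row in proj] for v in vectors]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    data_rng = random.Random(1)
    u = [data_rng.random() for _ in range(300)]
    v = [data_rng.random() for _ in range(300)]
    pu, pv = jl_project([u, v], k=200)
    print(dist(pu, pv) / dist(u, v))  # ratio close to 1 with high probability
```

The ratio of projected to original distance concentrates around 1 as `k` grows, which is why summary statistics computed after projection can stand in for those of the original high-dimensional data.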
-
Medial Parametrization of Arbitrary Planar Compact Domains with Dipoles
Authors:
Vinayak Krishnamurthy,
Ergun Akleman
Abstract:
We present medial parametrization, a new approach to parameterizing any compact planar domain bounded by simple closed curves. The basic premise behind our proposed approach is to use two close Voronoi sites, which we call dipoles, to construct and reconstruct an approximate piecewise-linear version of the original boundary and medial axis through Voronoi tessellation. The boundaries and medial axes of such planar compact domains offer a natural way to describe the domain's interior. Any compact planar domain that is homeomorphic to a compact unit circular disk admits a natural parameterization isomorphic to the polar parametrization of the disk. Specifically, the medial axis and the boundary generalize the radial and angular parameters, respectively. In this paper, we present a simple algorithm that puts these principles into practice. The algorithm is based on the simultaneous re-creation of the boundaries of the domain and its medial axis using Voronoi tessellation. This simultaneous re-creation provides partitions of the domain into a set of "skinny" convex polygons wherein each polygon is essentially a subset of the medial edges (which we call the spine) connected to the boundary through exactly two straight edges (which we call limbs). This unique structure enables us to convert the original Voronoi tessellation into quadrilaterals and triangles (at the poles of the medial axis) neatly ordered along the domain boundary, thereby allowing proper parametrization of the domain. Our approach is agnostic to the number of holes and disconnected components bounding the domain. We investigate the efficacy of our concept and algorithm through several examples.
Submitted 7 March, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
FLASH: Federated Learning Across Simultaneous Heterogeneities
Authors:
Xiangyu Chang,
Sk Miraj Ahmed,
Srikanth V. Krishnamurthy,
Basak Guler,
Ananthram Swami,
Samet Oymak,
Amit K. Roy-Chowdhury
Abstract:
The key premise of federated learning (FL) is to train ML models across a diverse set of data-owners (clients), without exchanging local data. An overarching challenge to this date is client heterogeneity, which may arise not only from variations in data distribution, but also in data quality, as well as compute/communication latency. An integrated view of these diverse and concurrent sources of heterogeneity is critical; for instance, low-latency clients may have poor data quality, and vice versa. In this work, we propose FLASH (Federated Learning Across Simultaneous Heterogeneities), a lightweight and flexible client selection algorithm that outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity, by trading off the statistical information associated with the client's data quality, data distribution, and latency. FLASH is the first method, to our knowledge, for handling all these heterogeneities in a unified manner. To do so, FLASH models the learning dynamics through contextual multi-armed bandits (CMAB) and dynamically selects the most promising clients. Through extensive experiments, we demonstrate that FLASH achieves substantial and consistent improvements over state-of-the-art baselines -- as much as 10% in absolute accuracy -- thanks to its unified approach. Importantly, FLASH also outperforms federated aggregation methods that are designed to handle highly heterogeneous settings and even enjoys a performance boost when integrated with them.
Submitted 13 February, 2024;
originally announced February 2024.
-
Plug-and-Play Transformer Modules for Test-Time Adaptation
Authors:
Xiangyu Chang,
Sk Miraj Ahmed,
Srikanth V. Krishnamurthy,
Basak Guler,
Ananthram Swami,
Samet Oymak,
Amit K. Roy-Chowdhury
Abstract:
Parameter-efficient tuning (PET) methods such as LoRA, Adapter, and Visual Prompt Tuning (VPT) have found success in enabling adaptation to new domains by tuning small modules within a transformer model. However, the number of domains encountered during test time can be very large, and the data is usually unlabeled. Thus, adaptation to new domains is challenging; it is also impractical to generate customized tuned modules for each such domain. Toward addressing these challenges, this work introduces PLUTO: a Plug-and-pLay modUlar Test-time domain adaptatiOn strategy. We pre-train a large set of modules, each specialized for different source domains, effectively creating a "module store". Given a target domain with few-shot unlabeled data, we introduce an unsupervised test-time adaptation (TTA) method to (1) select a sparse subset of relevant modules from this store and (2) create a weighted combination of selected modules without tuning their weights. This plug-and-play nature enables us to harness multiple most-relevant source domains in a single inference call. Comprehensive evaluations demonstrate that PLUTO uniformly outperforms alternative TTA methods and that selecting $\leq$5 modules suffices to extract most of the benefit. At a high level, our method equips pre-trained transformers with the capability to dynamically adapt to new domains, motivating a new paradigm for efficient and scalable domain adaptation.
Submitted 8 February, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Towards dynamic Narrow path walking on NU's Husky
Authors:
Kaushik Venkatesh Krishnamurthy
Abstract:
This research focuses on enabling Northeastern University's Husky, a multi-modal quadrupedal robot, to navigate narrow paths akin to various animals in nature. The Husky is equipped with thrusters to stabilize its body during dynamic maneuvers, addressing challenges inherent in aerial-legged systems. The approach involves modeling the robot as HROM (Husky Reduced Model) and creating an optimal control framework using linearized dynamics for narrow path walking. The thesis introduces a gait scheduling method to generate an open-loop walking gait and validates these gaits through a high-fidelity Simscape simulation. Experimental results of the open-loop walking are presented, accompanied by potential directions for advancing this robotic system.
Submitted 19 December, 2023;
originally announced December 2023.
-
Curved Space-Filling Tiles Using Voronoi Decomposition with Line, and Curve Segments Closed Under Wallpaper Symmetries
Authors:
Haard Panchal,
Ergun Akleman,
Vinayak Krishnamurthy,
Tolga Talha Yildiz,
Varda Grover
Abstract:
In this paper, we present a new approach to obtain symmetric tiles with curved edges. Our approach is based on using higher-order Voronoi sites that are closed under wallpaper symmetries. The resulting Voronoi tessellations provide us with symmetric tiles with curved edges. We have developed a web application that provides real-time tile design. Our application can be found at https://voronoi.viz.tamu.edu. One of our key findings in this paper is that not all symmetry operations are useful for creating curved tiles. In particular, all symmetries that use mirror operations produce straight lines that are useless for creating new tiles. This result is interesting because it suggests that we need to avoid mirror transformations to produce unusual space-filling tiles in 2D and 3D using Voronoi tessellations.
Submitted 23 October, 2023;
originally announced October 2023.
-
Bayesian longitudinal tensor response regression for modeling neuroplasticity
Authors:
Suprateek Kundu,
Alec Reinhardt,
Serena Song,
Joo Han,
M. Lawson Meadows,
Bruce Crosson,
Venkatagiri Krishnamurthy
Abstract:
A major interest in longitudinal neuroimaging studies involves investigating voxel-level neuroplasticity due to treatment and other factors across visits. However, traditional voxel-wise methods are beset with several pitfalls, which can compromise the accuracy of these approaches. We propose a novel Bayesian tensor response regression approach for longitudinal imaging data, which pools information across spatially-distributed voxels to infer significant changes while adjusting for covariates. The proposed method, which is implemented using Markov chain Monte Carlo (MCMC) sampling, utilizes low-rank decomposition to reduce dimensionality and preserve spatial configurations of voxels when estimating coefficients. It also enables feature selection via joint credible regions which respect the shape of the posterior distributions for more accurate inference. In addition to group level inferences, the method is able to infer individual-level neuroplasticity, allowing for examination of personalized disease or recovery trajectories. The advantages of the proposed approach in terms of prediction and feature selection over voxel-wise regression are highlighted via extensive simulation studies. Subsequently, we apply the approach to a longitudinal Aphasia dataset consisting of task functional MRI images from a group of subjects who were administered either a control intervention or intention treatment at baseline and were followed up over subsequent visits. Our analysis revealed that while the control therapy showed long-term increases in brain activity, the intention treatment produced predominantly short-term changes, both of which were concentrated in distinct localized regions. In contrast, the voxel-wise regression failed to detect any significant neuroplasticity after multiplicity adjustments, which is biologically implausible and implies lack of power.
Submitted 18 October, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Controlling Federated Learning for Covertness
Authors:
Adit Jain,
Vikram Krishnamurthy
Abstract:
A learner aims to minimize a function $f$ by repeatedly querying a distributed oracle that provides noisy gradient evaluations. At the same time, the learner seeks to hide $\arg\min f$ from a malicious eavesdropper that observes the learner's queries. This paper considers the problem of \textit{covert} or \textit{learner-private} optimization, where the learner has to dynamically choose between learning and obfuscation by exploiting the stochasticity. The problem of controlling the stochastic gradient algorithm for covert optimization is modeled as a Markov decision process, and we show that the dynamic programming operator has a supermodular structure implying that the optimal policy has a monotone threshold structure. A computationally efficient policy gradient algorithm is proposed to search for the optimal querying policy without knowledge of the transition probabilities. As a practical application, our methods are demonstrated on a hate speech classification task in a federated setting where an eavesdropper can use the optimal weights to generate toxic content, which is more easily misclassified. Numerical results show that when the learner uses the optimal policy, an eavesdropper can only achieve a validation accuracy of $52\%$ with no information and $69\%$ when it has a public dataset with 10\% positive samples compared to $83\%$ when the learner employs a greedy policy.
Submitted 17 August, 2023;
originally announced August 2023.
-
Fréchet Statistics Based Change Point Detection in Multivariate Hawkes Process
Authors:
Rui Luo,
Vikram Krishnamurthy
Abstract:
This paper proposes a new approach for change point detection in causal networks of multivariate Hawkes processes using Fréchet statistics. Our method splits the point process into overlapping windows, estimates kernel matrices in each window, and reconstructs the signed Laplacians by treating the kernel matrices as the adjacency matrices of the causal network. We demonstrate the effectiveness of our method through experiments on both simulated and real-world cryptocurrency datasets. Our results show that our method is capable of accurately detecting and characterizing changes in the causal structure of multivariate Hawkes processes, and may have potential applications in fields such as finance and neuroscience. The proposed method is an extension of previous work on Fréchet statistics in point process settings and represents an important contribution to the field of change point detection in multivariate point processes.
Submitted 15 August, 2023; v1 submitted 13 August, 2023;
originally announced August 2023.
-
Multiple-stopping time Sequential Detection for Energy Efficient Mining in Blockchain-Enabled IoT
Authors:
Anurag Gupta,
Vikram Krishnamurthy
Abstract:
What are the optimal times for an Internet of Things (IoT) device to act as a blockchain miner? The aim is to minimize the energy consumed by low-power IoT devices that log their data into a secure (tamper-proof) distributed ledger. We formulate a multiple stopping time Bayesian sequential detection problem to address energy-efficient blockchain mining for IoT devices. The objective is to identify $L$ optimal stops for mining, thereby maximizing the probability of successfully adding a block to the blockchain; we also present a model to optimize the number of stops (mining instants). The formulation is equivalent to a multiple stopping time POMDP. Since POMDPs are in general computationally intractable to solve, we show mathematically using submodularity arguments that the optimal mining policy has a useful structure: 1) it is monotone in belief space, and 2) it exhibits a threshold structure, which divides the belief space into two connected sets. Exploiting the structural results, we formulate a computationally-efficient linear mining policy for the blockchain-enabled IoT device. We present a policy gradient technique to optimize the parameters of the linear mining policy. Finally, we use synthetic and real Bitcoin datasets to study the performance of our proposed mining policy. We demonstrate the energy efficiency achieved by the optimal linear mining policy in contrast to other heuristic strategies.
Submitted 17 August, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Finite-Sample Bounds for Adaptive Inverse Reinforcement Learning using Passive Langevin Dynamics
Authors:
Luke Snow,
Vikram Krishnamurthy
Abstract:
This paper provides a finite-sample analysis of a passive stochastic gradient Langevin dynamics algorithm (PSGLD) designed to achieve adaptive inverse reinforcement learning (IRL). By passive, we mean that the noisy gradients available to the PSGLD algorithm (inverse learning process) are evaluated at randomly chosen points by an external stochastic gradient algorithm (forward learner) that aims to optimize a cost function. The PSGLD algorithm acts as a randomized sampler to achieve adaptive IRL by reconstructing this cost function nonparametrically from the stationary measure of a Langevin diffusion. Previous work has analyzed the asymptotic performance of this passive algorithm using weak convergence techniques. This paper analyzes the non-asymptotic (finite-sample) performance using a logarithmic-Sobolev inequality and the Otto-Villani Theorem. We obtain finite-sample bounds on the 2-Wasserstein distance between the estimates generated by the PSGLD algorithm and the cost function. Apart from achieving finite-sample guarantees for adaptive IRL, this work extends a line of research in analysis of passive stochastic gradient algorithms to the finite-sample regime for Langevin dynamics.
Submitted 27 September, 2023; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Who You Play Affects How You Play: Predicting Sports Performance Using Graph Attention Networks With Temporal Convolution
Authors:
Rui Luo,
Vikram Krishnamurthy
Abstract:
This study presents a novel deep learning method, called GATv2-GCN, for predicting player performance in sports. To construct a dynamic player interaction graph, we leverage player statistics and their interactions during gameplay. We use a graph attention network to capture the attention that each player pays to every other player, allowing for more accurate modeling of the dynamic player interactions. To handle the multivariate player statistics time series, we incorporate a temporal convolution layer, which provides the model with temporal predictive power. We evaluate the performance of our model using real-world sports data, demonstrating its effectiveness in predicting player performance. Furthermore, we explore the potential use of our model in a sports betting context, providing insights into profitable strategies that leverage our predictive power. The proposed method has the potential to advance the state-of-the-art in player performance prediction and to provide valuable insights for sports analytics and betting industries.
Submitted 29 March, 2023;
originally announced March 2023.
-
Fréchet Statistics Based Change Point Detection in Dynamic Social Networks
Authors:
Rui Luo,
Vikram Krishnamurthy
Abstract:
This paper proposes a method to detect change points in dynamic social networks using Fréchet statistics. We address two main questions: (1) what metric can quantify the distances between graph Laplacians in a dynamic network and enable efficient computation, and (2) how can the Fréchet statistics be extended to detect multiple change points while maintaining the significance level of the hypothesis test? Our solution defines a metric space for graph Laplacians using the Log-Euclidean metric, enabling a closed-form formula for Fréchet mean and variance. We present a framework for change point detection using Fréchet statistics and extend it to multiple change points with binary segmentation. The proposed algorithm uses incremental computation for Fréchet mean and variance to improve efficiency and is validated on simulated and two real-world datasets, namely the UCI message dataset and the Enron email dataset.
Submitted 19 March, 2023;
originally announced March 2023.
-
Mutual Information Measure for Glass Ceiling Effect in Preferential Attachment Models
Authors:
Rui Luo,
Buddhika Nettasinghe,
Vikram Krishnamurthy
Abstract:
We propose a new way to measure inequalities such as the glass ceiling effect in attributed networks. Existing measures typically rely solely on node degree distribution or degree assortativity, but our approach goes beyond these measures by using mutual information (based on Shannon and, more generally, Rényi entropy) between the conditional probability distributions of node attributes given node degrees of adjacent nodes. We show that this mutual information measure aligns with both the analytical structural inequality model and historical publication data, making it a reliable approach to capture the complexities of attributed networks. Specifically, we demonstrate this through an analysis of citation networks. Moreover, we propose a stochastic optimization algorithm using a parameterized conditional logit model for edge addition, which outperforms a baseline uniform distribution. By recommending links at random using this algorithm, we can mitigate the glass ceiling effect, which is a crucial tool in addressing structural inequalities in networks.
Submitted 17 March, 2023;
originally announced March 2023.
-
Adaptive ECCM for Mitigating Smart Jammers
Authors:
Kunal Pattanayak,
Shashwat Jain,
Vikram Krishnamurthy,
Chris Berry
Abstract:
This paper considers adaptive radar electronic counter-counter measures (ECCM) to mitigate ECM by an adversarial jammer. Our ECCM approach models the jammer-radar interaction as a Principal Agent Problem (PAP), a popular economics framework for interaction between two entities with an information imbalance. In our setup, the radar does not know the jammer's utility. Instead, the radar learns the jammer's utility adaptively over time using inverse reinforcement learning. The radar's adaptive ECCM objective is two-fold: (1) maximize its utility by solving the PAP, and (2) estimate the jammer's utility by observing its response. Our adaptive ECCM scheme uses deep ideas from revealed preference in micro-economics and the principal agent problem in contract theory. Our numerical results show that, over time, our adaptive ECCM both identifies and mitigates the jammer's utility.
Submitted 4 December, 2022;
originally announced December 2022.
-
Identifying Coordination in a Cognitive Radar Network -- A Multi-Objective Inverse Reinforcement Learning Approach
Authors:
Luke Snow,
Vikram Krishnamurthy,
Brian M. Sadler
Abstract:
Consider a target being tracked by a cognitive radar network. If the target can intercept some radar network emissions, how can it detect coordination among the radars? By 'coordination' we mean that the radar emissions satisfy Pareto optimality with respect to multi-objective optimization over each radar's utility. This paper provides a novel multi-objective inverse reinforcement learning approach which allows for both detection of such Pareto optimal ('coordinating') behavior and subsequent reconstruction of each radar's utility function, given a finite dataset of radar network emissions. The method for accomplishing this is derived from the micro-economic setting of Revealed Preferences, and also applies to more general problems of inverse detection and learning of multi-objective optimizing systems.
Submitted 13 November, 2022;
originally announced November 2022.
-
How can a Radar Mask its Cognition?
Authors:
Kunal Pattanayak,
Vikram Krishnamurthy,
Christopher Berry
Abstract:
A cognitive radar is a constrained utility maximizer that adapts its sensing mode in response to a changing environment. If an adversary can estimate the utility function of a cognitive radar, it can determine the radar's sensing strategy and mitigate the radar performance via electronic countermeasures (ECM). This paper discusses how a cognitive radar can hide its strategy from an adversary that detects cognition. The radar does so by transmitting purposefully designed sub-optimal responses to spoof the adversary's Neyman-Pearson detector. We provide theoretical guarantees by ensuring the Type-I error probability of the adversary's detector exceeds a pre-defined level for a specified tolerance on the radar's performance loss. We illustrate our cognition masking scheme via numerical examples involving waveform adaptation and beam allocation. We show that small purposeful deviations from the optimal strategy of the radar confuse the adversary by significant amounts, thereby masking the radar's cognition. Our approach uses novel ideas from revealed preference in microeconomics and adversarial inverse reinforcement learning. Our proposed algorithms provide a principled approach for system-level electronic counter-countermeasures (ECCM) to mask the radar's cognition, i.e., hide the radar's strategy from an adversary. We also provide performance bounds for our cognition masking scheme when the adversary has misspecified measurements of the radar's response.
Submitted 20 October, 2022;
originally announced October 2022.
-
Leveraging Local Patch Differences in Multi-Object Scenes for Generative Adversarial Attacks
Authors:
Abhishek Aich,
Shasha Li,
Chengyu Song,
M. Salman Asif,
Srikanth V. Krishnamurthy,
Amit K. Roy-Chowdhury
Abstract:
State-of-the-art generative model-based attacks against image classifiers overwhelmingly focus on single-object (i.e., single dominant object) images. Different from such settings, we tackle a more practical problem of generating adversarial perturbations using multi-object (i.e., multiple dominant objects) images as they are representative of most real-world scenes. Our goal is to design an attack strategy that can learn from such natural scenes by leveraging the local patch differences that occur inherently in such images (e.g., difference between the local patch on the object 'person' and the object 'bike' in a traffic scene). Our key idea is to misclassify an adversarial multi-object image by confusing the victim classifier for each local patch in the image. Based on this, we propose a novel generative attack (called Local Patch Difference or LPD-Attack) where a novel contrastive loss function uses the aforesaid local differences in feature space of multi-object scenes to optimize the perturbation generator. Through various experiments across diverse victim convolutional neural networks, we show that our approach outperforms baseline generative attacks with highly transferable perturbations when evaluated under different white-box and black-box settings.
Submitted 3 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
GAMA: Generative Adversarial Multi-Object Scene Attacks
Authors:
Abhishek Aich,
Calvin-Khang Ta,
Akash Gupta,
Chengyu Song,
Srikanth V. Krishnamurthy,
M. Salman Asif,
Amit K. Roy-Chowdhury
Abstract:
The majority of methods for crafting adversarial attacks have focused on scenes with a single dominant object (e.g., images from ImageNet). On the other hand, natural scenes include multiple dominant objects that are semantically related. Thus, it is crucial to explore designing attack strategies that look beyond learning on single-object scenes or attacking single-object victim classifiers. Due to their inherent property of strong transferability of perturbations to unknown models, this paper presents the first approach of using generative models for adversarial attacks on multi-object scenes. In order to represent the relationships between different objects in the input scene, we leverage the open-sourced pre-trained vision-language model CLIP (Contrastive Language-Image Pre-training), with the motivation to exploit the encoded semantics in the language space along with the visual space. We call this attack approach Generative Adversarial Multi-object scene Attacks (GAMA). GAMA demonstrates the utility of the CLIP model as an attacker's tool to train formidable perturbation generators for multi-object scenes. Using the joint image-text features to train the generator, we show that GAMA can craft potent transferable perturbations in order to fool victim classifiers in various attack settings. For example, GAMA triggers ~16% more misclassification than state-of-the-art generative approaches in black-box settings where both the classifier architecture and data distribution of the attacker are different from the victim. Our code is available here: https://abhishekaich27.github.io/gama.html
Submitted 15 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
A Letter on Progress Made on Husky Carbon: A Legged-Aerial, Multi-modal Platform
Authors:
Adarsh Salagame,
Shoghair Manjikian,
Chenghao Wang,
Kaushik Venkatesh Krishnamurthy,
Shreyansh Pitroda,
Bibek Gupta,
Tobias Jacob,
Benjamin Mottis,
Eric Sihite,
Milad Ramezani,
Alireza Ramezani
Abstract:
Animals, such as birds, widely use multi-modal locomotion by combining legged and aerial mobility with dominant inertial effects. The robotic biomimicry of this multi-modal locomotion feat can yield ultra-flexible systems in terms of their ability to negotiate their task spaces. The main objective of this paper is to discuss the challenges in achieving multi-modal locomotion, and to report our progress in developing our quadrupedal robot capable of multi-modal locomotion (legged and aerial locomotion), the Husky Carbon. We report the mechanical and electrical components utilized in our robot, in addition to the simulation and experimentation done to achieve our goal in developing a versatile multi-modal robotic platform.
Submitted 25 July, 2022;
originally announced July 2022.
-
Estimating Exposure to Information on Social Networks
Authors:
Buddhika Nettasinghe,
Kowe Kadoma,
Mor Naaman,
Vikram Krishnamurthy
Abstract:
This paper considers the problem of estimating exposure to information in a social network. Given a piece of information (e.g., a URL of a news article on Facebook, a hashtag on Twitter), our aim is to find the fraction of people on the network who have been exposed to it. The exact value of exposure to a piece of information is determined by two features: the structure of the underlying social network and the set of people who shared the piece of information. Often, neither feature is publicly available (i.e., access to the two features is limited to the internal administrators of the platform), and both are difficult to estimate from data. As a solution, we propose two methods to estimate the exposure to a piece of information in an unbiased manner: a vanilla method which is based on sampling the network uniformly and a method which non-uniformly samples the network motivated by the Friendship Paradox. We provide theoretical results which characterize the conditions (in terms of properties of the network and the piece of information) under which one method outperforms the other. Further, we outline extensions of the proposed methods to dynamic information cascades (where the exposure needs to be tracked in real-time). We demonstrate the practical feasibility of the proposed methods via experiments on multiple synthetic and real-world datasets.
Submitted 13 July, 2022;
originally announced July 2022.
-
Inverse-Inverse Reinforcement Learning. How to Hide Strategy from an Adversarial Inverse Reinforcement Learner
Authors:
Kunal Pattanayak,
Vikram Krishnamurthy,
Christopher Berry
Abstract:
Inverse reinforcement learning (IRL) deals with estimating an agent's utility function from its actions. In this paper, we consider how an agent can hide its strategy and mitigate an adversarial IRL attack; we call this inverse IRL (I-IRL). How should the decision maker choose its response to ensure a poor reconstruction of its strategy by an adversary performing IRL to estimate the agent's strategy? This paper comprises four results. First, we present an adversarial IRL algorithm that estimates the agent's strategy while controlling the agent's utility function. Our second result, for I-IRL, spoofs the IRL algorithm used by the adversary. Our I-IRL results are based on revealed preference theory in micro-economics. The key idea is for the agent to deliberately choose sub-optimal responses that sufficiently mask its true strategy. Third, we give a sample complexity result for our main I-IRL result when the agent has noisy estimates of the adversary-specified utility function. Finally, we illustrate our I-IRL scheme in a radar problem where a meta-cognitive radar is trying to mitigate an adversarial target.
Submitted 22 May, 2022;
originally announced May 2022.
-
Enhancing Slot Tagging with Intent Features for Task Oriented Natural Language Understanding using BERT
Authors:
Shruthi Hariharan,
Vignesh Kumar Krishnamurthy,
Utkarsh,
Jayantha Gowda Sarapanahalli
Abstract:
Recent joint intent detection and slot tagging models have seen improved performance when compared to individual models. In many real-world datasets, the slot labels and values have a strong correlation with their intent labels. In such cases, the intent label information may act as a useful feature to the slot tagging model. In this paper, we examine the effect of leveraging intent label features through three techniques in the slot tagging task of joint intent and slot detection models. We evaluate our techniques on the benchmark spoken language datasets SNIPS and ATIS, as well as on a large private Bixby dataset, and observe improved slot-tagging performance over state-of-the-art models.
Submitted 23 May, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Meta-Cognition. An Inverse-Inverse Reinforcement Learning Approach for Cognitive Radars
Authors:
Kunal Pattanayak,
Vikram Krishnamurthy,
Christopher Berry
Abstract:
This paper considers meta-cognitive radars in an adversarial setting. A cognitive radar optimally adapts its waveform (response) in response to maneuvers (probes) of a possibly adversarial moving target. A meta-cognitive radar is aware of the adversarial nature of the target and seeks to mitigate the adversarial target. How should the meta-cognitive radar choose its responses to sufficiently confuse the adversary trying to estimate the radar's utility function? This paper abstracts the radar's meta-cognition problem in terms of the spectra (eigenvalues) of the state and observation noise covariance matrices, and embeds the algebraic Riccati equation into an economics-based utility maximization setup. This adversarial target is an inverse reinforcement learner. By observing a noisy sequence of the radar's responses (waveforms), the adversarial target uses a statistical hypothesis test to detect if the radar is a utility maximizer. In turn, the meta-cognitive radar deliberately chooses sub-optimal responses that increase the Type-I error probability of the adversary's detector. We call this counter-adversarial step taken by the meta-cognitive radar inverse-inverse reinforcement learning (I-IRL). We illustrate the meta-cognition results of this paper via simple numerical examples. Our approach for meta-cognition in this paper is based on revealed preference theory in micro-economics and inspired by results in differential privacy and adversarial obfuscation in machine learning.
Submitted 3 May, 2022;
originally announced May 2022.
-
Hawkes Process Modeling of Block Arrivals in Bitcoin Blockchain
Authors:
Rui Luo,
Vikram Krishnamurthy,
Erik Blasch
Abstract:
The paper constructs a multi-variate Hawkes process model of Bitcoin block arrivals and price jumps. Hawkes processes are self-exciting point processes that can capture the self- and cross-excitation effects of block mining and Bitcoin price volatility. We use publicly available blockchain datasets to estimate the model parameters via maximum likelihood estimation. The results show that Bitcoin price volatility boosts the block mining rate and that Bitcoin investment return demonstrates mean reversion. Quantile-Quantile plots show that the proposed Hawkes process model is a better fit to the blockchain datasets than a Poisson process model.
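For readers unfamiliar with self-excitation, the following is a minimal sketch of a Hawkes conditional intensity. It assumes a univariate process with an exponential kernel, which is a common textbook choice and not necessarily the model used in the paper; the function name and the parameters `mu`, `alpha`, and `beta` are illustrative.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a univariate Hawkes process (illustrative).

    Assumed form: lambda(t) = mu + sum over past events s < t of
    alpha * exp(-beta * (t - s)). Each past event raises the intensity
    by alpha, and that boost decays at rate beta.
    """
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)
    return mu + excitation

# With no past events the intensity equals the baseline rate mu,
# which is the constant rate of a homogeneous Poisson process.
baseline = hawkes_intensity(1.0, [], mu=0.5, alpha=0.8, beta=1.0)

# One event at time 0 raises the intensity at time 1 above the baseline.
excited = hawkes_intensity(1.0, [0.0], mu=0.5, alpha=0.8, beta=1.0)
```

Setting `alpha` to zero recovers a constant-rate Poisson process, the kind of baseline the abstract compares against.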
Submitted 30 March, 2022;
originally announced March 2022.
-
Zero-Query Transfer Attacks on Context-Aware Object Detectors
Authors:
Zikui Cai,
Shantanu Rane,
Alejandro E. Brito,
Chengyu Song,
Srikanth V. Krishnamurthy,
Amit K. Roy-Chowdhury,
M. Salman Asif
Abstract:
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results. A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check, wherein, if the detected objects are not consistent with an appropriately defined context, then an attack is suspected. Stronger attacks are needed to fool such context-aware detectors. We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check of black-box object detectors operating on complex, natural scenes. Unlike many black-box attacks that perform repeated attempts and open themselves to detection, we assume a "zero-query" setting, where the attacker has no knowledge of the classification decisions of the victim system. First, we derive multiple attack plans that assign incorrect labels to victim objects in a context-consistent manner. Then we design and use a novel data structure that we call the perturbation success probability matrix, which enables us to filter the attack plans and choose the one most likely to succeed. This final attack plan is implemented using a perturbation-bounded adversarial attack algorithm. We compare our zero-query attack against a few-query scheme that repeatedly checks if the victim system is fooled. We also compare against state-of-the-art context-agnostic attacks. Against a context-aware defense, the fooling rate of our zero-query approach is significantly higher than context-agnostic approaches and higher than that achievable with up to three rounds of the few-query scheme.
Submitted 29 March, 2022;
originally announced March 2022.
-
Controlling Transaction Rate in Tangle Ledger: A Principal Agent Problem Approach
Authors:
Anurag Gupta,
Vikram Krishnamurthy
Abstract:
Tangle is a distributed ledger technology that stores data as a directed acyclic graph (DAG). Unlike blockchain, Tangle does not require dedicated miners for its operation; this makes Tangle suitable for Internet of Things (IoT) applications. Distributed ledgers have a built-in transaction rate control mechanism to prevent congestion and spamming; this is typically achieved by increasing or decreasing the proof of work (PoW) difficulty level based on the number of users. Unfortunately, this simplistic mechanism gives an unfair advantage to users with high computing power. This paper proposes a principal-agent problem (PAP) framework from microeconomics to control the transaction rate in Tangle. With users as agents and the transaction rate controller as the principal, we design a truth-telling mechanism to assign PoW difficulty levels to agents as a function of their computing power. The solution of the PAP is achieved by compensating a higher PoW difficulty level with a larger weight/reputation for the transaction. The mechanism has two benefits: (1) the security of Tangle is increased as agents are incentivized to perform difficult PoW, and (2) the rate of new transactions is moderated in Tangle. The solution of the PAP is obtained by solving a mixed-integer optimization problem. We show that the optimal solution of the PAP increases with the computing power of agents. The structural results reduce the search space of the mixed-integer program and enable efficient computation of the optimal mechanism. Finally, via numerical examples, we illustrate the transaction rate control mechanism and study its impact on the dynamics of Tangle.
Submitted 18 April, 2023; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Mitigating Misinformation Spread on Blockchain Enabled Social Media Networks
Authors:
Rui Luo,
Vikram Krishnamurthy,
Erik Blasch
Abstract:
The paper develops a blockchain protocol for a social media network (BE-SMN) to mitigate the spread of misinformation. BE-SMN is derived based on the information transmission-time distribution by modeling the misinformation transmission as double-spend attacks on blockchain. The misinformation distribution is then incorporated into the SIR (Susceptible, Infectious, or Recovered) model, which substitutes the single rate parameter in the traditional SIR model. Then, on a multi-community network, we study the propagation of misinformation numerically and show that the proposed blockchain enabled social media network outperforms the baseline network in flattening the curve of the infected population.
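As background for the "traditional SIR model" with a single rate parameter mentioned above, here is a minimal sketch of that baseline using forward-Euler steps on population fractions. It does not implement the paper's blockchain-derived transmission-time distribution; the function names and the values of `beta`, `gamma`, and `dt` are illustrative assumptions.

```python
def sir_step(s, i, r, beta, gamma, dt):
    """One forward-Euler step of the classic SIR model on population fractions.

    beta is the single transmission rate and gamma the recovery rate
    (both are illustrative parameters, not values from the paper).
    """
    new_infections = beta * s * i * dt
    new_recoveries = gamma * i * dt
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries

def simulate_sir(s0, i0, r0, beta, gamma, dt, steps):
    """Return the list of (s, i, r) states, including the initial state."""
    trajectory = [(s0, i0, r0)]
    for _ in range(steps):
        s, i, r = trajectory[-1]
        trajectory.append(sir_step(s, i, r, beta, gamma, dt))
    return trajectory

# A lower transmission rate flattens the infected curve (smaller peak).
fast = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, dt=0.1, steps=2000)
slow = simulate_sir(0.99, 0.01, 0.0, beta=0.2, gamma=0.1, dt=0.1, steps=2000)
peak_fast = max(i for _, i, _ in fast)
peak_slow = max(i for _, i, _ in slow)
```

Comparing `peak_fast` and `peak_slow` is the kind of "flattening the curve" comparison the abstract reports, here between two rate settings of the plain model.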
Submitted 1 May, 2023; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Context-Aware Transfer Attacks for Object Detection
Authors:
Zikui Cai,
Xinxin Xie,
Shasha Li,
Mingjun Yin,
Chengyu Song,
Srikanth V. Krishnamurthy,
Amit K. Roy-Chowdhury,
M. Salman Asif
Abstract:
Blackbox transfer attacks for image classifiers have been extensively studied in recent years. In contrast, little progress has been made on transfer attacks for object detectors. Object detectors take a holistic view of the image and the detection of one object (or lack thereof) often depends on other objects in the scene. This makes such detectors inherently context-aware and adversarial attacks in this space are more challenging than those targeting image classifiers. In this paper, we present a new approach to generate context-aware attacks for object detectors. We show that by using co-occurrence of objects and their relative locations and sizes as context information, we can successfully generate targeted mis-categorization attacks that achieve higher transfer success rates on blackbox object detectors than the state-of-the-art. We test our approach on a variety of object detectors with images from PASCAL VOC and MS COCO datasets and demonstrate up to 20 percentage points improvement in performance compared to the other state-of-the-art methods.
Submitted 6 December, 2021;
originally announced December 2021.
-
ADC: Adversarial attacks against object Detection that evade Context consistency checks
Authors:
Mingjun Yin,
Shasha Li,
Chengyu Song,
M. Salman Asif,
Amit K. Roy-Chowdhury,
Srikanth V. Krishnamurthy
Abstract:
Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, which are slightly perturbed input images that lead DNNs to make wrong predictions. To protect from such examples, various defense strategies have been proposed. A very recent defense strategy for detecting adversarial examples, which has been shown to be robust to current attacks, is to check for intrinsic context consistencies in the input data, where context refers to various relationships (e.g., object-to-object co-occurrence relationships) in images. In this paper, we show that even context consistency checks can be brittle to properly crafted adversarial examples; to the best of our knowledge, we are the first to do so. Specifically, we propose an adaptive framework to generate examples that subvert such defenses, namely, Adversarial attacks against object Detection that evade Context consistency checks (ADC). In ADC, we formulate a joint optimization problem which has two attack goals, viz., (i) fooling the object detector and (ii) evading the context consistency check system, at the same time. Experiments on both PASCAL VOC and MS COCO datasets show that examples generated with ADC fool the object detector with a success rate of over 85% in most cases, and at the same time evade the recently proposed context consistency checks, with a bypassing rate of over 80% in most cases. Our results suggest that robustly modeling context and checking its consistency is still an open problem.
Submitted 23 October, 2021;
originally announced October 2021.
-
Adversarial Attacks on Black Box Video Classifiers: Leveraging the Power of Geometric Transformations
Authors:
Shasha Li,
Abhishek Aich,
Shitong Zhu,
M. Salman Asif,
Chengyu Song,
Amit K. Roy-Chowdhury,
Srikanth V. Krishnamurthy
Abstract:
Compared to image classification models, black-box adversarial attacks against video classification models have been largely understudied. This may be because, with video, the temporal dimension poses significant additional challenges in gradient estimation. Query-efficient black-box attacks rely on effectively estimated gradients towards maximizing the probability of misclassifying the target video. In this work, we demonstrate that such effective gradients can be searched for by parameterizing the temporal structure of the search space with geometric transformations. Specifically, we design a novel iterative algorithm, Geometric TRAnsformed Perturbations (GEO-TRAP), for attacking video classification models. GEO-TRAP employs standard geometric transformation operations to reduce the search space for effective gradients into searching for a small group of parameters that define these operations. This group of parameters describes the geometric progression of gradients, resulting in a reduced and structured search space. Our algorithm inherently leads to successful perturbations with surprisingly few queries. For example, adversarial examples generated from GEO-TRAP have better attack success rates with ~73.55% fewer queries compared to the state-of-the-art method for video adversarial attacks on the widely used Jester dataset. Overall, our algorithm exposes vulnerabilities of diverse video classification models and achieves new state-of-the-art results under black-box settings on two large datasets. Code is available here: https://github.com/sli057/Geo-TRAP
Submitted 26 October, 2021; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Anomalous Edge Detection in Edge Exchangeable Social Network Models
Authors:
Rui Luo,
Buddhika Nettasinghe,
Vikram Krishnamurthy
Abstract:
This paper studies detecting anomalous edges in directed graphs that model social networks. We exploit edge exchangeability as a criterion for distinguishing anomalous edges from normal edges. Then we present an anomaly detector based on conformal prediction theory; this detector has a guaranteed upper bound on the false positive rate. In numerical experiments, we show that the proposed algorithm achieves superior performance to baseline methods.
Submitted 21 August, 2023; v1 submitted 26 September, 2021;
originally announced September 2021.
-
Controlling Segregation in Social Network Dynamics as an Edge Formation Game
Authors:
Rui Luo,
Buddhika Nettasinghe,
Vikram Krishnamurthy
Abstract:
This paper studies controlling segregation in social networks via exogenous incentives. We construct an edge formation game on a directed graph. A user (node) chooses the probability with which it forms an inter- or intra- community edge based on a utility function that reflects the tradeoff between homophily (preference to connect with individuals that belong to the same group) and the preference to obtain an exogenous incentive. Decisions made by the users to connect with each other determine the evolution of the social network. We explore an algorithmic recommendation mechanism where the exogenous incentive in the utility function is based on weak ties which incentivizes users to connect across communities and mitigates the segregation. This setting leads to a submodular game with a unique Nash equilibrium. In numerical simulations, we explore how the proposed model can be useful in controlling segregation and echo chambers in social networks under various settings.
Submitted 28 August, 2021;
originally announced August 2021.
-
Exploiting Multi-Object Relationships for Detecting Adversarial Attacks in Complex Scenes
Authors:
Mingjun Yin,
Shasha Li,
Zikui Cai,
Chengyu Song,
M. Salman Asif,
Amit K. Roy-Chowdhury,
Srikanth V. Krishnamurthy
Abstract:
Vision systems that deploy Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples. Recent research has shown that checking the intrinsic consistencies in the input data is a promising way to detect adversarial attacks (e.g., by checking the object co-occurrence relationships in complex scenes). However, existing approaches are tied to specific models and do not offer generalizability. Motivated by the observation that language descriptions of natural scene images have already captured the object co-occurrence relationships that can be learned by a language model, we develop a novel approach to perform context consistency checks using such language models. The distinguishing aspect of our approach is that it is independent of the deployed object detector and yet offers very high accuracy in terms of detecting adversarial examples in practical scenes with multiple objects.
Submitted 18 August, 2021;
originally announced August 2021.
-
Construction of Planar and Symmetric Truss Structures with Interlocking Edge Elements
Authors:
Anantha Natarajan,
Jiaqi Cui,
Ergun Akleman,
Vinayak Krishnamurthy
Abstract:
In this paper, we present an algorithmic approach to design and construct planar truss structures based on symmetric lattices using modular elements. The method of assembly is similar to Leonardo grids as they both rely on the property of interlocking. In theory, our modular elements can be assembled by the same type of binary operations. Our modular elements embody the principle of geometric interlocking, a principle recently introduced in literature that allows for pieces of an assembly to be interlocked in a way that they can neither be assembled nor disassembled unless the pieces are subjected to deformation or breakage. We demonstrate that breaking the pieces can indeed facilitate the effective assembly of these pieces through the use of a simple key-in-hole concept. As a result, these modular elements can be assembled together to form an interlocking structure, in which the locking pieces apply the force necessary to hold the entire assembly together.
Submitted 18 June, 2021;
originally announced June 2021.
-
Reference-based Weak Supervision for Answer Sentence Selection using Web Data
Authors:
Vivek Krishnamurthy,
Thuy Vu,
Alessandro Moschitti
Abstract:
Answer sentence selection (AS2) modeling requires annotated data, i.e., hand-labeled question-answer pairs. We present a strategy to collect weakly supervised answers for a question based on its reference to improve AS2 modeling. Specifically, we introduce Reference-based Weak Supervision (RWS), a fully automatic large-scale data pipeline that harvests high-quality weakly-supervised answers from abundant Web data requiring only a question-reference pair as input. We study the efficacy and robustness of RWS in the setting of TANDA, a recent state-of-the-art fine-tuning approach specialized for AS2. Our experiments indicate that the produced data consistently bolsters TANDA. We achieve the state of the art in terms of P@1, 90.1%, and MAP, 92.9%, on WikiQA.
Submitted 18 April, 2021;
originally announced April 2021.
-
Distributed learning in congested environments with partial information
Authors:
Tomer Boyarski,
Amir Leshem,
Vikram Krishnamurthy
Abstract:
How can non-communicating agents learn to share congested resources efficiently? This is a challenging task when the agents can access the same resource simultaneously (in contrast to multi-agent multi-armed bandit problems) and the resource valuations differ among agents. We present a fully distributed algorithm for learning to share in congested environments and prove that the agents' regret with respect to the optimal allocation is poly-logarithmic in the time horizon. Performance in the non-asymptotic regime is illustrated in numerical simulations. The distributed algorithm has applications in cloud computing and spectrum sharing.
Keywords: Distributed learning, congestion games, poly-logarithmic regret.
Submitted 11 May, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
-
A Directed, Bi-Populated Preferential Attachment Model with Applications to Analyzing the Glass Ceiling Effect
Authors:
Buddhika Nettasinghe,
Nazanin Alipourfard,
Vikram Krishnamurthy,
Kristina Lerman
Abstract:
Preferential attachment, homophily, and their consequences, such as the glass ceiling effect, have been well-studied in the context of undirected networks. However, the lack of an intuitive, theoretically tractable model of a directed, bi-populated (i.e., containing two groups) network with variable levels of preferential attachment, homophily and growth dynamics (e.g., the rate at which new nodes join, whether the new nodes mostly follow existing nodes or the existing nodes follow them, etc.) has largely prevented such consequences from being explored in the context of directed networks, where they more naturally occur due to the asymmetry of links. To this end, we present a rigorous theoretical analysis of the Directed Mixed Preferential Attachment model and use it to analyze the glass ceiling effect in directed networks. More specifically, we derive the closed-form expressions for the power-law exponents of the in- and out-degree distributions of each group (minority and majority) and compare them with each other to obtain insights. In particular, our results yield answers to questions such as: when does the minority group have a heavier out-degree (or in-degree) distribution compared to the majority group? What effect does frequent addition of edges between existing nodes have on the in- and out-degree distributions of the majority and minority groups? Such insights shed light on the interplay between the structure (i.e., the in- and out-degree distributions of the two groups) and dynamics (characterized collectively by the homophily, preferential attachment, group sizes and growth dynamics) of various real-world networks. Finally, we utilize the obtained analytical results to characterize the conditions under which the glass ceiling effect emerges in a directed network. Our analytical results are supported by detailed numerical results.
Submitted 22 March, 2021;
originally announced March 2021.
-
Emergence of Structural Inequalities in Scientific Citation Networks
Authors:
Buddhika Nettasinghe,
Nazanin Alipourfard,
Vikram Krishnamurthy,
Kristina Lerman
Abstract:
Structural inequalities persist in society, conferring systematic advantages to some people at the expense of others, for example, by giving them substantially more influence and opportunities. Using bibliometric data about authors of scientific publications, we identify two types of structural inequalities in scientific citations. First, female authors, who represent a minority of researchers, receive less recognition for their work (through citations) relative to male authors; second, authors affiliated with top-ranked institutions, who are also a minority, receive substantially more recognition compared to other authors. We present a model for the growth of directed citation networks and show that citation disparities arise from individual preferences to cite authors from the same group (homophily), highly cited or active authors (preferential attachment), as well as the size of the group and how frequently new authors join. We analyze the model and show that its predictions align well with real-world observations. Our theoretical and empirical analysis also suggests potential strategies to mitigate structural inequalities in science. In particular, we find that merely increasing the minority group size does little to narrow the disparities. Instead, reducing the homophily of each group, frequently adding new authors to a research field while providing them an accessible platform among existing, established authors, together with balanced group sizes can have the largest impact on reducing inequality. Our work highlights additional complexities of mitigating structural disparities stemming from asymmetric relations (e.g., directed citations) compared to symmetric relations (e.g., collaborations).
Submitted 1 May, 2021; v1 submitted 19 March, 2021;
originally announced March 2021.
-
Rationally Inattentive Utility Maximization for Interpretable Deep Image Classification
Authors:
Kunal Pattanayak,
Vikram Krishnamurthy
Abstract:
Are deep convolutional neural networks (CNNs) for image classification explainable by utility maximization with information acquisition costs? We demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive utility maximizers, a generative model used extensively in economics for human decision making. Our claim is backed by extensive experiments on 200 deep CNNs from 5 popular architectures. The parameters of our interpretable model are computed efficiently via convex feasibility algorithms. As an application, we show that our economics-based interpretable model can predict the classification performance of deep CNNs trained with arbitrary parameters with accuracy exceeding 94%. This eliminates the need to re-train the deep CNNs for image classification. The theoretical foundation of our approach lies in Bayesian revealed preference studied in micro-economics. All our results are on GitHub and completely reproducible.
Submitted 30 July, 2021; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Echo Chambers and Segregation in Social Networks: Markov Bridge Models and Estimation
Authors:
Rui Luo,
Buddhika Nettasinghe,
Vikram Krishnamurthy
Abstract:
This paper deals with the modeling and estimation of the sociological phenomena called echo chambers and segregation in social networks. Specifically, we present a novel community-based graph model that represents the emergence of segregated echo chambers as a Markov bridge process. A Markov bridge is a one-dimensional Markov random field that facilitates modeling the formation and disassociation of communities at deterministic times, which is important in social networks with known timed events. We justify the proposed model with six real-world examples and examine its performance on a recent Twitter dataset. We provide a model parameter estimation algorithm based on maximum likelihood and a Bayesian filtering algorithm for recursively estimating the level of segregation using noisy samples obtained from the network. Numerical results indicate that the proposed filtering algorithm outperforms the conventional hidden Markov modeling in terms of the mean-squared error. The proposed filtering method is useful in computational social science where data-driven estimation of the level of segregation from noisy data is required.
Submitted 25 December, 2020;
originally announced December 2020.
-
You Do (Not) Belong Here: Detecting DPI Evasion Attacks with Context Learning
Authors:
Shitong Zhu,
Shasha Li,
Zhongjie Wang,
Xun Chen,
Zhiyun Qian,
Srikanth V. Krishnamurthy,
Kevin S. Chan,
Ananthram Swami
Abstract:
As Deep Packet Inspection (DPI) middleboxes become increasingly popular, a spectrum of adversarial attacks have emerged with the goal of evading such middleboxes. Many of these attacks exploit discrepancies between the middlebox network protocol implementations, and the more rigorous/complete versions implemented at end hosts. These evasion attacks largely involve subtle manipulations of packets to cause different behaviours at DPI and end hosts, to cloak malicious network traffic that is otherwise detectable. With recent automated discovery, it has become prohibitively challenging to manually curate rules for detecting these manipulations. In this work, we propose CLAP, the first fully-automated, unsupervised ML solution to accurately detect and localize DPI evasion attacks. By learning what we call the packet context, which essentially captures inter-relationships across both (1) different packets in a connection; and (2) different header fields within each packet, from benign traffic traces only, CLAP can detect and pinpoint packets that violate the benign packet contexts (which are the ones that are specially crafted for evasion purposes). Our evaluations with 73 state-of-the-art DPI evasion attacks show that CLAP achieves an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) of 0.963, an Equal Error Rate (EER) of only 0.061 in detection, and an accuracy of 94.6% in localization. These results suggest that CLAP can be a promising tool for thwarting DPI evasion attacks.
Submitted 3 November, 2020;
originally announced November 2020.
-
Adaptive Non-reversible Stochastic Gradient Langevin Dynamics
Authors:
Vikram Krishnamurthy,
George Yin
Abstract:
It is well known that adding any skew symmetric matrix to the gradient of the Langevin dynamics algorithm results in a non-reversible diffusion with improved convergence rate. This paper presents a gradient algorithm to adaptively optimize the choice of the skew symmetric matrix. The resulting algorithm involves a non-reversible diffusion algorithm cross coupled with a stochastic gradient algorithm that adapts the skew symmetric matrix. The algorithm uses the same data as the classical Langevin algorithm. A weak convergence proof is given for the optimality of the choice of the skew symmetric matrix. The improved convergence rate of the algorithm is illustrated numerically in Bayesian learning and tracking examples.
Submitted 26 September, 2020;
originally announced September 2020.
-
A Markov Decision Process Approach to Active Meta Learning
Authors:
Bingjia Wang,
Alec Koppel,
Vikram Krishnamurthy
Abstract:
In supervised learning, we fit a single statistical model to a given data set, assuming that the data is associated with a singular task, which yields well-tuned models for specific use, but does not adapt well to new contexts. By contrast, in meta-learning, the data is associated with numerous tasks, and we seek a model that may perform well on all tasks simultaneously, in pursuit of greater generalization. One challenge in meta-learning is how to exploit relationships between tasks and classes, which is overlooked by commonly used random or cyclic passes through data. In this work, we propose actively selecting samples on which to train by discerning covariates inside and between meta-training sets. Specifically, we cast the problem of selecting a sample from a number of meta-training sets as either a multi-armed bandit or a Markov Decision Process (MDP), depending on how one encapsulates correlation across tasks. We develop scheduling schemes based on Upper Confidence Bound (UCB), Gittins Index and tabular Markov Decision Problems (MDPs) solved with linear programming, where the reward is the scaled statistical accuracy to ensure it is a time-invariant function of state and action. Across a variety of experimental contexts, we observe significant reductions in the sample complexity of the active selection scheme relative to cyclic or i.i.d. sampling, demonstrating the merit of exploiting covariates in practice.
Submitted 10 September, 2020;
originally announced September 2020.