Showing 1–50 of 64 results for author: Mahadevan, S

Searching in archive cs.
  1. arXiv:2405.07343  [pdf, other]

    eess.SY cs.LG stat.ME

    Graph neural networks for power grid operational risk assessment under evolving grid topology

    Authors: Yadong Zhang, Pranav M Karve, Sankaran Mahadevan

    Abstract: This article investigates the ability of graph neural networks (GNNs) to identify risky conditions in a power grid over the subsequent few hours, without explicit, high-resolution information regarding future generator on/off status (grid topology) or power dispatch decisions. The GNNs are trained using supervised learning, to predict the power grid's aggregated bus-level (either zonal or system-l…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: Manuscript submitted to Applied Energy

  2. arXiv:2405.01540  [pdf, other]

    cs.AI cs.LG

    Universal Imitation Games

    Authors: Sridhar Mahadevan

    Abstract: Alan Turing proposed in 1950 a framework called an imitation game to decide if a machine could think. Using mathematics developed largely after Turing -- category theory -- we analyze a broader class of universal imitation games (UIGs), which includes static, dynamic, and evolutionary games. In static games, the participants are in a steady state. In dynamic UIGs, "learner" participants are trying…

    Submitted 1 February, 2024; originally announced May 2024.

    Comments: 98 pages. arXiv admin note: substantial text overlap with arXiv:2402.18732

  3. arXiv:2402.18732  [pdf, other]

    cs.AI cs.LG

    GAIA: Categorical Foundations of Generative AI

    Authors: Sridhar Mahadevan

    Abstract: In this paper, we propose GAIA, a generative AI architecture based on category theory. GAIA is based on a hierarchical model where modules are organized as a simplicial complex. Each simplicial complex updates its internal parameters based on information it receives from its superior simplices and in turn relays updates to its subordinate sub-simplices. Parameter updates are formulated in terms o…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 65 pages. arXiv admin note: text overlap with arXiv:2212.08981

  4. arXiv:2402.05917  [pdf, other]

    cs.CV

    Point-VOS: Pointing Up Video Object Segmentation

    Authors: Idil Esen Zulfikar, Sabarinath Mahadevan, Paul Voigtlaender, Bastian Leibe

    Abstract: Current state-of-the-art Video Object Segmentation (VOS) methods rely on dense per-object mask annotations both during training and testing. This requires time-consuming and costly video annotation mechanisms. We propose a novel Point-VOS task with a spatio-temporally sparse point-wise annotation scheme that substantially reduces the annotation effort. We apply our annotation scheme to two large-s…

    Submitted 10 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR2024!

  5. arXiv:2311.12309  [pdf, other]

    cs.LG eess.SY

    Power grid operational risk assessment using graph neural network surrogates

    Authors: Yadong Zhang, Pranav M Karve, Sankaran Mahadevan

    Abstract: We investigate the utility of graph neural networks (GNNs) as proxies of power grid operational decision-making algorithms (optimal power flow (OPF) and security-constrained unit commitment (SCUC)) to enable rigorous quantification of the operational risk. To conduct principled risk analysis, numerous Monte Carlo (MC) samples are drawn from the (forecasted) probability distributions of spatio-temp…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Manuscript submitted to IEEE PES GM 2024

  6. arXiv:2311.03661  [pdf, other]

    eess.SY cs.LG

    Operational risk quantification of power grids using graph neural network surrogates of the DC OPF

    Authors: Yadong Zhang, Pranav M Karve, Sankaran Mahadevan

    Abstract: A DC OPF surrogate modeling framework is developed for Monte Carlo (MC) sampling-based risk quantification in power grid operation. MC simulation necessitates solving a large number of DC OPF problems corresponding to the samples of stochastic grid variables (power demand and renewable generation), which is computationally prohibitive. Computationally inexpensive surrogates of OPF provide an attra…

    Submitted 21 April, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: Manuscript submitted to IEEE Transactions on Power Systems

  7. arXiv:2307.08352  [pdf, ps, other]

    cs.LG stat.ML

    Zero-th Order Algorithm for Softmax Attention Optimization

    Authors: Yichuan Deng, Zhihang Li, Sridhar Mahadevan, Zhao Song

    Abstract: Large language models (LLMs) have brought about significant transformations in human society. Among the crucial computations in LLMs, the softmax unit holds great importance. It helps the model generate a probability distribution on potential subsequent words or phrases, considering a series of input words. By utilizing this distribution, the model selects the most probable next word or phrase,…

    Submitted 17 July, 2023; originally announced July 2023.

  8. arXiv:2306.00977  [pdf, other]

    cs.CV cs.HC

    AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation

    Authors: Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult, Francis Engelmann, Bastian Leibe, Konrad Schindler, Theodora Kontogianni

    Abstract: During interactive segmentation, a model and a user work together to delineate objects of interest in a 3D point cloud. In an iterative process, the model assigns each data point to an object (or the background), while the user corrects errors in the resulting segmentation and feeds them back into the model. The current best practice formulates the problem as binary classification and segments obj…

    Submitted 10 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: ICLR 2024 camera-ready. Project page: https://ywyue.github.io/AGILE3D

  9. arXiv:2304.06668  [pdf, other]

    cs.CV

    DynaMITe: Dynamic Query Bootstrapping for Multi-object Interactive Segmentation Transformer

    Authors: Amit Kumar Rana, Sabarinath Mahadevan, Alexander Hermans, Bastian Leibe

    Abstract: Most state-of-the-art instance segmentation methods rely on large amounts of pixel-precise ground-truth annotations for training, which are expensive to create. Interactive segmentation networks help generate such annotations based on an image and the corresponding user interactions such as clicks. Existing methods for this task can only process a single instance at a time and each user interactio…

    Submitted 22 August, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted to ICCV 2023

  10. arXiv:2304.04397  [pdf, ps, other]

    cs.DS cs.LG

    Randomized and Deterministic Attention Sparsification Algorithms for Over-parameterized Feature Dimension

    Authors: Yichuan Deng, Sridhar Mahadevan, Zhao Song

    Abstract: Large language models (LLMs) have shown their power in different areas. Attention computation, as an important subroutine of LLMs, has also attracted interest in theory. Recently, the static computation and dynamic maintenance of the attention matrix have been studied by [Alman and Song 2023] and [Brand, Song and Zhou 2023] from both the algorithmic and the hardness perspectives. In this work, we co…

    Submitted 10 April, 2023; originally announced April 2023.

  11. arXiv:2303.16504  [pdf, ps, other]

    cs.LG stat.ML

    An Over-parameterized Exponential Regression

    Authors: Yeqi Gao, Sridhar Mahadevan, Zhao Song

    Abstract: Over the past few years, there has been a significant amount of research focused on studying the ReLU activation function, with the aim of achieving neural network convergence through over-parametrization. However, recent developments in the field of Large Language Models (LLMs) have sparked interest in the use of exponential activation functions, specifically in the attention mechanism. Mathema…

    Submitted 29 March, 2023; originally announced March 2023.

  12. arXiv:2212.08981  [pdf, other]

    cs.AI cs.LG math.CT

    A Layered Architecture for Universal Causality

    Authors: Sridhar Mahadevan

    Abstract: We propose a layered hierarchical architecture called UCLA (Universal Causality Layered Architecture), which combines multiple levels of categorical abstraction for causal inference. At the top-most level, causal interventions are modeled combinatorially using a simplicial category of ordinal numbers. At the second layer, causal models are defined by a graph-type category. The non-random "surgica…

    Submitted 17 December, 2022; originally announced December 2022.

    Comments: 33

  13. arXiv:2211.03758  [pdf, other]

    stat.ME cs.AI cs.HC

    Privacy Aware Experiments without Cookies

    Authors: Shiv Shankar, Ritwik Sinha, Saayan Mitra, Viswanathan Swaminathan, Sridhar Mahadevan, Moumita Sinha

    Abstract: Consider two brands that want to jointly test alternate web experiences for their customers with an A/B test. Such collaborative tests are today enabled using third-party cookies, where each brand has information on the identity of visitors to another website. With the imminent elimination of third-party cookies, such A/B tests will become untenable. We propose a two-stage experimental de…

    Submitted 6 February, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: Technical report supplementing paper accepted to WSDM 23

  14. arXiv:2210.06528  [pdf, other]

    math.NA cs.DC

    Parallel Domain Decomposition techniques applied to Multivariate Functional Approximation of discrete data

    Authors: Vijay S. Mahadevan, David Lenz, Iulian Grindeanu, Thomas Peterka

    Abstract: Compactly expressing large-scale datasets through Multivariate Functional Approximations (MFA) can be critically important for analysis and visualization to drive scientific discovery. Tackling such problems requires scalable data partitioning approaches to compute MFA representations in amenable wall clock times. We introduce a fully parallel scheme to reduce the total work per task in combinatio…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Submitted to SIAM Journal on Scientific Computing

    MSC Class: 65D05; 65D15; 65Y05

  15. arXiv:2209.14858  [pdf, other]

    cs.CV

    4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation

    Authors: Lars Kreuzberg, Idil Esen Zulfikar, Sabarinath Mahadevan, Francis Engelmann, Bastian Leibe

    Abstract: In this work, we present a new paradigm, called 4D-StOP, to tackle the task of 4D Panoptic LiDAR Segmentation. 4D-StOP first generates spatio-temporal proposals using voting-based center predictions, where each point in the 4D volume votes for a corresponding center. These tracklet proposals are further aggregated using learned geometric features. The tracklet aggregation method effectively genera…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted to the ECCV 2022 AVVision Workshop

    Journal ref: European Conference on Computer Vision Workshops 2022

  16. arXiv:2209.12762  [pdf, other]

    eess.SY cs.LG

    Just-In-Time Learning for Operational Risk Assessment in Power Grids

    Authors: Oliver Stover, Pranav Karve, Sankaran Mahadevan, Wenbo Chen, Haoruo Zhao, Mathieu Tanneau, Pascal Van Hentenryck

    Abstract: In a grid with a significant share of renewable generation, operators will need additional tools to evaluate the operational risk due to the increased volatility in load and generation. The computational requirements of the forward uncertainty propagation problem, which must solve numerous security-constrained economic dispatch (SCED) optimizations, are a major barrier for such real-time risk asses…

    Submitted 26 September, 2022; originally announced September 2022.

  17. arXiv:2209.06262  [pdf, other]

    cs.AI math.CT

    Unifying Causal Inference and Reinforcement Learning using Higher-Order Category Theory

    Authors: Sridhar Mahadevan

    Abstract: We present a unified formalism for structure discovery of causal models and predictive state representation (PSR) models in reinforcement learning (RL) using higher-order category theory. Specifically, we model structure discovery in both settings using simplicial objects, contravariant functors from the category of ordinal numbers into any category. Fragments of causal models that are equivalent…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 21 pages

  18. arXiv:2208.14197  [pdf, other]

    cs.CE cs.AI cs.LG

    A Comprehensive Review of Digital Twin -- Part 1: Modeling and Twinning Enabling Technologies

    Authors: Adam Thelen, Xiaoge Zhang, Olga Fink, Yan Lu, Sayan Ghosh, Byeng D. Youn, Michael D. Todd, Sankaran Mahadevan, Chao Hu, Zhen Hu

    Abstract: As an emerging technology in the era of Industry 4.0, digital twin is gaining unprecedented attention because of its promise to further optimize process design, quality control, health monitoring, decision and policy making, and more, by comprehensively modeling the physical world as a group of interconnected digital models. In a two-part series of papers, we examine the fundamental role of differ…

    Submitted 30 September, 2022; v1 submitted 26 August, 2022; originally announced August 2022.

  19. arXiv:2208.12904  [pdf, other]

    cs.LG math.OC

    A Comprehensive Review of Digital Twin -- Part 2: Roles of Uncertainty Quantification and Optimization, a Battery Digital Twin, and Perspectives

    Authors: Adam Thelen, Xiaoge Zhang, Olga Fink, Yan Lu, Sayan Ghosh, Byeng D. Youn, Michael D. Todd, Sankaran Mahadevan, Chao Hu, Zhen Hu

    Abstract: As an emerging technology in the era of Industry 4.0, digital twin is gaining unprecedented attention because of its promise to further optimize process design, quality control, health monitoring, decision and policy making, and more, by comprehensively modeling the physical world as a group of interconnected digital models. In a two-part series of papers, we examine the fundamental role of differ…

    Submitted 26 August, 2022; originally announced August 2022.

  20. arXiv:2208.11077  [pdf, other]

    cs.AI cs.LG math.CT

    Categoroids: Universal Conditional Independence

    Authors: Sridhar Mahadevan

    Abstract: Conditional independence has been widely used in AI, causal inference, machine learning, and statistics. We introduce categoroids, an algebraic structure for characterizing universal properties of conditional independence. Categoroids are defined as a hybrid of two categories: one encoding a preordered lattice structure defined by objects and arrows between them; the second dual parameterization i…

    Submitted 23 August, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: 26 pages

  21. arXiv:2207.02917  [pdf, other]

    cs.AI

    On The Universality of Diagrams for Causal Inference and The Causal Reproducing Property

    Authors: Sridhar Mahadevan

    Abstract: We propose Universal Causality, an overarching framework based on category theory that defines the universal property that underlies causal inference independent of the underlying representational formalism used. More formally, universal causal models are defined as categories consisting of objects and morphisms between them representing causal influences, as well as structures for carrying out in…

    Submitted 6 July, 2022; originally announced July 2022.

  22. arXiv:2205.06118  [pdf, other]

    cs.CL

    Findings of the Shared Task on Offensive Span Identification from Code-Mixed Tamil-English Comments

    Authors: Manikandan Ravikiran, Bharathi Raja Chakravarthi, Anand Kumar Madasamy, Sangeetha Sivanesan, Ratnavel Rajalakshmi, Sajeetha Thavareesan, Rahul Ponnusamy, Shankar Mahadevan

    Abstract: Offensive content moderation is vital in social media platforms to support healthy online discussions. However, work on code-mixed Dravidian languages is limited to classifying whole comments without identifying the parts that contribute to offensiveness. This limitation is primarily due to the lack of annotated data for offensive spans. Accordingly, in this shared task, we provide Tamil-…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: System Description of Shared Task https://competitions.codalab.org/competitions/36395

  23. arXiv:2204.10979  [pdf, other]

    cs.LG cs.AI

    Smoothed Online Combinatorial Optimization Using Imperfect Predictions

    Authors: Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan

    Abstract: Smoothed online combinatorial optimization considers a learner who repeatedly chooses a combinatorial decision to minimize an unknown changing cost function with a penalty on switching decisions in consecutive rounds. We study smoothed online combinatorial optimization problems when an imperfect predictive model is available, where the model can forecast the future cost functions with uncertainty.…

    Submitted 13 January, 2023; v1 submitted 22 April, 2022; originally announced April 2022.

  24. arXiv:2112.01847  [pdf, other]

    math.AT cs.AI

    Causal Homotopy

    Authors: Sridhar Mahadevan

    Abstract: We characterize homotopical equivalences between causal DAG models, exploiting the close connections between partially ordered set representations of DAGs (posets) and finite Alexandroff topologies. Alexandroff spaces yield a directional topological space: the topology is defined by a unique minimal basis defined by an open set for each variable x, specified as the intersection of all open sets co…

    Submitted 20 September, 2021; originally announced December 2021.

    Comments: 18 pages. arXiv admin note: text overlap with arXiv:2110.15431

  25. arXiv:2111.07774  [pdf, other]

    cs.CV

    D^2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

    Authors: Christian Schmidt, Ali Athar, Sabarinath Mahadevan, Bastian Leibe

    Abstract: Despite receiving significant attention from the research community, the task of segmenting and tracking objects in monocular videos still has much room for improvement. Existing works have simultaneously justified the efficacy of dilated and deformable convolutions for various image-level segmentation tasks. This gives reason to believe that 3D extensions of such convolutions should also yield pe…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: Accepted to WACV 2022

  26. arXiv:2110.15431  [pdf, other]

    cs.AI cs.LG

    Universal Decision Models

    Authors: Sridhar Mahadevan

    Abstract: Humans are universal decision makers: we reason causally to understand the world; we act competitively to gain advantage in commerce, games, and war; and we are able to learn to make better decisions through trial and error. In this paper, we propose Universal Decision Model (UDM), a mathematical formalism based on category theory. Decision objects in a UDM correspond to instances of decision task…

    Submitted 28 October, 2021; originally announced October 2021.

  27. arXiv:2109.11344  [pdf, other]

    math.OC cs.AI

    Causal Inference in Network Economics

    Authors: Sridhar Mahadevan

    Abstract: Network economics is the study of a rich class of equilibrium problems that occur in the real world, from traffic management to supply chains and two-sided online marketplaces. In this paper we explore causal inference in network economics, building on the mathematical framework of variational inequalities, which is a generalization of classical optimization. Our framework can be viewed as a synth…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: 12 pages

  28. arXiv:2109.09653  [pdf, other]

    cs.AI

    Asymptotic Causal Inference

    Authors: Sridhar Mahadevan

    Abstract: We investigate causal inference in the asymptotic regime as the number of variables approaches infinity using an information-theoretic framework. We define structural entropy of a causal model in terms of its description complexity measured by the logarithmic growth rate, measured in bits, of all directed acyclic graphs (DAGs), parameterized by the edge density d. Structural entropy yields non-int…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: 16 pages

  29. arXiv:2109.09222  [pdf, other]

    cs.LG

    Multiscale Manifold Warping

    Authors: Sridhar Mahadevan, Anup Rao, Georgios Theocharous, Jennifer Healey

    Abstract: Many real-world applications require aligning two temporal sequences, including bioinformatics, handwriting recognition, activity recognition, and human-robot coordination. Dynamic Time Warping (DTW) is a popular alignment method, but can fail on high-dimensional real-world data where the dimensions of aligned sequences are often unequal. In this paper, we show that exploiting the multiscale manif…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: 18 pages

  30. arXiv:2012.12351  [pdf, ps, other]

    q-bio.NC cs.LG eess.SP math.DS math.OC

    Is the brain macroscopically linear? A system identification of resting state dynamics

    Authors: Erfan Nozari, Maxwell A. Bertolero, Jennifer Stiso, Lorenzo Caciagli, Eli J. Cornblath, Xiaosong He, Arun S. Mahadevan, George J. Pappas, Dani Smith Bassett

    Abstract: A central challenge in the computational modeling of neural dynamics is the trade-off between accuracy and simplicity. At the level of individual neurons, nonlinear dynamics are both experimentally established and essential for neuronal functioning. An implicit assumption has thus formed that an accurate computational model of whole-brain dynamics must also be highly nonlinear, whereas linear mode…

    Submitted 11 August, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

  31. arXiv:2008.11516  [pdf, other]

    cs.CV

    Making a Case for 3D Convolutions for Object Segmentation in Videos

    Authors: Sabarinath Mahadevan, Ali Athar, Aljoša Ošep, Sebastian Hennen, Laura Leal-Taixé, Bastian Leibe

    Abstract: The task of object segmentation in videos is usually accomplished by processing appearance and motion information separately using standard 2D convolutional networks, followed by a learned fusion of the two sources of information. On the other hand, 3D convolutional networks have been successfully applied for video classification tasks, but have not been leveraged as effectively to problems involv…

    Submitted 1 September, 2023; v1 submitted 26 August, 2020; originally announced August 2020.

    Comments: BMVC '20

  32. arXiv:2006.14364  [pdf, other]

    cs.LG stat.ML

    Finite-Sample Analysis of Proximal Gradient TD Algorithms

    Authors: Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik

    Abstract: In this paper, we analyze the convergence rate of the gradient temporal difference learning (GTD) family of algorithms. Previous analyses of this class of algorithms use ODE techniques to prove asymptotic convergence, and to the best of our knowledge, no finite-sample analysis has been done. Moreover, there has not been much work on finite-sample analysis for convergent off-policy reinforcement le…

    Submitted 3 July, 2020; v1 submitted 6 June, 2020; originally announced June 2020.

    Comments: 31st Conference on Uncertainty in Artificial Intelligence (UAI). arXiv admin note: substantial text overlap with arXiv:2006.03976

  33. arXiv:2006.05314  [pdf, other]

    cs.LG stat.ML

    Regularized Off-Policy TD-Learning

    Authors: Bo Liu, Sridhar Mahadevan, Ji Liu

    Abstract: We present a novel $l_1$ regularized off-policy convergent TD-learning method (termed RO-TD), which is able to learn sparse representations of value functions with low computational complexity. The algorithmic framework underlying RO-TD integrates two key ideas: off-policy convergent gradient TD methods, such as TDC, and a convex-concave saddle-point formulation of non-smooth convex optimization,…

    Submitted 6 June, 2020; originally announced June 2020.

    Comments: 26th Advances in Neural Information Processing Systems (NIPS). arXiv admin note: substantial text overlap with arXiv:1405.6757

  34. arXiv:2006.03976  [pdf, other]

    cs.LG stat.ML

    Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

    Authors: Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

    Abstract: In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms. We show how gradient TD (GTD) reinforcement learning methods can be formally derived, not by starting from their original objective functions, as previously attempted, but rather from a primal-dual s…

    Submitted 6 June, 2020; originally announced June 2020.

    Comments: Journal of Artificial Intelligence Research (JAIR)

  35. arXiv:2005.08158  [pdf, other]

    cs.LG stat.ML

    Optimizing for the Future in Non-Stationary MDPs

    Authors: Yash Chandak, Georgios Theocharous, Shiv Shankar, Martha White, Sridhar Mahadevan, Philip S. Thomas

    Abstract: Most reinforcement learning methods are based upon the key assumption that the transition dynamics and reward functions are fixed, that is, the underlying Markov decision process is stationary. However, in many real-world applications, this assumption is violated, and using existing algorithms may result in a performance lag. To proactively search for a good future policy, we present a policy grad…

    Submitted 21 September, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: Thirty-seventh International Conference on Machine Learning (ICML 2020)

  36. STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos

    Authors: Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe

    Abstract: Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in individual frames, and then associate these detections over time. Hence, these methods are often non-end-to-end trainable and highly tailored to specific tasks. In…

    Submitted 1 September, 2023; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: ECCV 2020 28 pages, 6 figures

    MSC Class: 68T45; 68T10; 62H30 ACM Class: I.2.10; I.4.6; I.4.8; I.5.3

  37. arXiv:1808.01531  [pdf, other]

    cs.LG stat.ML

    Global Convergence to the Equilibrium of GANs using Variational Inequalities

    Authors: Ian Gemp, Sridhar Mahadevan

    Abstract: In optimization, the negative gradient of a function denotes the direction of steepest descent. Furthermore, traveling in any direction orthogonal to the gradient maintains the value of the function. In this work, we show that these orthogonal directions that are ignored by gradient descent can be critical in equilibrium problems. Equilibrium problems have drawn heightened attention in machine lea…

    Submitted 20 May, 2019; v1 submitted 4 August, 2018; originally announced August 2018.

  38. arXiv:1805.04398  [pdf, other]

    cs.CV

    Iteratively Trained Interactive Segmentation

    Authors: Sabarinath Mahadevan, Paul Voigtlaender, Bastian Leibe

    Abstract: Deep learning requires large amounts of training data to be effective. For the task of object segmentation, manually labeling data is very expensive, and hence interactive methods are needed. Following recent approaches, we develop an interactive object segmentation system which uses user input in the form of clicks as the input to a convolutional network. While previous methods use heuristic clic…

    Submitted 11 May, 2018; originally announced May 2018.

  39. arXiv:1804.10834  [pdf, other]

    cs.LG stat.ML

    A Unified Framework for Domain Adaptation using Metric Learning on Manifolds

    Authors: Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh

    Abstract: We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds. Our approach is based on formulating transfer from source to target as a problem of geometric mean metric learning on manifolds. Specifically, we ex…

    Submitted 28 April, 2018; originally announced April 2018.

  40. arXiv:1710.07328  [pdf, other]

    cs.GT cs.LG math.OC

    Online Monotone Games

    Authors: Ian Gemp, Sridhar Mahadevan

    Abstract: Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games. Results from AGT find applications in domains such as online bidding auctions for web advertisements and network routing protocols. Monotone games are games where agent strategies naturally converge to an equilibrium state. Pr…

    Submitted 19 October, 2017; originally announced October 2017.

  41. arXiv:1703.02992  [pdf, other]

    cs.LG

    A Manifold Approach to Learning Mutually Orthogonal Subspaces

    Authors: Stephen Giguere, Francisco Garcia, Sridhar Mahadevan

    Abstract: Although many machine learning algorithms involve learning subspaces with particular characteristics, optimizing a parameter matrix that is constrained to represent a subspace can be challenging. One solution is to use Riemannian optimization methods that enforce such constraints implicitly, leveraging the fact that the feasible parameter values form a manifold. While Riemannian methods exist for…

    Submitted 8 March, 2017; originally announced March 2017.

    Comments: 9 pages, 3 Figures

    ACM Class: G.1.6; I.2.6

  42. arXiv:1611.01673  [pdf, other]

    cs.LG cs.MA cs.NE

    Generative Multi-Adversarial Networks

    Authors: Ishan Durugkar, Ian Gemp, Sridhar Mahadevan

    Abstract: Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game. In this paper, we propose the Generative Multi-Adversarial Network (GMAN), a framework that extends GANs to multiple discriminators. In previous work, the successful training of GANs requires modifying the minimax objective to accelerate training early on. In contrast…

    Submitted 2 March, 2017; v1 submitted 5 November, 2016; originally announced November 2016.

    Comments: Accepted as a conference paper (poster) at ICLR 2017

  43. arXiv:1608.07888  [pdf, other]

    cs.LG math.OC

    Online Monotone Optimization

    Authors: Ian Gemp, Sridhar Mahadevan

    Abstract: This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems. The proposed framework generalizes the popular online convex optimization framework and extends it to its natural limit allowing it to capture a notion of regret that is intuitive for more general problems such as those encountered in game theory and variational inequali…

    Submitted 28 August, 2016; originally announced August 2016.

    Comments: 23 pages, 6 figures

  44. arXiv:1608.05983  [pdf, other]

    cs.LG stat.ML

    Inverting Variational Autoencoders for Improved Generative Accuracy

    Authors: Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan

    Abstract: Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$). In the case where the codomain has known structure, a large unfeatured dataset ($\mathbf{y}$) is potentially available. We develop a parameter-efficient, deep semi-supervised generative model for the…

    Submitted 24 August, 2017; v1 submitted 21 August, 2016; originally announced August 2016.

  45. arXiv:1606.04615  [pdf, other]

    cs.LG cs.AI cs.NE

    Deep Reinforcement Learning With Macro-Actions

    Authors: Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan

    Abstract: Deep reinforcement learning has been shown to be a powerful framework for learning policies from complex high-dimensional sensory inputs to actions in complex tasks, such as the Atari domain. In this paper, we explore output representation modeling in the form of temporal abstraction to improve convergence and reliability of deep reinforcement learning approaches. We concentrate on macro-actions,…

    Submitted 14 June, 2016; originally announced June 2016.

  46. arXiv:1507.07636  [pdf, other]

    cs.CL

    Reasoning about Linguistic Regularities in Word Embeddings using Matrix Manifolds

    Authors: Sridhar Mahadevan, Sarath Chandar

    Abstract: Recent work has explored methods for learning continuous vector space word representations reflecting the underlying semantics of words. Simple vector space arithmetic using cosine distances has been shown to capture certain types of analogies, such as reasoning about plurals from singulars, past tense from present tense, etc. In this paper, we introduce a new approach to capture analogies in cont…

    Submitted 27 July, 2015; originally announced July 2015.

  47. arXiv:1502.00780  [pdf, ps, other]

    cs.SI physics.soc-ph

    Measure the similarity of nodes in the complex networks

    Authors: Qi Zhang, Meizhu Li, Yong Deng, Sankaran Mahadevan

    Abstract: Measuring the similarity of nodes in complex networks has attracted the interest of many researchers. In this paper, a new method based on the degree centrality and the relative entropy is proposed to measure the similarity of nodes in complex networks. The results in this paper show that nodes which have a common structural property always have a high similarity to othe…

    Submitted 3 February, 2015; originally announced February 2015.

    Comments: 6 figures

  48. arXiv:1502.00111  [pdf, ps, other]

    cs.SI physics.soc-ph

    Nonextensive analysis on the local structure entropy of complex networks

    Authors: Qi Zhang, Meizhu Li, Yuxian Du, Yong Deng, Sankaran Mahadevan

    Abstract: The local structure entropy is a new method which is proposed to identify the influential nodes in the complex networks. In this paper a new form of the local structure entropy of the complex networks is proposed based on the Tsallis entropy. The value of the entropic index $q$ will influence the property of the local structure entropy. When the value of $q$ is equal to 0, the nonextensive local s…

    Submitted 31 January, 2015; originally announced February 2015.

    Comments: 5 figures

  49. arXiv:1501.06042  [pdf, ps, other]

    cs.SI physics.soc-ph

    Tsallis entropy of complex networks

    Authors: Qi Zhang, Meizhu Li, Yong Deng, Sankaran Mahadevan

    Abstract: The question of how complex a complex network is has attracted many researchers. Entropy is a useful way to describe the degree of complexity of complex networks. In this paper, a new method based on the Tsallis entropy is proposed to describe the complexity of complex networks. The results in this paper show that the complexity of complex networks is not only decided by…

    Submitted 24 January, 2015; originally announced January 2015.

    Comments: 12 pages

  50. arXiv:1411.6082  [pdf, ps, other]

    cs.SI physics.soc-ph

    A new structure entropy of complex networks based on Tsallis nonextensive statistical mechanics

    Authors: Qi Zhang, Xi Lu, Meizhu Li, Yong Deng, Sankaran Mahadevan

    Abstract: The structure entropy is one of the most important parameters to describe the structure property of the complex networks. Most of the existing structure entropies are based on the degree or the betweenness centrality. In order to describe the structure property of the complex networks more reasonably, a new structure entropy of the complex networks based on the Tsallis nonextensive statistical m…

    Submitted 21 November, 2014; originally announced November 2014.

    Comments: 21 pages, 10 figures