-
Intelligent Monitoring Framework for Cloud Services: A Data-Driven Approach
Authors:
Pooja Srinivas,
Fiza Husain,
Anjaly Parayil,
Ayush Choure,
Chetan Bansal,
Saravan Rajmohan
Abstract:
Cloud service owners need to continuously monitor their services to ensure high availability and reliability. Gaps in monitoring can lead to delays in incident detection and significant negative customer impact. The current process of monitor creation is ad hoc and reactive. Developers create monitors using their tribal knowledge and, primarily, a trial-and-error process. As a result, monitors often have incomplete coverage, which leads to production issues, or redundancy, which results in noise and wasted effort.
In this work, we address this issue by proposing an intelligent monitoring framework that recommends monitors for cloud services based on their service properties. We start by mining the attributes of 30,000+ monitors from 791 production services at Microsoft and derive a structured ontology for monitors. We focus on two crucial dimensions: what to monitor (resources) and which metrics to monitor. We conduct an extensive empirical study and derive key insights on the major classes of monitors employed by cloud services at Microsoft, their associated dimensions, and the interrelationship between service properties and this ontology. Using these insights, we propose a deep-learning-based framework that recommends monitors based on the service properties. Finally, we conduct a user study with engineers from Microsoft, which demonstrates the usefulness of the proposed framework. The proposed framework, along with the ontology-driven projections, succeeded in creating production-quality recommendations for the majority of resource classes. This was also validated by the users in the study, who rated the framework's usefulness at 4.27 out of 5.
Submitted 29 February, 2024;
originally announced March 2024.
-
Rich-Item Recommendations for Rich-Users: Exploiting Dynamic and Static Side Information
Authors:
Amar Budhiraja,
Gaurush Hiranandani,
Darshak Chhatbar,
Aditya Sinha,
Navya Yarrabelly,
Ayush Choure,
Oluwasanmi Koyejo,
Prateek Jain
Abstract:
In this paper, we study the recommendation problem in which the users and the items to be recommended are rich data structures with multiple entity types and multiple sources of side information in the form of graphs. We provide a general formulation for the problem that captures the complexities of modern real-world recommendations and generalizes many existing formulations. In our formulation, each user/document that requires a recommendation and each item or tag that is to be recommended are both modeled by a set of static entities and a dynamic component. The relationships between entities are captured by several weighted bipartite graphs. To effectively exploit these complex interactions and learn the recommendation model, we propose MEDRES, a novel deep-learning architecture based on multiple graph CNNs. MEDRES uses AL-GCN, a novel graph convolution network block that harnesses strong representative features from the underlying graphs. Moreover, in order to capture the highly heterogeneous engagement of different users with the system and the constraints on the number of items to be recommended, we propose a novel ranking metric, pAp@k, along with a method to optimize the metric directly. We demonstrate the effectiveness of our method on two benchmarks: a) citation data, b) Flickr data. In addition, we present two real-world case studies of our formulation and the MEDRES architecture. We show how our technique can be used to naturally model the message recommendation problem and the teams recommendation problem in the Microsoft Teams (MSTeams) product and demonstrate that it is 5-6 percentage points more accurate than the production-grade models.
Submitted 26 July, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Improved bounds on the sandpile diffusions on Grid graphs
Authors:
Ayush Choure,
Sundar Vishwanathan
Abstract:
The Abelian Sandpile Model is a discrete diffusion process defined on graphs (Dhar [10], Dhar et al. [11]) which serves as the standard model of self-organized criticality. The transience class of a sandpile is defined as the maximum number of particles that can be added without making the system recurrent ([3]). Using elementary combinatorial arguments and symmetry properties, Babai and Gorodezky (SODA 2007, [2]) demonstrated a bound of O(n^30) on the transience class of an n x n grid. This was later improved by Choure and Vishwanathan (SODA 2012, [7]) to O(n^7) using techniques based on harmonic functions on graphs. We improve this bound to O(n^7 log n). We also demonstrate tight bounds on certain resistance ratios over grid networks. The tools used for deriving these bounds may be of independent interest.
Submitted 13 May, 2013; v1 submitted 16 October, 2012;
originally announced October 2012.
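The toppling dynamics behind the Abelian Sandpile Model named in the abstract above can be illustrated with a minimal sketch. It assumes the common grid convention, which the abstract itself does not spell out: every cell of an n x n grid has four edges, a cell holding 4 or more grains topples by sending one grain along each edge, and grains pushed past the boundary are lost to a sink. The function name `stabilize` is hypothetical and is not taken from the paper; the sketch shows only the diffusion rule, not the transience-class bounds.

```python
from collections import deque


def stabilize(grid):
    """Topple cells until every cell holds fewer than 4 grains.

    Assumed convention (not stated in the abstract): each cell has
    degree 4, and edges leaving the grid lead to a sink that absorbs
    grains. Returns the stable grid and the total number of topplings.
    """
    n = len(grid)
    g = [row[:] for row in grid]  # work on a copy of the input
    queue = deque(
        (r, c) for r in range(n) for c in range(n) if g[r][c] >= 4
    )
    topplings = 0
    while queue:
        r, c = queue.popleft()
        if g[r][c] < 4:
            continue  # stale queue entry, cell is already stable
        # A cell with at least 4k grains can legally topple k times in a row.
        fires = g[r][c] // 4
        g[r][c] -= 4 * fires
        topplings += fires
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n and 0 <= cc < n:
                g[rr][cc] += fires
                if g[rr][cc] >= 4:
                    queue.append((rr, cc))
            # Otherwise the grains fall off the grid into the sink.
    return g, topplings


stable, count = stabilize([[0, 0, 0], [0, 4, 0], [0, 0, 0]])
print(stable)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(count)   # 1
```

Because the model is abelian, the final configuration and the toppling count do not depend on the order in which unstable cells are processed, which is why a simple queue suffices here.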
-
On graph parameters guaranteeing fast Sandpile diffusion
Authors:
Ayush Choure,
Sundar Vishwanathan
Abstract:
The Abelian Sandpile Model is a discrete diffusion process defined on graphs (Dhar \cite{DD90}, Dhar et al. \cite{DD95}) which serves as the standard model of self-organized criticality. The transience class of a sandpile is defined as the maximum number of particles that can be added without making the system recurrent (\cite{BT05}). We demonstrate a class of sandpiles which have polynomially bounded transience classes by identifying key graph properties that play a role in the rapid diffusion process. These are the volume growth parameters, boundary-regularity-type properties and non-empty-interior-type constraints. This generalizes a previous result by Babai and Gorodezky (SODA 2007, \cite{LB07}), in which they establish polynomial bounds on the $n \times n$ grid. Indeed, the properties we show are based on ideas extracted from their proof as well as the continuous analogs in complex analysis. We conclude with a discussion on the notion of degeneracy and dimensions in graphs.
Submitted 1 November, 2012; v1 submitted 2 July, 2012;
originally announced July 2012.
-
Random Walks, Electric Networks and The Transience Class problem of Sandpiles
Authors:
Ayush Choure,
Sundar Vishwanathan
Abstract:
The Abelian Sandpile Model is a discrete diffusion process defined on graphs (Dhar \cite{DD90}, Dhar et al. \cite{DD95}) which serves as the standard model of \textit{self-organized criticality}. The transience class of a sandpile is defined as the maximum number of particles that can be added without making the system recurrent (\cite{BT05}). We develop the theory of discrete diffusions in contrast to continuous harmonic functions on graphs and establish deep connections between standard results in the study of random walks on graphs and sandpiles on graphs. Using this connection and building other necessary machinery, we improve the main result of Babai and Gorodezky (SODA 2007, \cite{LB07}), the bound on the transience class of an $n \times n$ grid, from $O(n^{30})$ to $O(n^{7})$. Proving that the transience class is small validates the general notion that for most natural phenomena, the time during which the system is transient is small. In addition, we use the machinery developed to prove a number of auxiliary results. We exhibit an equivalence between two other tessellations of the plane, the honeycomb and triangular lattices. We give general upper bounds on the transience class as a function of the number of edges to the sink.
Further, for planar sandpiles we derive an explicit algebraic expression which provably approximates the transience class of $G$ to within $O(|E(G)|)$. This expression is based on the spectrum of the Laplacian of the dual of the graph $G$. We also show a lower bound of $\Omega(n^{3})$ on the transience class on the grid, improving the obvious bound of $\Omega(n^{2})$.
Submitted 16 October, 2012; v1 submitted 17 May, 2011;
originally announced May 2011.