Showing 1–50 of 52 results for author: Dhulipala, L

  1. arXiv:2408.13362

    cs.DS cs.DC

    Parallel Set Cover and Hypergraph Matching via Uniform Random Sampling

    Authors: Laxman Dhulipala, Michael Dinitz, Jakub Łącki, Slobodan Mitrović

    Abstract: The SetCover problem has been extensively studied in many different models of computation, including parallel and distributed settings. From an approximation point of view, there are two standard guarantees: an $O(\log Δ)$-approximation (where $Δ$ is the maximum set size) and an $O(f)$-approximation (where $f$ is the maximum number of sets containing any given element). In this paper, we introdu…

    Submitted 23 August, 2024; originally announced August 2024.

  2. arXiv:2406.05066

    cs.DS

    Efficient Centroid-Linkage Clustering

    Authors: MohammadHossein Bateni, Laxman Dhulipala, Willem Fletcher, Kishen N Gowda, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki

    Abstract: We give an efficient algorithm for Centroid-Linkage Hierarchical Agglomerative Clustering (HAC), which computes a $c$-approximate clustering in roughly $n^{1+O(1/c^2)}$ time. We obtain our result by combining a new Centroid-Linkage HAC algorithm with a novel fully dynamic data structure for nearest neighbor search which works under adaptive updates. We also evaluate our algorithm empirically. By…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.19504

    cs.DS cs.DB cs.IR

    MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings

    Authors: Laxman Dhulipala, Majid Hadian, Rajesh Jayaram, Jason Lee, Vahab Mirrokni

    Abstract: Neural embedding models have become a fundamental component of modern information retrieval (IR) pipelines. These models produce a single embedding $x \in \mathbb{R}^d$ per data-point, allowing for fast retrieval via highly optimized maximum inner product search (MIPS) algorithms. Recently, beginning with the landmark ColBERT paper, multi-vector models, which produce a set of embeddings per data po…

    Submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2405.11671

    cs.DS

    BYO: A Unified Framework for Benchmarking Large-Scale Graph Containers

    Authors: Brian Wheatman, Xiaojun Dong, Zheqi Shen, Laxman Dhulipala, Jakub Łącki, Prashant Pandey, Helen Xu

    Abstract: A fundamental building block in any graph algorithm is a graph container - a data structure used to represent the graph. Ideally, a graph container enables efficient access to the underlying graph, has low space usage, and supports updating the graph efficiently. In this paper, we conduct an extensive empirical evaluation of graph containers designed to support running algorithms on large graphs.…

    Submitted 19 May, 2024; originally announced May 2024.

  5. arXiv:2404.19019

    cs.DS cs.DC

    Optimal Parallel Algorithms for Dendrogram Computation and Single-Linkage Clustering

    Authors: Laxman Dhulipala, Xiaojun Dong, Kishen N Gowda, Yan Gu

    Abstract: Computing a Single-Linkage Dendrogram (SLD) is a key step in the classic single-linkage hierarchical clustering algorithm. Given an input edge-weighted tree $T$, the SLD of $T$ is a binary dendrogram that summarizes the $n-1$ clusterings obtained by contracting the edges of $T$ in order of weight. Existing algorithms for computing the SLD all require $Ω(n\log n)$ work where $n = |T|$. Furthermore,…

    Submitted 12 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: To appear at SPAA 2024

  6. arXiv:2404.14730

    cs.DS cs.CC cs.DC

    It's Hard to HAC with Average Linkage!

    Authors: MohammadHossein Bateni, Laxman Dhulipala, Kishen N Gowda, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki

    Abstract: Average linkage Hierarchical Agglomerative Clustering (HAC) is an extensively studied and applied method for hierarchical clustering. Recent applications to massive datasets have driven significant interest in near-linear-time and efficient parallel algorithms for average linkage HAC. We provide hardness results that rule out such algorithms. On the sequential side, we establish a runtime lower…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: To appear at ICALP 2024

  7. arXiv:2403.03337

    cs.DS cs.CR

    Fine-Grained Privacy Guarantees for Coverage Problems

    Authors: Laxman Dhulipala, George Z. Li

    Abstract: We introduce a new notion of neighboring databases for coverage problems such as Max Cover and Set Cover under differential privacy. In contrast to the standard privacy notion for these problems, which is analogous to node-privacy in graphs, our new definition gives a more fine-grained privacy guarantee, which is analogous to edge-privacy. We illustrate several scenarios of Set Cover and Max Cover…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 14 pages; abstract shortened to fit requirements

  8. arXiv:2403.01797

    cs.DS cs.IR

    Unleashing Graph Partitioning for Large-Scale Nearest Neighbor Search

    Authors: Lars Gottesbüren, Laxman Dhulipala, Rajesh Jayaram, Jakub Łącki

    Abstract: We consider the fundamental problem of decomposing a large-scale approximate nearest neighbor search (ANNS) problem into smaller sub-problems. The goal is to partition the input points into neighborhood-preserving shards, so that the nearest neighbors of any point are contained in only a few shards. When a query arrives, a routing algorithm is used to identify the shards which should be searched f…

    Submitted 4 March, 2024; originally announced March 2024.

  9. arXiv:2402.00943

    cs.DS cs.IR cs.LG

    Approximate Nearest Neighbor Search with Window Filters

    Authors: Joshua Engels, Benjamin Landrum, Shangdi Yu, Laxman Dhulipala, Julian Shun

    Abstract: We define and investigate the problem of $c$-approximate window search: approximate nearest neighbor search where each point in the dataset has a numeric label, and the goal is to find nearest neighbors to queries within arbitrary label ranges. Many semantic search problems, such as image and document search with timestamp filters, or product search with cost filters, are natural examples…

    Submitted 4 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Code available: https://github.com/JoshEngels/RangeFilteredANN

  10. arXiv:2401.05244

    stat.ML cs.LG stat.AP stat.CO

    Reliability Analysis of Complex Systems using Subset Simulations with Hamiltonian Neural Networks

    Authors: Denny Thaler, Somayajulu L. N. Dhulipala, Franz Bamer, Bernd Markert, Michael D. Shields

    Abstract: We present a new Subset Simulation approach using Hamiltonian neural network-based Monte Carlo sampling for reliability analysis. The proposed strategy combines the superior sampling of the Hamiltonian Monte Carlo method with computationally efficient gradient evaluations using Hamiltonian neural networks. This combination is especially advantageous because the neural network architecture conserve…

    Submitted 10 January, 2024; originally announced January 2024.

  11. arXiv:2401.00710

    cs.DS cs.DC

    Parallel Integer Sort: Theory and Practice

    Authors: Xiaojun Dong, Laxman Dhulipala, Yan Gu, Yihan Sun

    Abstract: Integer sorting is a fundamental problem in computer science. This paper studies parallel integer sort both in theory and in practice. In theory, we show tighter bounds for a class of existing practical integer sort algorithms, which provides a solid theoretical foundation for their widespread usage in practice and strong performance. In practice, we design a new integer sorting algorithm, \textsf…

    Submitted 1 January, 2024; originally announced January 2024.

  12. arXiv:2312.07706

    cs.DS cs.CR cs.SI

    Near-Optimal Differentially Private k-Core Decomposition

    Authors: Laxman Dhulipala, George Z. Li, Quanquan C. Liu

    Abstract: Recent work by Dhulipala et al. [DLRSSY22] initiated the study of the $k$-core decomposition problem under differential privacy via a connection between low round/depth distributed/parallel graph algorithms and private algorithms with small error bounds. They showed that one can output differentially private approximate $k$-core numbers, while only incurring a multiplicative error of…

    Submitted 28 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 20 pages. Abstract shortened to fit requirements. In the new version, we show that our techniques can also help give better analysis of the algorithms in [DLRSSY22]

  13. arXiv:2311.04333

    cs.DS cs.DC

    Practical Parallel Algorithms for Near-Optimal Densest Subgraphs on Massive Graphs

    Authors: Pattara Sukprasert, Quanquan C. Liu, Laxman Dhulipala, Julian Shun

    Abstract: The densest subgraph problem has received significant attention, both in theory and in practice, due to its applications in problems such as community detection, social network analysis, and spam detection. Due to the high cost of obtaining exact solutions, much attention has focused on designing approximate densest subgraph algorithms. However, existing approaches are not able to scale to massive…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: To appear in ALENEX 2024

  14. arXiv:2308.03578

    cs.DS cs.DB cs.DC cs.IR

    TeraHAC: Hierarchical Agglomerative Clustering of Trillion-Edge Graphs

    Authors: Laxman Dhulipala, Jason Lee, Jakub Łącki, Vahab Mirrokni

    Abstract: We introduce TeraHAC, a $(1+ε)$-approximate hierarchical agglomerative clustering (HAC) algorithm which scales to trillion-edge graphs. Our algorithm is based on a new approach to computing $(1+ε)$-approximate HAC, which is a novel combination of the nearest-neighbor chain algorithm and the notion of $(1+ε)$-approximate HAC. Our approach allows us to partition the graph among multiple machines and…

    Submitted 11 June, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: SIGMOD 2024

  15. arXiv:2306.08623

    cs.DC cs.DS

    Parallel Algorithms for Hierarchical Nucleus Decomposition

    Authors: Jessica Shi, Laxman Dhulipala, Julian Shun

    Abstract: Nucleus decompositions have been shown to be a useful tool for finding dense subgraphs. The coreness value of a clique represents its density based on the number of other cliques it is adjacent to. One useful output of nucleus decomposition is to generate a hierarchy among dense subgraphs at different resolutions. However, existing parallel algorithms for nucleus decomposition do not generate this…

    Submitted 19 January, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

  16. arXiv:2305.04359

    cs.IR cs.LG

    ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms

    Authors: Magdalen Dobson Manohar, Zheqi Shen, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, Yihan Sun

    Abstract: Approximate nearest-neighbor search (ANNS) algorithms are a key part of the modern deep learning stack due to enabling efficient similarity search over high-dimensional vector space representations (i.e., embeddings) of data. Among various ANNS algorithms, graph-based algorithms are known to achieve the best throughput-recall tradeoffs. Despite the large scale of modern ANNS datasets, existing par…

    Submitted 8 February, 2024; v1 submitted 7 May, 2023; originally announced May 2023.

  17. High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems

    Authors: Xiaojun Dong, Yunshu Wu, Zhongqi Wang, Laxman Dhulipala, Yan Gu, Yihan Sun

    Abstract: Semisort is a fundamental algorithmic primitive widely used in the design and analysis of efficient parallel algorithms. It takes input as an array of records and a function extracting a key per record, and reorders them so that records with equal keys are contiguous. Since many applications only require collecting equal values, but not fully sorting the input, semisort is broadly applicabl…

    Submitted 20 April, 2023; originally announced April 2023.

  18. Towards Lightweight and Automated Representation Learning System for Networks

    Authors: Yuyang Xie, Jiezhong Qiu, Laxman Dhulipala, Wenjian Yu, Jie Tang, Richard Peng, Chi Wang

    Abstract: We propose LIGHTNE 2.0, a cost-effective, scalable, automated, and high-quality network embedding system that scales to graphs with hundreds of billions of edges on a single machine. In contrast to the mainstream belief that distributed architecture and GPUs are needed for large-scale network embedding with good quality, we prove that we can achieve higher quality, better scalability, lower cost,…

    Submitted 13 February, 2023; originally announced February 2023.

    Journal ref: IEEE Transactions on Knowledge and Data Engineering, 2023

  19. arXiv:2212.03375

    cs.LG math.PR math.ST stat.ML

    General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulation

    Authors: Promit Chakroborty, Somayajulu L. N. Dhulipala, Yifeng Che, Wen Jiang, Benjamin W. Spencer, Jason D. Hales, Michael D. Shields

    Abstract: Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity…

    Submitted 6 December, 2022; originally announced December 2022.

  20. arXiv:2211.10887

    cs.DS

    Differential Privacy from Locally Adjustable Graph Algorithms: $k$-Core Decomposition, Low Out-Degree Ordering, and Densest Subgraphs

    Authors: Laxman Dhulipala, Quanquan C. Liu, Sofya Raskhodnikova, Jessica Shi, Julian Shun, Shangdi Yu

    Abstract: Differentially private algorithms allow large-scale data analytics while preserving user privacy. Designing such algorithms for graph data is gaining importance with the growth of large networks that model various (sensitive) relationships between individuals. While there exists a rich history of important literature in this space, to the best of our knowledge, no results formalize a relationship…

    Submitted 20 November, 2022; originally announced November 2022.

  21. arXiv:2211.10516

    cs.DB cs.DC cs.DS cs.PF

    PIM-tree: A Skew-resistant Index for Processing-in-Memory

    Authors: Hongbo Kang, Yiwei Zhao, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Charles McGuffey, Phillip B. Gibbons

    Abstract: The performance of today's in-memory indexes is bottlenecked by the memory latency/bandwidth wall. Processing-in-memory (PIM) is an emerging approach that potentially mitigates this bottleneck, by enabling low-latency memory access whose aggregate memory bandwidth scales with the number of PIM nodes. There is an inherent tension, however, between minimizing inter-node communication and achieving l…

    Submitted 18 November, 2022; originally announced November 2022.

    MSC Class: 68P05; ACM Class: H.2.4

  22. arXiv:2209.09349

    stat.ML cs.LG

    Physics-Informed Machine Learning of Dynamical Systems for Efficient Bayesian Inference

    Authors: Somayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields

    Abstract: Although the No-U-Turn Sampler (NUTS) is a widely adopted method for performing Bayesian inference, it requires numerous posterior gradients, which can be expensive to compute in practice. Recently, there has been significant interest in physics-based machine learning of dynamical (or Hamiltonian) systems, and Hamiltonian neural networks (HNNs) are a noteworthy architecture. But these types of arch…

    Submitted 19 September, 2022; originally announced September 2022.

  23. arXiv:2208.06120

    cs.LG stat.CO stat.ML

    Bayesian Inference with Latent Hamiltonian Neural Networks

    Authors: Somayajulu L. N. Dhulipala, Yifeng Che, Michael D. Shields

    Abstract: When sampling for Bayesian inference, one popular approach is to use Hamiltonian Monte Carlo (HMC) and specifically the No-U-Turn Sampler (NUTS) which automatically decides the end time of the Hamiltonian trajectory. However, HMC and NUTS can require numerous numerical gradients of the target density, and can prove slow in practice. We propose Hamiltonian neural networks (HNNs) with HMC and NUTS f…

    Submitted 24 October, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: Added code repository (https://github.com/IdahoLabResearch/BIhNNs)

    MSC Class: 60J22; 68T07; 65C05; 37Jxx; 62F15

  24. arXiv:2207.01834

    cs.CG

    ParGeo: A Library for Parallel Computational Geometry

    Authors: Yiqiu Wang, Rahul Yesantharao, Shangdi Yu, Laxman Dhulipala, Yan Gu, Julian Shun

    Abstract: This paper presents ParGeo, a multicore library for computational geometry. ParGeo contains modules for fundamental tasks including $k$d-tree based spatial search, spatial graph generation, and algorithms in computational geometry. We focus on three new algorithmic contributions provided in the library. First, we present a new parallel convex hull algorithm based on a reservation technique to en…

    Submitted 5 July, 2022; originally announced July 2022.

  25. arXiv:2206.11654

    cs.DS cs.DC

    Hierarchical Agglomerative Graph Clustering in Poly-Logarithmic Depth

    Authors: Laxman Dhulipala, David Eisenstat, Jakub Łącki, Vahab Mirrokni, Jessica Shi

    Abstract: Obtaining scalable algorithms for hierarchical agglomerative clustering (HAC) is of significant interest due to the massive size of real-world datasets. At the same time, efficiently parallelizing HAC is difficult due to the seemingly sequential nature of the algorithm. In this paper, we address this issue and present ParHAC, the first efficient parallel HAC algorithm with sublinear depth for the…

    Submitted 23 June, 2022; originally announced June 2022.

  26. arXiv:2206.10829

    cs.LG math.DS math.PR

    Efficient Interdependent Systems Recovery Modeling with DeepONets

    Authors: Somayajulu L. N. Dhulipala, Ryan C. Hruska

    Abstract: Modeling the recovery of interdependent critical infrastructure is a key component of quantifying and optimizing societal resilience to disruptive events. However, simulating the recovery of large-scale interdependent systems under random disruptive events is computationally expensive. Therefore, we propose the application of Deep Operator Networks (DeepONets) in this paper to accelerate the recov…

    Submitted 22 June, 2022; originally announced June 2022.

  27. arXiv:2205.04956

    cs.DS cs.CC cs.DC

    Parallel Batch-Dynamic Minimum Spanning Forest and the Efficiency of Dynamic Agglomerative Graph Clustering

    Authors: Tom Tseng, Laxman Dhulipala, Julian Shun

    Abstract: Hierarchical agglomerative clustering (HAC) is a popular algorithm for clustering data, but despite its importance, no dynamic algorithms for HAC with good theoretical guarantees exist. In this paper, we study dynamic HAC on edge-weighted graphs. As single-linkage HAC reduces to computing a minimum spanning forest (MSF), our first result is a parallel batch-dynamic algorithm for maintaining MSFs.…

    Submitted 12 July, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: SPAA 2022

  28. arXiv:2204.06077

    cs.DS cs.DC

    PaC-trees: Supporting Parallel and Compressed Purely-Functional Collections

    Authors: Laxman Dhulipala, Guy E. Blelloch, Yan Gu, Yihan Sun

    Abstract: Many modern programming languages are shifting toward a functional style for collection interfaces such as sets, maps, and sequences. Functional interfaces offer many advantages, including being safe for parallelism and providing simple and lightweight snapshots. However, existing high-performance functional interfaces such as PAM, which are based on balanced purely-functional trees, incur large s…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: This is a preliminary version of a paper that will appear at the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2022)

  29. arXiv:2112.06188

    cs.DS cs.DB

    Parallel Batch-Dynamic $k$d-Trees

    Authors: Rahul Yesantharao, Yiqiu Wang, Laxman Dhulipala, Julian Shun

    Abstract: $k$d-trees are widely used in parallel databases to support efficient neighborhood/similarity queries. Supporting parallel updates to $k$d-trees is therefore an important operation. In this paper, we present BDL-tree, a parallel, batch-dynamic implementation of a $k$d-tree that allows for efficient parallel $k…

    Submitted 12 December, 2021; originally announced December 2021.

  30. arXiv:2111.10980

    cs.DC cs.DS

    Theoretically and Practically Efficient Parallel Nucleus Decomposition

    Authors: Jessica Shi, Laxman Dhulipala, Julian Shun

    Abstract: This paper studies the nucleus decomposition problem, which has been shown to be useful in finding dense substructures in graphs. We present a novel parallel algorithm that is efficient both in theory and in practice. Our algorithm achieves a work complexity matching the best sequential algorithm while also having low depth (parallel running time), which significantly improves upon the only existi…

    Submitted 11 August, 2022; v1 submitted 21 November, 2021; originally announced November 2021.

  31. arXiv:2108.01731

    cs.SI cs.AI cs.DC cs.LG

    Scalable Community Detection via Parallel Correlation Clustering

    Authors: Jessica Shi, Laxman Dhulipala, David Eisenstat, Jakub Łącki, Vahab Mirrokni

    Abstract: Graph clustering and community detection are central problems in modern data mining. The increasing need for analyzing billion-scale data calls for faster and more scalable algorithms for these problems. There are certain trade-offs between the quality and speed of such clustering algorithms. In this paper, we design scalable algorithms that achieve high quality when evaluated based on ground trut…

    Submitted 27 July, 2021; originally announced August 2021.

    Comments: This is a preliminary version of a paper that will appear at VLDB'21

  32. arXiv:2106.13790

    stat.ML cs.LG stat.AP

    Active Learning with Multifidelity Modeling for Efficient Rare Event Simulation

    Authors: S. L. N. Dhulipala, M. D. Shields, B. W. Spencer, C. Bolisetti, A. E. Slaughter, V. M. Laboure, P. Chakroborty

    Abstract: While multifidelity modeling provides a cost-effective way to conduct uncertainty quantification with computationally expensive models, much greater efficiency can be achieved by adaptively deciding the number of required high-fidelity (HF) simulations, depending on the type and complexity of the problem and the desired accuracy in the results. We propose a framework for active learning with multi…

    Submitted 25 June, 2021; originally announced June 2021.

  33. arXiv:2106.05610

    cs.DS cs.AI cs.CV cs.LG

    Hierarchical Agglomerative Graph Clustering in Nearly-Linear Time

    Authors: Laxman Dhulipala, David Eisenstat, Jakub Łącki, Vahab Mirrokni, Jessica Shi

    Abstract: We study the widely used hierarchical agglomerative clustering (HAC) algorithm on edge-weighted graphs. We define an algorithmic framework for hierarchical agglomerative graph clustering that provides the first efficient $\tilde{O}(m)$ time exact algorithms for classic linkage measures, such as complete- and WPGMA-linkage, as well as other measures. Furthermore, for average-linkage, arguably the m…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: This is the full version of the paper appearing in ICML'21

  34. arXiv:2106.04727

    cs.DS cs.DB cs.DC cs.LG

    ParChain: A Framework for Parallel Hierarchical Agglomerative Clustering using Nearest-Neighbor Chain

    Authors: Shangdi Yu, Yiqiu Wang, Yan Gu, Laxman Dhulipala, Julian Shun

    Abstract: This paper studies the hierarchical clustering problem, where the goal is to produce a dendrogram that represents clusters at varying scales of a data set. We propose the ParChain framework for designing parallel hierarchical agglomerative clustering (HAC) algorithms, and using the framework we obtain novel parallel algorithms for the complete linkage, average linkage, and Ward's linkage criteria.…

    Submitted 14 February, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

  35. arXiv:2106.03824

    cs.DS cs.DC

    Parallel Batch-Dynamic Algorithms for $k$-Core Decomposition and Related Graph Problems

    Authors: Quanquan C. Liu, Jessica Shi, Shangdi Yu, Laxman Dhulipala, Julian Shun

    Abstract: Maintaining a $k$-core decomposition quickly in a dynamic graph has important applications in network analysis. The main challenge for designing efficient exact algorithms is that a single update to the graph can cause significant global changes. Our paper focuses on approximation algorithms with small approximation factors that are much more efficient than what exact algorithms can obtain.…

    Submitted 26 September, 2023; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Abstract truncated per arXiv limits; add densest subgraph observation, fix Table 3 typos

  36. arXiv:2012.11188

    cs.DB cs.DC cs.DS

    Parallel Index-Based Structural Graph Clustering and Its Approximation

    Authors: Tom Tseng, Laxman Dhulipala, Julian Shun

    Abstract: SCAN (Structural Clustering Algorithm for Networks) is a well-studied, widely used graph clustering algorithm. For large graphs, however, sequential SCAN variants are prohibitively slow, and parallel SCAN variants do not effectively share work among queries with different SCAN parameter settings. Since users of SCAN often explore many parameter settings to find good clusterings, it is worthwhile t…

    Submitted 30 March, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

  37. arXiv:2009.11552

    cs.DC cs.DS

    Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice

    Authors: Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Łącki, Vahab Mirrokni, Warren Schudy

    Abstract: We study fundamental graph problems such as graph connectivity, minimum spanning forest (MSF), and approximate maximum (weight) matching in a distributed setting. In particular, we focus on the Adaptive Massively Parallel Computation (AMPC) model, which is a theoretical model that captures MapReduce-like computation augmented with a distributed hash table. We show the first AMPC algorithms for a…

    Submitted 24 September, 2020; originally announced September 2020.

  38. Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs

    Authors: Changwan Hong, Laxman Dhulipala, Julian Shun

    Abstract: Connected components and spanning forest are fundamental graph algorithms due to their use in many important applications, such as graph clustering and image segmentation. GPUs are an ideal platform for graph algorithms due to their high peak performance and memory bandwidth. While there exist several GPU connectivity algorithms in the literature, many design choices have not yet been explored. In…

    Submitted 26 August, 2020; originally announced August 2020.

    Journal ref: Proceedings of the 2020 International Conference on Parallel Architectures and Compilation Techniques

  39. arXiv:2008.03909

    cs.DC cs.DS

    ConnectIt: A Framework for Static and Incremental Parallel Graph Connectivity Algorithms

    Authors: Laxman Dhulipala, Changwan Hong, Julian Shun

    Abstract: Connected components is a fundamental kernel in graph applications. The fastest existing parallel multicore algorithms for connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as va…

    Submitted 24 August, 2021; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: This is an extended version of a paper in PVLDB Volume 14 (appeared at VLDB'21)

  40. arXiv:2003.13585

    cs.DS cs.DC

    Parallel Batch-Dynamic $k$-Clique Counting

    Authors: Laxman Dhulipala, Quanquan C. Liu, Julian Shun, Shangdi Yu

    Abstract: In this paper, we study new batch-dynamic algorithms for the $k$-clique counting problem, which are dynamic algorithms where the updates are batches of edge insertions and deletions. We study this problem in the parallel setting, where the goal is to obtain algorithms with low (polylogarithmic) depth. Our first result is a new parallel batch-dynamic triangle counting algorithm with…

    Submitted 13 December, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

  41. arXiv:2002.10047

    cs.DS cs.DC

    Parallel Clique Counting and Peeling Algorithms

    Authors: Jessica Shi, Laxman Dhulipala, Julian Shun

    Abstract: We present a new parallel algorithm for $k$-clique counting/listing that has polylogarithmic span (parallel time) and is work-efficient (matches the work of the best sequential algorithm) for sparse graphs. Our algorithm is based on computing low out-degree orientations, which we present new linear-work and polylogarithmic-span algorithms for computing in parallel. We also present new parallel alg…

    Submitted 16 July, 2021; v1 submitted 23 February, 2020; originally announced February 2020.

  42. Parallel Batch-dynamic Trees via Change Propagation

    Authors: Umut A. Acar, Daniel Anderson, Guy E. Blelloch, Laxman Dhulipala, Sam Westrick

    Abstract: The dynamic trees problem is to maintain a forest subject to edge insertions and deletions while facilitating queries such as connectivity, path weights, and subtree weights. Dynamic trees are a fundamental building block of a large number of graph algorithms. Although traditionally studied in the single-update setting, dynamic algorithms capable of supporting batches of updates are increasingly r…

    Submitted 17 May, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of The 28th Annual European Symposium on Algorithms (ESA '20) (2020) 2:1-2:23

  43. arXiv:1911.07260

    cs.PL cs.DC

    Optimizing Ordered Graph Algorithms with GraphIt

    Authors: Yunming Zhang, Ajay Brahmakshatriya, Xinyi Chen, Laxman Dhulipala, Shoaib Kamil, Saman Amarasinghe, Julian Shun

    Abstract: Many graph problems can be solved using ordered parallel graph algorithms that achieve significant speedup over their unordered counterparts by reducing redundant work. This paper introduces a new priority-based extension to GraphIt, a domain-specific language for writing graph applications, to simplify writing high-performance parallel ordered graph algorithms. The extension enables vertices to b…

    Submitted 26 January, 2020; v1 submitted 17 November, 2019; originally announced November 2019.

    Journal ref: CGO 2020

  44. arXiv:1910.12310  [pdf, other]

    cs.DC cs.DS

    Sage: Parallel Semi-Asymmetric Graph Algorithms for NVRAMs

    Authors: Laxman Dhulipala, Charlie McGuffey, Hongbo Kang, Yan Gu, Guy E. Blelloch, Phillip B. Gibbons, Julian Shun

    Abstract: Non-volatile main memory (NVRAM) technologies provide an attractive set of features for large-scale graph analytics, including byte-addressability, low idle power, and improved memory density. NVRAM systems today have an order of magnitude more NVRAM than traditional memory (DRAM). NVRAM systems could therefore potentially allow very large graph problems to be solved on a single machine, at a mode…

    Submitted 28 May, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

    Comments: This is an extended version of a paper in PVLDB (to be presented at VLDB'20)

  45. arXiv:1910.05385  [pdf, other]

    cs.DS cs.DC

    Near-Optimal Massively Parallel Graph Connectivity

    Authors: Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Łącki, Vahab Mirrokni

    Abstract: Identifying the connected components of a graph, apart from being a fundamental problem with countless applications, is a key primitive for many other algorithms. In this paper, we consider this problem in parallel settings. In particular, we focus on the Massively Parallel Computations (MPC) model, which is the standard theoretical model for modern parallel frameworks such as MapReduce, Hadoop, or…

    Submitted 11 March, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: A preliminary version of this paper is to appear in the proceedings of The 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2019)

  46. arXiv:1908.01956  [pdf, ps, other]

    cs.DS

    Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds

    Authors: David Durfee, Laxman Dhulipala, Janardhan Kulkarni, Richard Peng, Saurabh Sawlani, Xiaorui Sun

    Abstract: In this paper we study the problem of dynamically maintaining graph properties under batches of edge insertions and deletions in the massively parallel model of computation. In this setting, the graph is stored on a number of machines, each having space strongly sublinear with respect to the number of vertices, that is, $n^ε$ for some constant $0 < ε < 1$. Our goal is to handle batches of updates a…

    Submitted 6 August, 2019; originally announced August 2019.

  47. arXiv:1905.07533  [pdf, other]

    cs.DC cs.DS

    Massively Parallel Computation via Remote Memory Access

    Authors: Soheil Behnezhad, Laxman Dhulipala, Hossein Esfandiari, Jakub Łącki, Warren Schudy, Vahab Mirrokni

    Abstract: We introduce the Adaptive Massively Parallel Computation (AMPC) model, which is an extension of the Massively Parallel Computation (MPC) model. At a high level, the AMPC model strengthens the MPC model by storing all messages sent within a round in a distributed data store. In the following round, all machines are provided with random read access to the data store, subject to the same constraints…

    Submitted 18 May, 2019; originally announced May 2019.

  48. arXiv:1904.08380  [pdf, other]

    cs.DC cs.DS cs.PL

    Low-Latency Graph Streaming Using Compressed Purely-Functional Trees

    Authors: Laxman Dhulipala, Julian Shun, Guy Blelloch

    Abstract: Due to the dynamic nature of real-world graphs, there has been a growing interest in the graph-streaming setting where a continuous stream of graph updates is mixed with arbitrary graph queries. In principle, purely-functional trees are an ideal choice for this setting, as they enable safe parallelism, lightweight snapshots, and strict serializability for queries. However, directly using them f…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: This is the full version of the paper appearing in the ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), 2019

  49. Parallel Batch-Dynamic Graph Connectivity

    Authors: Umut A. Acar, Daniel Anderson, Guy E. Blelloch, Laxman Dhulipala

    Abstract: In this paper, we study batch parallel algorithms for the dynamic connectivity problem, a fundamental problem that has received considerable attention in the sequential setting. The best-known sequential algorithm for dynamic connectivity is the elegant level-set algorithm of Holm, de Lichtenberg and Thorup (HDT), which achieves $O(\log^2 n)$ amortized time per edge insertion or deletion, and…

    Submitted 17 May, 2020; v1 submitted 20 March, 2019; originally announced March 2019.

    Comments: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 2019

    Journal ref: Proceedings of The 31st ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '19) (2019) 381-392

  50. arXiv:1810.10738  [pdf, other]

    cs.DS

    Batch-Parallel Euler Tour Trees

    Authors: Thomas Tseng, Laxman Dhulipala, Guy Blelloch

    Abstract: The dynamic trees problem is to maintain a forest undergoing edge insertions and deletions while supporting queries for information such as connectivity. There are many existing data structures for this problem, but few of them are capable of exploiting parallelism in the batch setting, in which large batches of edges are inserted into or deleted from the forest at once. In this paper, we demonstrate t…

    Submitted 5 March, 2022; v1 submitted 25 October, 2018; originally announced October 2018.

    Comments: Edits: fix typo in bibliography, fix definition of "with high probability" used in this paper