Showing 1–18 of 18 results for author: Hovland, P

  1. arXiv:2408.13316  [pdf, other]

    quant-ph cs.ET

    QuCLEAR: Clifford Extraction and Absorption for Significant Reduction in Quantum Circuit Size

    Authors: Ji Liu, Alvin Gonzales, Benchen Huang, Zain Hamid Saleem, Paul Hovland

    Abstract: Quantum computing carries significant potential for addressing practical problems. However, currently available quantum devices suffer from noisy quantum gates, which degrade the fidelity of executed quantum circuits. Therefore, quantum circuit optimization is crucial for obtaining useful results. In this paper, we present QuCLEAR, a compilation framework designed to optimize quantum circuits. QuC…

    Submitted 23 August, 2024; originally announced August 2024.

    Comments: 13 pages, 9 figures, 2 tables

  2. arXiv:2404.17039  [pdf, other]

    cs.MS math.NA

    Differentiating Through Linear Solvers

    Authors: Paul Hovland, Jan Hückelheim

    Abstract: Computer programs containing calls to linear solvers are a known challenge for automatic differentiation. Previous publications advise against differentiating through the low-level solver implementation, and instead advocate for high-level approaches that express the derivative in terms of a modified linear system that can be solved with a separate solver call. Despite this ubiquitous advice, we a…

    Submitted 6 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2402.09222  [pdf, other]

    cs.PF

    Integrating ytopt and libEnsemble to Autotune OpenMC

    Authors: Xingfu Wu, John R. Tramm, Jeffrey Larson, John-Luke Navarro, Prasanna Balaprakash, Brice Videau, Michael Kruse, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: ytopt is a Python machine-learning-based autotuning software package developed within the ECP PROTEAS-TUNE project. The ytopt software adopts an asynchronous search framework that consists of sampling a small number of input parameter configurations and progressively fitting a surrogate model over the input-output space until exhausting the user-defined maximum number of evaluations or the wall-cl…

    Submitted 14 February, 2024; originally announced February 2024.

  4. Transfer-Learning-Based Autotuning Using Gaussian Copula

    Authors: Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash

    Abstract: As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computatio…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2023, available at https://dl.acm.org/doi/10.1145/3577193.3593712

    ACM Class: I.2.4; G.3; D.2.8

    Journal ref: Proceedings of the 37th International Conference on Supercomputing (2023) 37-49

  5. arXiv:2305.18198  [pdf, ps, other]

    cs.PL cs.DC

    Model Checking Race-freedom When "Sequential Consistency for Data-race-free Programs" is Guaranteed

    Authors: Wenhao Wu, Jan Hückelheim, Paul D. Hovland, Ziqing Luo, Stephen F. Siegel

    Abstract: Many parallel programming models guarantee that if all sequentially consistent (SC) executions of a program are free of data races, then all executions of the program will appear to be sequentially consistent. This greatly simplifies reasoning about the program, but leaves open the question of how to verify that all SC executions are race-free. In this paper, we show that with a few simple modific…

    Submitted 20 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  6. arXiv:2305.07546  [pdf, other]

    math.NA cs.AI cs.CE

    Understanding Automatic Differentiation Pitfalls

    Authors: Jan Hückelheim, Harshitha Menon, William Moses, Bruce Christianson, Paul Hovland, Laurent Hascoët

    Abstract: Automatic differentiation, also known as backpropagation, AD, autodiff, or algorithmic differentiation, is a popular technique for computing derivatives of computer programs accurately and efficiently. Sometimes, however, the derivatives computed by AD could be interpreted as incorrect. These pitfalls occur systematically across tools and approaches. In this paper we broadly categorize problematic…

    Submitted 12 May, 2023; originally announced May 2023.

  7. arXiv:2305.02939  [pdf, other]

    quant-ph cs.ET

    Tackling the Qubit Mapping Problem with Permutation-Aware Synthesis

    Authors: Ji Liu, Ed Younis, Mathias Weiden, Paul Hovland, John Kubiatowicz, Costin Iancu

    Abstract: We propose a novel hierarchical qubit mapping and routing algorithm. First, a circuit is decomposed into blocks that span an identical number of qubits. In the second stage permutation-aware synthesis (PAS), each block is optimized and synthesized in isolation. In the third stage a permutation-aware mapping (PAM) algorithm maps the blocks to the target device based on the information from the seco…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 12 pages, 9 figures, 5 tables

  8. arXiv:2303.16245  [pdf, other]

    cs.DC cs.LG cs.PF

    ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

    Authors: Xingfu Wu, Prasanna Balaprakash, Michael Kruse, Jaehoon Koo, Brice Videau, Paul Hovland, Valerie Taylor, Brad Geltz, Siddhartha Jana, Mary Hall

    Abstract: As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints has become critical and challenging. We propose a low-overhead autotuning framework to autotune performance and energy for various hybrid MPI/OpenMP scientific applications at large scales and to explore the tradeoffs between application r…

    Submitted 28 March, 2023; originally announced March 2023.

    Journal ref: to be published in CUG2023

  9. QContext: Context-Aware Decomposition for Quantum Gates

    Authors: Ji Liu, Max Bowman, Pranav Gokhale, Siddharth Dangwal, Jeffrey Larson, Frederic T. Chong, Paul D. Hovland

    Abstract: In this paper we propose QContext, a new compiler structure that incorporates context-aware and topology-aware decompositions. Because of circuit equivalence rules and resynthesis, variants of a gate-decomposition template may exist. QContext exploits the circuit information and the hardware topology to select the gate variant that increases circuit optimization opportunities. We study the basis-g…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: 10 pages

  10. arXiv:2105.04555  [pdf]

    cs.PL cs.AI cs.DC cs.LG cs.PF

    Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations

    Authors: Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall

    Abstract: Polly is the LLVM project's polyhedral loop nest optimizer. Recently, user-directed loop transformation pragmas were proposed based on LLVM/Clang and Polly. The search space exposed by the transformation pragmas is a tree, wherein each node represents a specific combination of loop transformations that can be applied to the code resulting from the parent node's loop transformations. We have develo…

    Submitted 10 May, 2021; originally announced May 2021.

  11. arXiv:2104.13242  [pdf, other]

    cs.LG cs.PF

    Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization (extended version)

    Authors: Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: In this paper, we develop a ytopt autotuning framework that leverages Bayesian optimization to explore the parameter space search and compare four different supervised learning methods within Bayesian optimization and evaluate their effectiveness. We select six of the most complex PolyBench benchmarks and apply the newly developed LLVM Clang/Polly loop optimization pragmas to the benchmarks to opt…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: Submitted to CCPE journal. arXiv admin note: substantial text overlap with arXiv:2010.08040

  12. arXiv:2010.08040  [pdf, other]

    cs.PF cs.LG cs.PL

    Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization

    Authors: Xingfu Wu, Michael Kruse, Prasanna Balaprakash, Hal Finkel, Paul Hovland, Valerie Taylor, Mary Hall

    Abstract: Autotuning is an approach that explores a search space of possible implementations/configurations of a kernel or an application by selecting and evaluating a subset of implementations/configurations on a target platform and/or using models to identify a high performance implementation/configuration. In this paper, we develop an autotuning framework that leverages Bayesian optimization to explore…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: to be published in the 11th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS20)

  13. arXiv:1909.02836  [pdf, other]

    cs.MS

    Computing Derivatives for PETSc Adjoint Solvers using Algorithmic Differentiation

    Authors: J. G. Wallwork, P. Hovland, H. Zhang, O. Marin

    Abstract: Most nonlinear partial differential equation (PDE) solvers require the Jacobian matrix associated to the differential operator. In PETSc, this is typically achieved by either an analytic derivation or numerical approximation method such as finite differences. For complex applications, hand-coding the Jacobian can be time-consuming and error-prone, yet computationally efficient. Whilst finite diffe…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: 14 pages, 3 figures, 2 listings, 1 table

    MSC Class: 68U01

  14. arXiv:1907.02818  [pdf, other]

    cs.DC cs.LG cs.PF cs.SC

    Automatic Differentiation for Adjoint Stencil Loops

    Authors: Jan Hückelheim, Navjot Kukreja, Sri Hari Krishna Narayanan, Fabio Luporini, Gerard Gorman, Paul Hovland

    Abstract: Stencil loops are a common motif in computations including convolutional neural networks, structured-mesh solvers for partial differential equations, and image processing. Stencil loops are easy to parallelise, and their fast execution is aided by compilers, libraries, and domain-specific languages. Reverse-mode automatic differentiation, also known as algorithmic differentiation, autodiff, adjoin…

    Submitted 5 July, 2019; originally announced July 2019.

    Comments: ICPP 2019

  15. arXiv:1903.03051  [pdf, other]

    cs.DC

    Training on the Edge: The why and the how

    Authors: Navjot Kukreja, Alena Shilova, Olivier Beaumont, Jan Huckelheim, Nicola Ferrier, Paul Hovland, Gerard Gorman

    Abstract: Edge computing is the natural progression from Cloud computing, where, instead of collecting all data and processing it centrally, like in a cloud computing environment, we distribute the computing power and try to do as much processing as possible, close to the source of the data. There are various reasons this model is being adopted quickly, including privacy, and reduced power and bandwidth req…

    Submitted 13 February, 2019; originally announced March 2019.

    Comments: Submitted to PAISE 2019

  16. arXiv:1810.05268  [pdf, other]

    cs.CE

    Combining Checkpointing and Data Compression to Accelerate Adjoint-Based Optimization Problems

    Authors: Navjot Kukreja, Jan Hueckelheim, Mathias Louboutin, Fabio Luporini, Paul Hovland, Gerard Gorman

    Abstract: Seismic inversion and imaging are adjoint-based optimization problems that process up to terabytes of data, regularly exceeding the memory capacity of available computers. Data compression is an effective strategy to reduce this memory requirement by a certain factor, particularly if some loss in accuracy is acceptable. A popular alternative is checkpointing, where data is stored at selected point…

    Submitted 20 September, 2021; v1 submitted 11 October, 2018; originally announced October 2018.

    Comments: Accepted in European Conference on Parallel Processing (EuroPar) 2019. Part of the Lecture Notes in Computer Science book series (LNCS, volume 11725)

  17. arXiv:1705.07478  [pdf, other]

    cs.DC

    Report of the HPC Correctness Summit, Jan 25–26, 2017, Washington, DC

    Authors: Ganesh Gopalakrishnan, Paul D. Hovland, Costin Iancu, Sriram Krishnamoorthy, Ignacio Laguna, Richard A. Lethin, Koushik Sen, Stephen F. Siegel, Armando Solar-Lezama

    Abstract: Maintaining leadership in HPC requires the ability to support simulations at large scales and fidelity. In this study, we detail one of the most significant productivity challenges in achieving this goal, namely the increasing proclivity to bugs, especially in the face of growing hardware and software heterogeneity and sheer system scale. We identify key areas where timely new research must be pro…

    Submitted 21 May, 2017; originally announced May 2017.

    Comments: 57 pages

  18. arXiv:1309.1780  [pdf, ps, other]

    cs.CE cs.MS cs.SE

    Software Abstractions and Methodologies for HPC Simulation Codes on Future Architectures

    Authors: A. Dubey, S. Brandt, R. Brower, M. Giles, P. Hovland, D. Q. Lamb, F. Loffler, B. Norris, B. OShea, C. Rebbi, M. Snir, R. Thakur

    Abstract: Large, complex, multi-scale, multi-physics simulation codes, running on high performance computing (HPC) platforms, have become essential to advancing science and engineering. These codes simulate multi-scale, multi-physics phenomena with unprecedented fidelity on petascale platforms, and are used by large communities. Continued ability of these codes to run on future platforms is as crucial to t…

    Submitted 6 September, 2013; originally announced September 2013.

    Comments: Position Paper