Showing 1–4 of 4 results for author: Mathuriya, A

  1. arXiv:1808.04728  [pdf, other]

    astro-ph.CO astro-ph.IM cs.LG physics.comp-ph

    CosmoFlow: Using Deep Learning to Learn the Universe at Scale

    Authors: Amrita Mathuriya, Deborah Bard, Peter Mendygral, Lawrence Meadows, James Arnemann, Lei Shao, Siyu He, Tuomas Karna, Daina Moise, Simon J. Pennycook, Kristyn Maschoff, Jason Sewall, Nalini Kumar, Shirley Ho, Mike Ringenburg, Prabhat, Victor Lee

    Abstract: Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many el…

    Submitted 9 November, 2018; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: 11 pages, 6 pages, presented at SuperComputing 2018

  2. arXiv:1712.09388  [pdf, other]

    cs.DC

    Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer

    Authors: Amrita Mathuriya, Thorsten Kurth, Vivek Rane, Mustafa Mustafa, Lei Shao, Debbie Bard, Prabhat, Victor W Lee

    Abstract: We explore scaling of the standard distributed Tensorflow with GRPC primitives on up to 512 Intel Xeon Phi (KNL) nodes of Cori supercomputer with synchronous stochastic gradient descent (SGD), and identify causes of scaling inefficiency at higher node counts. To our knowledge, this is the first exploration of distributed GRPC Tensorflow scalability on a HPC supercomputer at such large scale with s…

    Submitted 26 December, 2017; originally announced December 2017.

    Comments: Published as a poster in NIPS 2017 Workshop: Deep Learning At Supercomputer Scale

  3. Embracing a new era of highly efficient and productive quantum Monte Carlo simulations

    Authors: Amrita Mathuriya, Ye Luo, Raymond C. Clay III, Anouar Benali, Luke Shulenburger, Jeongnim Kim

    Abstract: QMCPACK has enabled cutting-edge materials research on supercomputers for over a decade. It scales nearly ideally but has low single-node efficiency due to the physics-based abstractions using array-of-structures objects, causing inefficient vectorization. We present a systematic approach to transform QMCPACK to better exploit the new hardware features of modern CPUs in portable and maintainable w…

    Submitted 8 August, 2017; originally announced August 2017.

    Comments: 12 pages, 10 figures, 2 tables, to be published at SC17

  4. Optimization and parallelization of B-spline based orbital evaluations in QMC on multi/many-core shared memory processors

    Authors: Amrita Mathuriya, Ye Luo, Anouar Benali, Luke Shulenburger, Jeongnim Kim

    Abstract: B-spline based orbital representations are widely used in Quantum Monte Carlo (QMC) simulations of solids, historically taking as much as 50% of the total run time. Random accesses to a large four-dimensional array make it challenging to efficiently utilize caches and wide vector units of modern CPUs. We present node-level optimizations of B-spline evaluations on multi/many-core shared memory proc…

    Submitted 8 November, 2016; originally announced November 2016.

    Comments: 11 pages, 10 figures, 4 tables