Showing 1–3 of 3 results for author: Eliuk, S

  1. arXiv:1812.00669  [pdf, other]

    cs.PF

    Hoard: A Distributed Data Caching System to Accelerate Deep Learning Training on the Cloud

    Authors: Christian Pinto, Yiannis Gkoufas, Andrea Reale, Seetharami Seelam, Steven Eliuk

    Abstract: Deep Learning system architects strive to design a balanced system where the computational accelerator -- FPGA, GPU, etc, is not starved for data. Feeding training data fast enough to effectively keep the accelerator utilization high is difficult when utilizing dedicated hardware like GPUs. As accelerators are getting faster, the storage media & data buses feeding the data have not kept pace and…

    Submitted 3 December, 2018; originally announced December 2018.

    Comments: 12 pages, 5 figures

  2. arXiv:1611.07819  [pdf, other]

    cs.DC cs.MS cs.NE

    dMath: Distributed Linear Algebra for DL

    Authors: Steven Eliuk, Cameron Upright, Hars Vardhan, Stephen Walsh, Trevor Gale

    Abstract: The paper presents a parallel math library, dMath, that demonstrates leading scaling when using intranode, internode, and hybrid-parallelism for deep learning (DL). dMath provides easy-to-use distributed primitives and a variety of domain-specific algorithms including matrix multiplication, convolutions, and others allowing for rapid development of scalable applications like deep neural networks (…

    Submitted 18 November, 2016; originally announced November 2016.

    Comments: 5 pages. arXiv admin note: text overlap with arXiv:1604.01416

  3. arXiv:1604.01416  [pdf, other]

    cs.NE cs.DC cs.MS

    dMath: A Scalable Linear Algebra and Math Library for Heterogeneous GP-GPU Architectures

    Authors: Steven Eliuk, Cameron Upright, Anthony Skjellum

    Abstract: A new scalable parallel math library, dMath, is presented in this paper that demonstrates leading scaling when using intranode, or internode, hybrid-parallelism for deep-learning. dMath provides easy-to-use distributed base primitives and a variety of domain-specific algorithms. These include matrix multiplication, convolutions, and others allowing for rapid development of highly scalable applicat…

    Submitted 5 April, 2016; originally announced April 2016.