Showing 1–50 of 80 results for author: Samsi, S

Searching in archive cs.
  1. arXiv:2402.18593

    cs.AR cs.AI cs.DC

    Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale

    Authors: Dan Zhao, Siddharth Samsi, Joseph McDonald, Baolin Li, David Bestor, Michael Jones, Devesh Tiwari, Vijay Gadepally

    Abstract: As research and deployment of AI grows, the computational burden to support and sustain its progress inevitably does too. To train or fine-tune state-of-the-art models in NLP, computer vision, etc., some form of AI hardware acceleration is virtually a requirement. Recent large language models require considerable resources to train and deploy, resulting in significant energy usage, potential carbo…

    Submitted 24 February, 2024; originally announced February 2024.

  2. arXiv:2401.16437

    physics.ao-ph cs.LG

    A Benchmark Dataset for Tornado Detection and Prediction using Full-Resolution Polarimetric Weather Radar Data

    Authors: Mark S. Veillette, James M. Kurdzo, Phillip M. Stepanian, John Y. N. Cho, Siddharth Samsi, Joseph McDonald

    Abstract: Weather radar is the primary tool used by forecasters to detect and warn for tornadoes in near-real time. In order to assist forecasters in warning the public, several algorithms have been developed to automatically detect tornadic signatures in weather radar observations. Recently, Machine Learning (ML) algorithms, which learn directly from large amounts of labeled data, have been shown to be hig…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 37 pages, 15 Figures, 2 Tables

  3. arXiv:2310.09145

    cs.AI cs.DC

    Lincoln AI Computing Survey (LAICS) Update

    Authors: Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner

    Abstract: This paper is an update of the survey of AI accelerators and processors from the past four years, which is now called the Lincoln AI Computing Survey - LAICS (pronounced "lace"). As in past years, this paper collects and summarizes the current commercial accelerators that have been publicly announced with peak performance and peak power consumption numbers. The performance and power values are plotted…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: 7 pages, 6 figures, 2023 IEEE High Performance Extreme Computing (HPEC) conference, September 2023

    ACM Class: C.1.4; C.4

  4. arXiv:2310.03003

    cs.CL cs.DC

    From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

    Authors: Siddharth Samsi, Dan Zhao, Joseph McDonald, Baolin Li, Adam Michaleas, Michael Jones, William Bergeron, Jeremy Kepner, Devesh Tiwari, Vijay Gadepally

    Abstract: Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art. These technologies are increasingly being leveraged in various domains such as law, finance, and medicine. However, these models carry significant computational challenges, especially the compute and energy costs required for inference. Inference energy costs…

    Submitted 4 October, 2023; originally announced October 2023.

  5. arXiv:2310.00522

    cs.SI

    Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations

    Authors: Hayden Jananthan, Jeremy Kepner, Michael Jones, William Arcand, David Bestor, William Bergeron, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Anna Klein, Lauren Milechin, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Tyler Trigg, et al. (3 additional authors not shown)

    Abstract: Expanding the scientific tools available to protect computer networks can be aided by a deeper understanding of the underlying statistical distributions of network traffic and their potential geometric interpretations. Analyses of large scale network observations provide a unique window into studying those underlying statistics. Newly developed GraphBLAS hypersparse matrices and D4M associative ar…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: 9 pages, 7 figures, IEEE HPEC 2023 (accepted)

  6. pPython Performance Study

    Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

    Abstract: pPython seeks to provide a parallel capability that provides good speed-up without sacrificing the ease of programming in Python by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. pPython follows a SPMD (single program multiple data) model of computation. pPython runs on a single-node (e.g., a laptop) running Window…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2208.14908

  7. Deployment of Real-Time Network Traffic Analysis using GraphBLAS Hypersparse Matrices and D4M Associative Arrays

    Authors: Michael Jones, Jeremy Kepner, Andrew Prout, Timothy Davis, William Arcand, David Bestor, William Bergeron, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Lauren Milechin, Guillermo Morales, Julie Mullen, Ritesh Patel, Sandeep Pisharody, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

    Abstract: Matrix/array analysis of networks can provide significant insight into their behavior and aid in their operation and protection. Prior work has demonstrated the analytic, performance, and compression capabilities of GraphBLAS (graphblas.org) hypersparse matrices and D4M (d4m.mit.edu) associative arrays (a mathematical superset of matrices). Obtaining the benefits of these capabilities requires int…

    Submitted 8 December, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE HPEC, 8 pages, 8 figures, 1 table, 69 references. arXiv admin note: text overlap with arXiv:2203.13934, arXiv:2309.01806

  8. Focusing and Calibration of Large Scale Network Sensors using GraphBLAS Anonymized Hypersparse Matrices

    Authors: Jeremy Kepner, Michael Jones, Phil Dykstra, Chansup Byun, Timothy Davis, Hayden Jananthan, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Lauren Milechin, Guillermo Morales, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Tyler Trigg, Charles Yee, et al. (1 additional author not shown)

    Abstract: Defending community-owned cyber space requires community-based efforts. Large-scale network observations that uphold the highest regard for privacy are key to protecting our shared cyberspace. Deployment of the necessary network sensors requires careful sensor placement, focusing, and calibration with significant volumes of network observations. This paper demonstrates novel focusing and calibrati…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE HPEC, 9 pages, 12 figures, 1 table, 63 references, 2 appendices

  9. Toward Sustainable HPC: Carbon Footprint Estimation and Environmental Implications of HPC Systems

    Authors: Baolin Li, Rohan Basu Roy, Daniel Wang, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

    Abstract: The rapid growth in demand for HPC systems has led to a rise in carbon footprint, which requires urgent intervention. In this work, we present a comprehensive analysis of the carbon footprint of high-performance computing (HPC) systems, considering the carbon footprint during both the hardware manufacturing and system operational stages. Our work employs HPC hardware component carbon footprint mod…

    Submitted 18 November, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  10. Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service

    Authors: Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

    Abstract: This paper presents a solution to the challenge of mitigating carbon emissions from hosting large-scale machine learning (ML) inference services. ML inference is critical to modern technology products, but it is also a significant contributor to carbon footprint. We introduce Clover, a carbon-friendly ML inference service runtime system that balances performance, accuracy, and carbon emissions thr…

    Submitted 31 August, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

  11. arXiv:2301.11581

    cs.AI cs.CY cs.DC cs.LG

    A Green(er) World for A.I.

    Authors: Dan Zhao, Nathan C. Frey, Joseph McDonald, Matthew Hubbell, David Bestor, Michael Jones, Andrew Prout, Vijay Gadepally, Siddharth Samsi

    Abstract: As research and practice in artificial intelligence (A.I.) grow in leaps and bounds, the resources necessary to sustain and support their operations also grow at an increasing pace. While innovations and applications from A.I. have brought significant advances, from applications to vision and natural language to improvements to fields like medical imaging and materials engineering, their costs sho…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: 8 pages, published in 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

    Journal ref: D. Zhao et al., "A Green(er) World for A.I.," 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Lyon, France, 2022, pp. 742-750

  12. KAIROS: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources

    Authors: Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

    Abstract: Online inference is becoming a key service product for many businesses, deployed in cloud platforms to meet customer demands. Despite their revenue-generation capability, these services need to operate under tight Quality-of-Service (QoS) and cost budget constraints. This paper introduces KAIROS, a novel runtime framework that maximizes the query throughput while meeting QoS target and a cost budg…

    Submitted 2 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  13. AI and ML Accelerator Survey and Trends

    Authors: Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner

    Abstract: This paper updates the survey of AI accelerators and processors from the past three years. This paper collects and summarizes the current commercial accelerators that have been publicly announced with peak performance and power consumption numbers. The performance and power values are plotted on a scatter graph, and a number of dimensions and observations from the trends on this plot are again discuss…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: 10 pages, 4 figures, 2022 IEEE High Performance Extreme Computing (HPEC) Conference. arXiv admin note: substantial text overlap with arXiv:2009.00993, arXiv:2109.08957

    ACM Class: C.1.4; C.4

  14. arXiv:2209.05725

    cs.NI cs.DC

    Hypersparse Network Flow Analysis of Packets with GraphBLAS

    Authors: Tyler Trigg, Chad Meiners, Sandeep Pisharody, Hayden Jananthan, Michael Jones, Adam Michaleas, Timothy Davis, Erik Welch, William Arcand, David Bestor, William Bergeron, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Charles Yee, et al. (1 additional author not shown)

    Abstract: Internet analysis is a major challenge due to the volume and rate of network traffic. In lieu of analyzing traffic as raw packets, network analysts often rely on compressed network flows (netflows) that contain the start time, stop time, source, destination, and number of packets in each direction. However, many traffic analyses benefit from temporal aggregation of multiple simultaneous netflows,…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.13934, arXiv:2108.06653, arXiv:2008.00307

  15. arXiv:2209.05300

    cs.LG cs.DC

    An Evaluation of Low Overhead Time Series Preprocessing Techniques for Downstream Machine Learning

    Authors: Matthew L. Weiss, Joseph McDonald, David Bestor, Charles Yee, Daniel Edelman, Michael Jones, Andrew Prout, Andrew Bowne, Lindsey McEvoy, Vijay Gadepally, Siddharth Samsi

    Abstract: In this paper we address the application of pre-processing techniques to multi-channel time series data with varying lengths, which we refer to as the alignment problem, for downstream machine learning. The misalignment of multi-channel time series data may occur for a variety of reasons, such as missing data, varying sampling rates, or inconsistent collection times. We consider multi-channel time…

    Submitted 12 September, 2022; originally announced September 2022.

  16. Python Implementation of the Dynamic Distributed Dimensional Data Model

    Authors: Hayden Jananthan, Lauren Milechin, Michael Jones, William Arcand, William Bergeron, David Bestor, Chansup Byun, Michael Houle, Matthew Hubbell, Vijay Gadepally, Anna Klein, Peter Michaleas, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

    Abstract: Python has become a standard scientific computing language with fast-growing support of machine learning and data analysis modules, as well as an increasing usage of big data. The Dynamic Distributed Dimensional Data Model (D4M) offers a highly composable, unified data model with strong performance built to handle big data fast and efficiently. In this work we present an implementation of D4M in P…

    Submitted 22 November, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: 8 pages, 7 figures, accepted to HPEC 2022

  17. pPython for Parallel Python Programming

    Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Kurt Keville, Anna Klein, Peter Michaleas, Lauren Milechin, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

    Abstract: pPython seeks to provide a parallel capability that provides good speed-up without sacrificing the ease of programming in Python by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. The core data structure in pPython is a distributed numerical array whose distribution onto multiple processors is specified with a map c…

    Submitted 31 August, 2022; originally announced August 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:astro-ph/0606464

  18. MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant Systems for Machine Learning

    Authors: Baolin Li, Tirthak Patel, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari

    Abstract: GPU technology has been improving at an expedited pace in terms of size and performance, empowering HPC and AI/ML researchers to advance the scientific discovery process. However, this also leads to inefficient resource usage, as most GPU workloads, including complicated AI/ML models, are not able to utilize the GPU resources to their fullest extent -- encouraging support for GPU multi-tenancy. We…

    Submitted 6 October, 2022; v1 submitted 23 July, 2022; originally announced July 2022.

  19. arXiv:2207.07033

    cs.AI cs.CY

    Developing a Series of AI Challenges for the United States Department of the Air Force

    Authors: Vijay Gadepally, Gregory Angelides, Andrei Barbu, Andrew Bowne, Laura J. Brattain, Tamara Broderick, Armando Cabrera, Glenn Carl, Ronisha Carter, Miriam Cha, Emilie Cowen, Jesse Cummings, Bill Freeman, James Glass, Sam Goldberg, Mark Hamilton, Thomas Heldt, Kuan Wei Huang, Phillip Isola, Boris Katz, Jamie Koerner, Yen-Chen Lin, David Mayo, Kyle McAlpin, Taylor Perron, et al. (17 additional authors not shown)

    Abstract: Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requireme…

    Submitted 14 July, 2022; originally announced July 2022.

  20. Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models

    Authors: Joseph McDonald, Baolin Li, Nathan Frey, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi

    Abstract: The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need for methods that reduce the energy needs of NLP and machine learning more broadly. In this article, we investigate techniques that can be used to reduce the energy consumption of common NLP applications. In pa…

    Submitted 19 May, 2022; originally announced May 2022.

    Journal ref: Findings of the Association for Computational Linguistics: NAACL 2022

  21. The MIT Supercloud Workload Classification Challenge

    Authors: Benny J. Tang, Qiqi Chen, Matthew L. Weiss, Nathan Frey, Joseph McDonald, David Bestor, Charles Yee, William Arcand, Chansup Byun, Daniel Edelman, Matthew Hubbell, Michael Jones, Jeremy Kepner, Anna Klein, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Andrew Bowne, Lindsey McEvoy, Baolin Li, Devesh Tiwari, et al. (2 additional authors not shown)

    Abstract: High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogeneous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly larger share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute…

    Submitted 13 April, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: Accepted at IPDPS ADOPT'22

  22. arXiv:2203.13934

    cs.NI cs.DC cs.OS cs.SI

    GraphBLAS on the Edge: Anonymized High Performance Streaming of Network Traffic

    Authors: Michael Jones, Jeremy Kepner, Daniel Andersen, Aydin Buluc, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Chad Meiners, Lauren Milechin, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Jon Sreekanth, et al. (3 additional authors not shown)

    Abstract: Long range detection is a cornerstone of defense in many operating domains (land, sea, undersea, air, space, ...). In the cyber domain, long range detection requires the analysis of significant network traffic from a variety of observatories and outposts. Construction of anonymized hypersparse traffic matrices on edge network devices can be a key enabler by providing significant data compression i…

    Submitted 5 September, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: Accepted to IEEE HPEC, Outstanding Paper Award, 8 pages, 8 figures, 1 table, 70 references. arXiv admin note: text overlap with arXiv:2108.06653, arXiv:2008.00307, arXiv:2203.10230

  23. Temporal Correlation of Internet Observatories and Outposts

    Authors: Jeremy Kepner, Michael Jones, Daniel Andersen, Aydın Buluç, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Chad Meiners, Lauren Milechin, Andrew Morris, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, et al. (4 additional authors not shown)

    Abstract: The Internet has become a critical component of modern civilization requiring scientific exploration akin to endeavors to understand the land, sea, air, and space environments. Understanding the baseline statistical distributions of traffic is essential to the scientific understanding of the Internet. Correlating data from different Internet observatories and outposts can be a useful tool for gai…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: 8 pages, 8 figures, 2 tables, 59 references; accepted to GrAPL 2022. arXiv admin note: substantial text overlap with arXiv:2108.06653

  24. arXiv:2201.12423

    cs.LG cs.DC

    Benchmarking Resource Usage for Efficient Distributed Deep Learning

    Authors: Nathan C. Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, Vijay Gadepally, Siddharth Samsi

    Abstract: Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. Neural architecture searches, hyperparameter sweeps, and rapid prototyping consume immense resources that can prevent resource-constrained researchers from experimenting with large models and carry considerable environmental impact. As such, it becomes essential to understand how…

    Submitted 28 January, 2022; originally announced January 2022.

    Comments: 14 pages, 17 figures

  25. arXiv:2201.06096

    cs.NI cs.CY cs.DC cs.SI

    New Phenomena in Large-Scale Internet Traffic

    Authors: Jeremy Kepner, Kenjiro Cho, KC Claffy, Vijay Gadepally, Sarah McGuire, Lauren Milechin, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Michael Jones, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

    Abstract: The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data sets. An analysis of 50 billion packets using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance of otherwise unseen leaf nodes and isolated links in Internet traffic. Our analysis…

    Submitted 16 January, 2022; originally announced January 2022.

    Comments: 53 pages, 27 figures, 8 tables, 121 references. Portions of this work originally appeared as arXiv:1904.04396v1 which has been split for publication in the book "Massive Graph Analytics" (edited by David Bader)

  26. arXiv:2112.04977

    cs.LG cond-mat.mtrl-sci physics.chem-ph

    Bringing Atomistic Deep Learning to Prime Time

    Authors: Nathan C. Frey, Siddharth Samsi, Bharath Ramsundar, Connor W. Coley, Vijay Gadepally

    Abstract: Artificial intelligence has not yet revolutionized the design of materials and molecules. In this perspective, we identify four barriers preventing the integration of atomistic deep learning, molecular science, and high-performance computing. We outline focused research efforts to address the opportunities presented by these challenges.

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 6 pages, 1 figure, NeurIPS 2021 AI for Science workshop

  27. arXiv:2112.03364

    cs.LG cond-mat.mtrl-sci physics.chem-ph

    Scalable Geometric Deep Learning on Molecular Graphs

    Authors: Nathan C. Frey, Siddharth Samsi, Joseph McDonald, Lin Li, Connor W. Coley, Vijay Gadepally

    Abstract: Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing. Bottlenecks with respect to the amount of training data, the size and complexity of model architectures, and the scale of the compute infrastructure are all key factors limiting the scaling of deep learning for molecules and mater…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 7 pages, 3 figures, NeurIPS 2021 AI for Science workshop

  28. arXiv:2111.07140

    eess.SP cs.LG

    The Pseudo Projection Operator: Applications of Deep Learning to Projection Based Filtering in Non-Trivial Frequency Regimes

    Authors: Matthew L. Weiss, Nathan C. Frey, Siddharth Samsi, Randy C. Paffenroth, Vijay Gadepally

    Abstract: Traditional frequency based projection filters, or projection operators (PO), separate signal and noise through a series of transformations which remove frequencies where noise is present. However, this technique relies on a priori knowledge of what frequencies contain signal and noise and that these frequencies do not overlap, which is difficult to achieve in practice. To address these issues, we…

    Submitted 13 April, 2022; v1 submitted 13 November, 2021; originally announced November 2021.

  29. AI Accelerator Survey and Trends

    Authors: Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner

    Abstract: Over the past several years, new machine learning accelerators were being announced and released every month for a variety of applications from speech recognition, video object detection, assisted driving, and many data center applications. This paper updates the survey of AI accelerators and processors from the past two years. This paper collects and summarizes the current commercial accelerators tha…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 9 pages, 2 figures, IEEE High Performance Extreme Computing Conference 2021

    ACM Class: C.1.4; C.4

  30. 3D Real-Time Supercomputer Monitoring

    Authors: Bill Bergeron, Matthew Hubbell, Dylan Sequeira, Winter Williams, William Arcand, David Bestor, Chansup Byun, Vijay Gadepally, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

    Abstract: Supercomputers are complex systems producing vast quantities of performance data from multiple sources and of varying types. Performance data from each of the thousands of nodes in a supercomputer tracks multiple forms of storage, memory, networks, processors, and accelerators. Optimization of application performance is critical for cost effective usage of a supercomputer and requires efficient me…

    Submitted 9 September, 2021; originally announced September 2021.

  31. arXiv:2108.11525

    cs.DB cs.DC cs.GR cs.HC cs.MM

    Supercomputing Enabled Deployable Analytics for Disaster Response

    Authors: Kaira Samuel, Jeremy Kepner, Michael Jones, Lauren Milechin, Vijay Gadepally, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Anna Klein, Victor Lopez, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas

    Abstract: First responders and other forward deployed essential workers can benefit from advanced analytics. Limited network access and software security requirements prevent the usage of standard cloud based microservice analytic platforms that are typically used in industry. One solution is to precompute a wide range of analytics as files that can be used with standard preinstalled software that does not…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 5 pages, 11 figures, 17 references, accepted to IEEE HPEC 2021

  32. arXiv:2108.11503

    cs.AI cs.CV cs.DC cs.PF

    Maneuver Identification Challenge

    Authors: Kaira Samuel, Vijay Gadepally, David Jacobs, Michael Jones, Kyle McAlpin, Kyle Palko, Ben Paulk, Sid Samsi, Ho Chit Siu, Charles Yee, Jeremy Kepner

    Abstract: AI algorithms that identify maneuvers from trajectory data could play an important role in improving flight safety and pilot training. AI challenges allow diverse teams to work together to solve hard problems and are an effective tool for developing AI solutions. AI challenges are also a key driver of AI computational requirements. The Maneuver Identification Challenge hosted at maneuver-id.mit.ed…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 7 pages, 8 figures, 1 table, 33 references, accepted to IEEE HPEC 2021

  33. Node-Based Job Scheduling for Large Scale Simulations of Short Running Jobs

    Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

    Abstract: Diverse workloads such as interactive supercomputing, big data analysis, and large-scale AI algorithm development require a high-performance scheduler. This paper presents a novel node-based scheduling approach for large scale simulations of short running jobs on MIT SuperCloud systems, that allows the resources to be fully utilized for both long running batch jobs while simultaneously providing…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: IEEE HPEC 2021

  34. arXiv:2108.06653

    cs.NI cs.DC cs.PF cs.SI

    Spatial Temporal Analysis of 40,000,000,000,000 Internet Darkspace Packets

    Authors: Jeremy Kepner, Michael Jones, Daniel Andersen, Aydin Buluc, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Chad Meiners, Lauren Milechin, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Adam Tse, et al. (2 additional authors not shown)

    Abstract: The Internet has never been more important to our society, and understanding the behavior of the Internet is essential. The Center for Applied Internet Data Analysis (CAIDA) Telescope observes a continuous stream of packets from an unsolicited darkspace representing 1/256 of the Internet. During 2019 and 2020 over 40,000,000,000,000 unique packets were collected representing the largest ever assem…

    Submitted 14 August, 2021; originally announced August 2021.

    Comments: 8 pages, 9 figures, 2 tables, 43 references, accepted to IEEE HPEC 2021. arXiv admin note: substantial text overlap with arXiv:2008.00307

  35. arXiv:2108.06650

    cs.DC cs.DM cs.MS cs.NI cs.PF

    Vertical, Temporal, and Horizontal Scaling of Hierarchical Hypersparse GraphBLAS Matrices

    Authors: Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

    Abstract: Hypersparse matrices are a powerful enabler for a variety of network, health, finance, and social applications. Hierarchical hypersparse GraphBLAS matrices enable rapid streaming updates while preserving algebraic analytic power and convenience. In many contexts, the rate of these updates sets the bounds on performance. This paper explores hierarchical hypersparse update performance on a variety o…

    Submitted 14 August, 2021; originally announced August 2021.

    Comments: 6 pages, 5 figures, 32 references, accepted to IEEE HPEC 2021. arXiv admin note: text overlap with arXiv:2001.06935

  36. arXiv:2108.02037  [pdf]

    cs.DC cs.AI cs.LG

    The MIT Supercloud Dataset

    Authors: Siddharth Samsi, Matthew L Weiss, David Bestor, Baolin Li, Michael Jones, Albert Reuther, Daniel Edelman, William Arcand, Chansup Byun, John Holodnack, Matthew Hubbell, Jeremy Kepner, Anna Klein, Joseph McDonald, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Charles Yee, Benjamin Price, Andrew Prout, Antonio Rosa, Allan Vanterpool, Lindsey McEvoy, Anson Cheng, et al. (2 additional authors not shown)

    Abstract: Artificial intelligence (AI) and machine learning (ML) workloads are an increasingly large share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to changes in deployment approaches of HPC clusters and the commercial cloud, as well as a new focus on approaches to optimized resource usage, allocations and deployment of new…

    Submitted 4 August, 2021; originally announced August 2021.

  37. Survey of Machine Learning Accelerators

    Authors: Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, Jeremy Kepner

    Abstract: New machine learning accelerators are being announced and released each month for a variety of applications from speech recognition, video object detection, assisted driving, and many data center applications. This paper updates the survey of AI accelerators and processors from last year's IEEE-HPEC paper. This paper collects and summarizes the current accelerators that have been publicly annou…

    Submitted 31 August, 2020; originally announced September 2020.

    Comments: 12 pages, 2 figures, IEEE-HPEC conference, Waltham, MA, September 21-25, 2020. arXiv admin note: text overlap with arXiv:1908.11348

  38. Accuracy and Performance Comparison of Video Action Recognition Approaches

    Authors: Matthew Hutchinson, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Micheal Houle, Matthew Hubbell, Micheal Jones, Jeremy Kepner, Andrew Kirby, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Albert Reuther, Charles Yee, Vijay Gadepally

    Abstract: Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-t…

    Submitted 20 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at IEEE HPEC 2020

  39. Compute, Time and Energy Characterization of Encoder-Decoder Networks with Automatic Mixed Precision Training

    Authors: Siddharth Samsi, Michael Jones, Mark M. Veillette

    Abstract: Deep neural networks have shown great success in many diverse fields. The training of these networks can take significant amounts of time, compute and energy. As datasets get larger and models become more complex, the exploration of model architectures becomes prohibitive. In this paper we examine the compute, energy and time costs of training a UNet based deep neural network for the problem of pr…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at IEEE HPEC 2020

  40. Benchmarking network fabrics for data distributed training of deep neural networks

    Authors: Siddharth Samsi, Andrew Prout, Michael Jones, Andrew Kirby, Bill Arcand, Bill Bergeron, David Bestor, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Antonio Rosa, Charles Yee, Albert Reuther, Jeremy Kepner

    Abstract: Artificial Intelligence/Machine Learning applications require the training of complex models on large amounts of labelled data. The large computational requirements for training deep models have necessitated the development of new methods for faster training. One such approach is the data parallel approach, where the training data is distributed across multiple compute nodes. This approach is simp…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at IEEE HPEC 2020

  41. Best of Both Worlds: High Performance Interactive and Batch Launching

    Authors: Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Andrew Kirby, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

    Abstract: Rapid launch of thousands of jobs is essential for effective interactive supercomputing, big data analysis, and AI algorithm development. Achieving thousands of launches per second has required hardware to be available to receive these jobs. This paper presents a novel preemptive approach to implement spot jobs on MIT SuperCloud systems allowing the resources to be fully utilized for both long run…

    Submitted 5 August, 2020; originally announced August 2020.

  42. Multi-Temporal Analysis and Scaling Relations of 100,000,000,000 Network Packets

    Authors: Jeremy Kepner, Chad Meiners, Chansup Byun, Sarah McGuire, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Raul Harnasch, Matthew Hubbell, Micheal Houle, Micheal Jones, Andrew Kirby, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Adam Tse, Charles Yee, et al. (1 additional author not shown)

    Abstract: Our society has never been more dependent on computer networks. Effective utilization of networks requires a detailed understanding of the normal background behaviors of network traffic. Large-scale measurements of networks are computationally challenging. Building on prior work in interactive supercomputing and GraphBLAS hypersparse hierarchical traffic matrices, we have developed an efficient me…

    Submitted 1 August, 2020; originally announced August 2020.

    Comments: 6 pages, 6 figures, 3 tables, 49 references, accepted to IEEE HPEC 2020

  43. arXiv:2007.07336  [pdf, other]

    cs.LG cs.DC cs.PF stat.ML

    Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid

    Authors: Andrew C. Kirby, Siddharth Samsi, Michael Jones, Albert Reuther, Jeremy Kepner, Vijay Gadepally

    Abstract: A Multigrid Full Approximation Storage algorithm for solving Deep Residual Networks is developed to enable neural network parallelized layer-wise training and concurrent computational kernel execution on GPUs. This work demonstrates a 10.2x speedup over traditional layer-wise model parallelism techniques using the same number of compute units.

    Submitted 30 August, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: 7 pages, 6 figures, 27 citations. Accepted to 2020 IEEE High Performance Extreme Computing Conference - Outstanding Paper Award

  44. Fast Mapping onto Census Blocks

    Authors: Jeremy Kepner, Andreas Kipf, Darren Engwirda, Navin Vembar, Michael Jones, Lauren Milechin, Vijay Gadepally, Chris Hill, Tim Kraska, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Andrew Kirby, Anna Klein, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas

    Abstract: Pandemic measures such as social distancing and contact tracing can be enhanced by rapidly integrating dynamic location data and demographic data. Projecting billions of longitude and latitude locations onto hundreds of thousands of highly irregular demographic census block polygons is computationally challenging in both research and deployment contexts. This paper describes two approaches labeled…

    Submitted 1 August, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

    Comments: 8 pages, 7 figures, 55 references; accepted to IEEE HPEC 2020

  45. arXiv:2004.01181  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    GraphChallenge.org Sparse Deep Neural Network Performance

    Authors: Jeremy Kepner, Simon Alford, Vijay Gadepally, Michael Jones, Lauren Milechin, Albert Reuther, Ryan Robinett, Sid Samsi

    Abstract: The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective of emerging sp…

    Submitted 5 April, 2020; v1 submitted 24 March, 2020; originally announced April 2020.

    Comments: 7 pages, 7 figures, 80 references, to be submitted to IEEE HPEC 2020. This work reports new updated results on prior work reported in arXiv:1909.05631. arXiv admin note: substantial text overlap with arXiv:1807.03165, arXiv:1708.02937. arXiv admin note: text overlap with arXiv:2003.09269

  46. GraphChallenge.org Triangle Counting Performance

    Authors: Siddharth Samsi, Jeremy Kepner, Vijay Gadepally, Michael Hurley, Michael Jones, Edward Kao, Sanjeev Mohindra, Albert Reuther, Steven Smith, William Song, Diane Staheli, Paul Monticciolo

    Abstract: The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of pre-pa…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: 10 pages, 8 figures, 121 references, to be submitted to IEEE HPEC 2020. This work reports new updated results on prior work reported in arXiv:1805.09675 & arXiv:1708.06866

  47. arXiv:2001.06935  [pdf, other]

    cs.DC cs.DB cs.DS cs.PF cs.SI

    75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS Matrices

    Authors: Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

    Abstract: The SuiteSparse GraphBLAS C-library implements high performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of h…

    Submitted 16 March, 2020; v1 submitted 19 January, 2020; originally announced January 2020.

    Comments: 4 pages, 4 figures, 28 references, accepted to IPDPS GrAPL 2020. arXiv admin note: substantial text overlap with arXiv:1907.04217

  48. arXiv:2001.06731  [pdf, other]

    cs.DB

    AI Data Wrangling with Associative Arrays

    Authors: Jeremy Kepner, Vijay Gadepally, Hayden Jananthan, Lauren Milechin, Siddharth Samsi

    Abstract: The AI revolution is data driven. AI "data wrangling" is the process by which unusable data is transformed to support AI algorithm development (training) and deployment (inference). Significant time is devoted to translating diverse data representations supporting the many query and analysis steps found in an AI pipeline. Rigorous mathematical representations of these data enable data translation…

    Submitted 18 January, 2020; originally announced January 2020.

    Comments: 3 pages, 2 figures, 23 references, accepted for Northeast Database day (NEDB) 2020. arXiv admin note: text overlap with arXiv:1907.04217

  49. arXiv:1909.05631  [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    Sparse Deep Neural Network Graph Challenge

    Authors: Jeremy Kepner, Simon Alford, Vijay Gadepally, Michael Jones, Lauren Milechin, Ryan Robinett, Sid Samsi

    Abstract: The MIT/IEEE/Amazon GraphChallenge.org encourages community approaches to developing new solutions for analyzing graphs and sparse data. Sparse AI analytics present unique scalability difficulties. The proposed Sparse Deep Neural Network (DNN) Challenge draws upon prior challenges from machine learning, high performance computing, and visual analytics to create a challenge that is reflective of em…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: 7 pages, 5 figures, 3 tables, 60 references, accepted to IEEE HPEC 2019. arXiv admin note: substantial text overlap with arXiv:1807.03165, arXiv:1708.02937, arXiv:1708.06866

  50. Large Scale Parallelization Using File-Based Communications

    Authors: Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

    Abstract: In this paper, we present a novel file-based communication architecture using the local filesystem for large scale parallelization. This new approach eliminates the issues with filesystem overload and resource contention when using the central filesystem for large parallel jobs. The new approach incurs additional overhead due to inter-node message file transfers when both the sending and r…

    Submitted 3 September, 2019; originally announced September 2019.