Showing 1–1 of 1 result for author: Durg, A

  1. arXiv:2304.05301  [pdf, other]

    cs.DC cs.LG

    TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning

    Authors: William Won, Midhilesh Elavazhagan, Sudarshan Srinivasan, Ajaya Durg, Samvit Kaul, Swati Gupta, Tushar Krishna

    Abstract: The surge of artificial intelligence, specifically large language models, has led to a rapid advent towards the development of large-scale machine learning training clusters. Collective communications within these clusters tend to be heavily bandwidth-bound, necessitating techniques to optimally utilize the available network bandwidth. This puts the routing algorithm for the collective at the fore…

    Submitted 29 March, 2024; v1 submitted 11 April, 2023; originally announced April 2023.