DeepSpeed

DeepSpeed
Original author(s)	Microsoft Research
Developer(s)	Microsoft
Initial release	May 18, 2020
Stable release	v0.14.4 / June 21, 2024
Repository	github.com/microsoft/DeepSpeed
Written in	Python, CUDA, C++
Type	Software library
License	Apache License 2.0
Website	deepspeed.ai

DeepSpeed is an open-source deep learning optimization library for PyTorch.^[1]

Library

The library is designed to reduce the computing power and memory needed to train large distributed models, with better parallelism on existing computer hardware.^[2]^[3] DeepSpeed is optimized for low-latency, high-throughput training. It includes the Zero Redundancy Optimizer (ZeRO) for training models with 1 trillion or more parameters.^[4] Features include mixed-precision training; single-GPU, multi-GPU, and multi-node training; and custom model parallelism. The DeepSpeed source code is available on GitHub under the Apache License 2.0 (early releases used the MIT License).^[5]
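To make the trillion-parameter claim concrete, the following is a rough sketch of how ZeRO's partitioning changes per-GPU memory for model states. It assumes the accounting used in the ZeRO paper for mixed-precision Adam training (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states) and ignores activations and temporary buffers. The function name `model_state_bytes_per_gpu` is hypothetical and not part of DeepSpeed. The `ds_config` dictionary uses configuration keys that DeepSpeed documents (`fp16`, `zero_optimization`, `train_micro_batch_size_per_gpu`), but the values are illustrative only.

```python
import json

# Bytes per parameter under mixed-precision Adam (assumed accounting).
BYTES_FP16_PARAMS = 2   # fp16 copy of the weights
BYTES_FP16_GRADS = 2    # fp16 gradients
BYTES_OPTIMIZER = 12    # fp32 weights + Adam momentum + Adam variance


def model_state_bytes_per_gpu(num_params, num_gpus, zero_stage):
    """Approximate per-GPU bytes needed for model states.

    Stage 0 replicates everything on every GPU; stage 1 partitions the
    optimizer states; stage 2 also partitions gradients; stage 3 also
    partitions the parameters. Activations and buffers are ignored.
    """
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("zero_stage must be 0, 1, 2, or 3")
    params = BYTES_FP16_PARAMS * num_params
    grads = BYTES_FP16_GRADS * num_params
    optim = BYTES_OPTIMIZER * num_params
    if zero_stage >= 1:
        optim /= num_gpus
    if zero_stage >= 2:
        grads /= num_gpus
    if zero_stage >= 3:
        params /= num_gpus
    return params + grads + optim


# Illustrative DeepSpeed-style configuration enabling mixed precision and
# ZeRO stage 3. Values are placeholders, not tuned recommendations.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

if __name__ == "__main__":
    one_trillion = 10**12
    for stage in range(4):
        gb = model_state_bytes_per_gpu(one_trillion, 1024, stage) / 1e9
        print(f"ZeRO stage {stage}: ~{gb:,.1f} GB of model states per GPU")
    print(json.dumps(ds_config, indent=2))
```

Under these assumptions, a 1-trillion-parameter model needs about 16 TB for model states when fully replicated, but roughly 15.6 GB per GPU when all three kinds of state are partitioned across 1,024 GPUs. In actual use, a configuration like `ds_config` would be passed to `deepspeed.initialize` together with a PyTorch model; that call is omitted here because it requires DeepSpeed and suitable hardware.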

The team claimed to achieve up to a 6.2x throughput improvement, 2.8x faster convergence, and 4.6x less communication.^[6]

References

[1] "Microsoft Updates Windows, Azure Tools with an Eye on The Future". PCMag UK. May 22, 2020.
[2] Yegulalp, Serdar (February 10, 2020). "Microsoft speeds up PyTorch with DeepSpeed". InfoWorld.
[3] "Microsoft unveils 'fifth most powerful' supercomputer in the world". Neowin. June 18, 2023.
[4] "Microsoft trains world's largest Transformer language model". February 10, 2020.
[5] "microsoft/DeepSpeed". July 10, 2020 – via GitHub.
[6] "DeepSpeed: Accelerating large-scale model inference and training via system optimizations and compression". Microsoft Research. May 24, 2021. Retrieved June 19, 2021.

