Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations
Authors: Leo Donisch, Sigurd Schacht, Carsten Lanquillon
Abstract:
Large language models are ubiquitous in natural language processing because they can adapt to new tasks without retraining. However, their sheer scale and complexity present unique challenges and opportunities, prompting researchers and practitioners to explore novel model training, optimization, and deployment methods. This literature review focuses on various techniques for reducing resource requirements and compressing large language models, including quantization, pruning, knowledge distillation, and architectural optimizations. The primary objective is to explore each method in depth and highlight its unique challenges and practical applications. The discussed methods are categorized into a taxonomy that presents an overview of the optimization landscape and helps navigate it to better understand the research trajectory.
Submitted 6 August, 2024;
originally announced August 2024.
Towards energy-efficient Deep Learning: An overview of energy-efficient approaches along the Deep Learning Lifecycle
Authors: Vanessa Mehlin, Sigurd Schacht, Carsten Lanquillon
Abstract:
Deep Learning has enabled many advances in machine learning applications in the last few years. However, since current Deep Learning algorithms require much energy for computations, there are growing concerns about the associated environmental costs. Energy-efficient Deep Learning has received much attention from researchers and has already made much progress in the last couple of years. This paper aims to gather information about these advances from the literature and show how and at which points along the lifecycle of Deep Learning (IT-Infrastructure, Data, Modeling, Training, Deployment, Evaluation) it is possible to reduce energy consumption.
Submitted 5 February, 2023;
originally announced March 2023.