Enabling Energy Efficient Training for AI Algorithms by Controlling Resource Allocation
Published
Author
Type
Master's Thesis
Abstract
Training deep learning models typically involves large-scale computations that require significant energy resources, making the process both costly and environmentally unsustainable. One reason for this is the default strategy of using high clock frequencies throughout deep neural network training. However, the layers of a deep network have varying computational and memory access patterns, leading to potential mismatches and bottlenecks. The purpose of this thesis was to address this challenge by exploring resource allocation strategies that can reduce energy consumption at a fine-grained level when training CNNs on GPUs. The research focuses on predicting the computational and memory demands of different network layers and on creating execution strategies that reduce energy consumption by reducing idle times of compute and memory units. These resource allocation strategies are based on both arithmetic intensity analysis and exhaustive searches, allocating the appropriate resources by adjusting the compute and memory clock frequency combination for each layer. This thesis demonstrates that resource allocation strategies can potentially reduce energy consumption during deep learning training. This was analysed for two deep learning models, ResNet50 and VGG16, on two different GPUs, the NVIDIA RTX A4000 and the NVIDIA RTX 2000 Mobile. For full training runs using our execution strategies, no significant energy-efficiency improvements were achieved without increasing the execution time; with a slight increase in execution time, one strategy achieved moderate energy savings. Focusing on the forward propagation phase, the results improved: the same strategy yielded execution times comparable to the default, in some cases even better, with moderate energy savings. If users are willing to sacrifice some performance, another execution strategy achieves a significant reduction in energy consumption with only a slight increase in execution time.
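To make the idea concrete, the following Python sketch shows the kind of per-layer frequency selection the abstract describes: estimate a layer's arithmetic intensity (FLOPs per byte moved), compare it against the GPU's ridge point, and lock the compute (SM) and memory clocks accordingly before executing that layer. This is a minimal illustration, not the thesis implementation; the layer shapes, ridge-point threshold, and clock values are assumed placeholders, and locking clocks through NVML requires administrative privileges and a supported GPU.

```python
# Minimal sketch of arithmetic-intensity-based clock selection (illustrative values only).
import pynvml

def conv2d_arithmetic_intensity(n, c_in, c_out, h_out, w_out, k, bytes_per_elem=4):
    """FLOPs per byte moved for a stride-1 2D convolution (fp32 by default)."""
    flops = 2 * n * c_out * h_out * w_out * c_in * k * k
    bytes_moved = bytes_per_elem * (
        n * c_in * h_out * w_out      # input activations (same-size approximation)
        + c_out * c_in * k * k        # weights
        + n * c_out * h_out * w_out   # output activations
    )
    return flops / bytes_moved

def pick_clocks(intensity, ridge_point=70.0):
    """Map a layer's arithmetic intensity to a (graphics, memory) clock pair in MHz.

    Compute-bound layers (intensity above the GPU's ridge point) keep a high
    SM clock; memory-bound layers can tolerate a lower one. These clock values
    are placeholders, not measured optima for any specific GPU.
    """
    if intensity >= ridge_point:
        return 1560, 7001   # compute-bound: favour the SM clock
    return 1050, 7001       # memory-bound: lower SM clock, keep memory clock

def apply_clocks(handle, gpu_mhz, mem_mhz):
    # Locking clocks needs root/administrator privileges and a GPU that supports it.
    pynvml.nvmlDeviceSetGpuLockedClocks(handle, gpu_mhz, gpu_mhz)
    pynvml.nvmlDeviceSetMemoryLockedClocks(handle, mem_mhz, mem_mhz)

if __name__ == "__main__":
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    # Example: one ResNet-style conv layer (batch 32, 3x3 kernel, 256 channels, 14x14 output).
    ai = conv2d_arithmetic_intensity(n=32, c_in=256, c_out=256, h_out=14, w_out=14, k=3)
    gpu_mhz, mem_mhz = pick_clocks(ai)
    print(f"arithmetic intensity ~ {ai:.1f} FLOPs/byte -> lock SM {gpu_mhz} MHz, MEM {mem_mhz} MHz")

    apply_clocks(handle, gpu_mhz, mem_mhz)
    # ... run this layer's forward/backward pass here ...
    pynvml.nvmlDeviceResetGpuLockedClocks(handle)
    pynvml.nvmlDeviceResetMemoryLockedClocks(handle)
    pynvml.nvmlShutdown()
```

The same effect can be obtained from the command line with `nvidia-smi -lgc` and `nvidia-smi -lmc`; the NVML route is shown here only because it allows switching frequencies programmatically between layers.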
Description
Subject / keywords
resource allocation, machine learning, deep learning, energy efficiency, frequency configuration, DL training optimization, power consumption