A Version Oriented Parallel Asynchronous Evolution Strategy for Deep Learning

Master's thesis

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12380/304364
Download file(s):
CSE 21-153 Jang.pdf (2.14 MB, Adobe PDF)
Bibliographical item details
Type: Master's thesis
Title: A Version Oriented Parallel Asynchronous Evolution Strategy for Deep Learning
Abstract: In this work we propose a new parallel asynchronous Evolution Strategy (ES) that outperforms existing ESs, including the canonical ES and the steady-state ES. ES has been considered a competitive alternative for optimizing neural networks in deep reinforcement learning, in place of an optimizer and backpropagation. In this thesis, three ES systems were implemented to compare their performance. Two were based on existing ES systems: the canonical ES and the steady-state ES. The third is the proposed system, called Version Oriented Parallel Asynchronous Evolution Strategy (VOPAES). The canonical ES replaces all individuals in the population at each generation, whereas the steady-state ES replaces only the weakest individual with the newly created one. By replacing all individuals, the canonical ES could optimize the network faster than the steady-state ES. However, it requires synchronization, which might increase CPU idle time. In contrast, a parallel steady-state ES does not require synchronization, but its learning speed could be slower than that of the parallel canonical ES. Therefore, we suggest VOPAES as an advanced ES solution that combines the benefits of both the parallel canonical ES and the parallel steady-state ES. The test results of this work demonstrated that the canonical ES system can be implemented asynchronously using versions. Moreover, by merging these benefits, VOPAES could decrease CPU idle time while maintaining optimization accuracy and speed as high as those of the parallel canonical ES system. In conclusion, VOPAES achieved the fastest training speed among the implemented ES systems.
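The two replacement rules contrasted in the abstract can be sketched roughly as follows. This is a minimal, sequential illustration and not the thesis implementation: the toy objective `fitness`, the Gaussian `mutate` step, and the function names `canonical_step` and `steady_state_step` are all assumptions made for this example, and neither the parallel execution nor the versioning used by VOPAES is modelled.

```python
import random


def fitness(x):
    # Toy objective (assumption for illustration): higher is better, peak at x = 3.
    return -(x - 3.0) ** 2


def mutate(x, rng, sigma=0.5):
    # Gaussian perturbation of a single scalar "individual" (assumed mutation operator).
    return x + rng.gauss(0.0, sigma)


def canonical_step(population, rng, n_offspring):
    """Generational replacement: the whole population is replaced by offspring."""
    mu = len(population)
    offspring = [mutate(rng.choice(population), rng) for _ in range(n_offspring)]
    # Keep the best mu offspring; every parent is discarded.
    offspring.sort(key=fitness, reverse=True)
    return offspring[:mu]


def steady_state_step(population, rng):
    """Steady-state replacement: one new individual replaces the weakest one."""
    child = mutate(rng.choice(population), rng)
    weakest = min(range(len(population)), key=lambda i: fitness(population[i]))
    new_population = list(population)  # leave the caller's list untouched
    new_population[weakest] = child
    return new_population
```

In this sketch a canonical step must wait for all `n_offspring` evaluations before it can select the next generation, which is where the synchronization cost described in the abstract would arise in a parallel setting. A steady-state step depends on a single evaluation only, so workers would not need to wait for one another.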
Keywords: Reinforcement Learning;Parallelism;Evolution Strategy;Back-propagation;Asynchronous;Optimization
Issue Date: 2021
Publisher: Chalmers tekniska högskola / Institutionen för data och informationsteknik
URI: https://hdl.handle.net/20.500.12380/304364
Collection: Examensarbeten för masterexamen // Master Theses
