A Version Oriented Parallel Asynchronous Evolution Strategy for Deep Learning
Type
Master's thesis
Program
Computer systems and networks (MPCSN), MSc
Published
2021
Author
JANG, MYEONG-JIN
Abstract
In this work we propose a new parallel asynchronous Evolution Strategy (ES) that
outperforms the existing ESs, including the canonical ES and steady-state ES. ES
has been considered a competitive alternative solution for optimizing neural networks
in deep reinforcement learning, instead of using an optimizer and a backpropagation
function. In this thesis, three ES systems were implemented to compare their
performance. Two were based on existing approaches, the canonical ES and the
steady-state ES. The third is the proposed system, called Version Oriented
Parallel Asynchronous Evolution Strategy (VOPAES). The canonical ES
replaces all population individuals at each generation, whereas the steady-state ES
replaces only the weakest individual with a newly created one. By replacing all
population individuals, the canonical ES can optimize the network faster than the
steady-state ES, but it requires synchronization, which can increase CPU idle time.
In contrast, a parallel steady-state ES requires no synchronization, but its learning
speed can be slower than that of the parallel canonical ES.
Therefore, we propose VOPAES as an ES solution that combines the benefits of both
the parallel canonical ES and the parallel steady-state ES. The test results of this
work demonstrate that the canonical ES can be implemented asynchronously using
versions. By merging these benefits, VOPAES reduces CPU idle time while maintaining
optimization accuracy and speed comparable to the parallel canonical ES. In
conclusion, VOPAES achieved the fastest training speed among the implemented ES
systems.
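The replacement schemes contrasted above can be sketched as follows. This is a minimal illustrative comparison, not the thesis implementation: the one-dimensional objective, population size, and mutation scale are hypothetical, and real deep RL use would evaluate a neural network's parameters instead of a scalar.

```python
import random

def fitness(x):
    # Hypothetical objective: maximize -(x - 3)^2, optimum at x = 3.
    return -(x - 3.0) ** 2

def mutate(parent, sigma=0.5):
    # Gaussian perturbation, the standard ES mutation operator.
    return parent + random.gauss(0.0, sigma)

def canonical_step(population):
    # Canonical ES: every individual is replaced each generation, so all
    # offspring must be evaluated before proceeding (a synchronization point).
    parent = max(population, key=fitness)
    return [mutate(parent) for _ in population]

def steady_state_step(population):
    # Steady-state ES: only the weakest individual is replaced, so an
    # asynchronous worker can insert one offspring at a time.
    weakest = min(population, key=fitness)
    parent = max(population, key=fitness)
    child = mutate(parent)
    if fitness(child) > fitness(weakest):
        population[population.index(weakest)] = child
    return population

random.seed(0)
pop_canonical = [random.uniform(-5.0, 5.0) for _ in range(8)]
pop_steady = list(pop_canonical)

# 50 generations x 8 offspring vs. 400 single-offspring updates,
# so both schemes perform the same number of evaluations of new children.
for _ in range(50):
    pop_canonical = canonical_step(pop_canonical)
for _ in range(400):
    pop_steady = steady_state_step(pop_steady)

best_canonical = max(pop_canonical, key=fitness)
best_steady = max(pop_steady, key=fitness)
print(round(best_canonical, 1), round(best_steady, 1))
```

Both variants drive the population toward the optimum; the practical difference the thesis targets is not the final quality on a toy objective but the CPU idle time the generational synchronization barrier causes when the steps run in parallel.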
Subject/keywords
Reinforcement Learning, Parallelism, Evolution Strategy, Back-propagation, Asynchronous, Optimization