Towards artificially playing the game of double pong: Combining learning and search algorithms
Type
Master's thesis (examensarbete för masterexamen)
Abstract
This thesis investigates ways of enhancing search methods used in reinforcement
learning by utilizing neural networks. The environment in which the methods
are tested is the classical game of Pong. Two primary networks were used: a
Deep Value Network (DVN) and a Fail-State Network (F-Network). The first aids
the search by estimating state values; the second detects search paths that
lead to certain losses. Two search algorithms were implemented: random rollouts
and Monte Carlo tree search (MCTS). It was concluded that combining search with
the DVN drastically outperforms plain search methods, especially in environments
where deep searches are infeasible and CPU resources are restricted. The
F-Network did not show promising results in our study; however, possible
improvements are discussed.
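The core idea summarized above, evaluating MCTS leaves with a learned state-value estimate instead of running deep rollouts, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy environment (`step`), the hand-crafted `value_net` standing in for the trained DVN, and the function names are not from the thesis itself.

```python
import math
import random

# Toy stand-in for Pong: the "state" is the signed distance between the
# paddle and the ball; actions move the paddle by -1, 0, or +1.
ACTIONS = (-1, 0, +1)

def step(state, action):
    """Apply an action; reward 1 when the paddle reaches the ball (state 0)."""
    new_state = state + action
    reward = 1.0 if new_state == 0 else 0.0
    return new_state, reward

def value_net(state):
    """Hand-crafted stand-in for the DVN: states closer to the ball
    get higher value. In the thesis this would be a trained network."""
    return 1.0 / (1.0 + abs(state))

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}   # action -> Node
        self.visits = 0
        self.value_sum = 0.0

def ucb(parent, child, c=1.4):
    """Upper confidence bound used for child selection."""
    if child.visits == 0:
        return float("inf")
    return (child.value_sum / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def mcts_decide(root_state, n_simulations=200):
    root = Node(root_state)
    for _ in range(n_simulations):
        node, path = root, [root]
        # Selection: descend fully expanded nodes by UCB.
        while len(node.children) == len(ACTIONS):
            _, node = max(node.children.items(),
                          key=lambda kv: ucb(path[-1], kv[1]))
            path.append(node)
        # Expansion: try one untried action.
        action = random.choice([a for a in ACTIONS if a not in node.children])
        child_state, reward = step(node.state, action)
        child = Node(child_state)
        node.children[action] = child
        path.append(child)
        # Evaluation: value network replaces a deep rollout at the leaf.
        leaf_value = reward + value_net(child_state)
        # Backpropagation.
        for n in path:
            n.visits += 1
            n.value_sum += leaf_value
    # Act greedily with respect to visit counts at the root.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

The shallow, network-guided evaluation is what makes the approach attractive when deep searches are infeasible: each simulation costs one environment step plus one value lookup, rather than a full rollout to a terminal state.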
Subject/keywords
Neural networks, Q-learning, Monte Carlo tree search, Pong, Search algorithms, Reinforcement learning, Deep Q-learning