Towards artificially playing the game of double pong: Combining learning and search algorithms
Date
Authors
Type
Master's thesis
Programme
Model builders
Abstract
This thesis investigates ways of enhancing the search methods used in reinforcement
learning by means of neural networks. The environment in which the methods
are tested is the classic game of Pong. Two networks were used: a
deep value network (DVN) and a fail-state network (F-Network). The first
aids the search by estimating state values; the second is used to detect
search paths that lead to certain loss. Two search algorithms
were implemented: random rollouts and Monte Carlo tree search (MCTS).
It was concluded that combining search with the DVN drastically
outperforms plain search methods, especially in environments where deep searches
are infeasible and CPU resources are restricted. The F-Network did not show any
promising results in our study; however, possible improvements are discussed.
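As a rough illustration of the idea summarized above, the following Python sketch shows how truncated random rollouts might be combined with a state-value estimate and a fail-state check. It is not the thesis implementation: the toy one-dimensional state, the functions step, value_estimate and is_fail_state, and all parameter values are assumptions made for this example, with simple hand-written functions standing in for the trained DVN and F-Network.

```python
import random
from typing import Sequence

State = int  # toy state (assumption): paddle offset from the ball, 0 = aligned


def step(state: State, action: int) -> State:
    """Hypothetical toy dynamics: the action (-1, 0 or +1) shifts the offset."""
    return state + action


def value_estimate(state: State) -> float:
    """Stand-in for a learned value network: a smaller offset scores higher."""
    return -float(abs(state))


def is_fail_state(state: State) -> bool:
    """Stand-in for a fail-state detector: the ball is out of reach."""
    return abs(state) > 3


def rollout_search(state: State, actions: Sequence[int], depth: int,
                   n_rollouts: int, rng: random.Random) -> int:
    """Return the action whose truncated random rollouts score best on average."""
    # Fall back to the first action if every candidate is pruned.
    best_action, best_score = actions[0], float("-inf")
    for action in actions:
        first = step(state, action)
        if is_fail_state(first):
            continue  # skip moves flagged as leading to a certain loss
        total = 0.0
        for _ in range(n_rollouts):
            s = first
            for _ in range(depth):
                s = step(s, rng.choice(actions))
                if is_fail_state(s):
                    break  # stop this rollout once it has failed
            # Score the rollout's end state with the value estimate
            # instead of playing the game out to the end.
            total += value_estimate(s)
        score = total / n_rollouts
        if score > best_score:
            best_action, best_score = action, score
    return best_action


if __name__ == "__main__":
    chosen = rollout_search(2, (-1, 0, 1), depth=2, n_rollouts=200,
                            rng=random.Random(0))
    print("chosen action:", chosen)
```

Scoring the end state of a short rollout with the value estimate replaces a deep search with a shallow one, which is the setting the abstract highlights: environments where deep searches are infeasible and CPU resources are restricted.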
Keywords
Neural networks, Q-learning, Monte Carlo tree search, Pong, Search algorithms, Reinforcement learning, Deep Q-learning
