Towards artificially playing the game of double pong: Combining learning and search algorithms

Type

Master's thesis

Abstract

This thesis investigates ways of enhancing search methods used in reinforcement learning by means of neural networks. The methods are tested in the classic game of Pong. Two primary networks were used: a Deep value network (DVN) and a Fail-state network (F-Network). The first aids the search by estimating state values; the second is used to detect search paths that lead to certain loss. Two search algorithms were implemented: random rollouts and Monte Carlo tree search (MCTS). It was concluded that combining search with the DVN drastically outperforms plain search, especially in environments where deep searches are infeasible and CPU resources are restricted. The F-Network did not show promising results in our study; however, possible improvements are discussed.
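The idea of letting a state-value estimate stand in for the part of a search that is cut short can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the toy falling-ball environment, the names `step`, `heuristic_value` and `rollout_action`, and the hand-written heuristic that plays the role a trained DVN would play.

```python
import random
from typing import Callable, Optional, Tuple

# Toy stand-in for Pong (an assumption, not the thesis environment):
# state = (paddle_x, ball_x, ball_y); the ball falls one row per step.
State = Tuple[int, int, int]
ACTIONS = (-1, 0, 1)
WIDTH = 7


def step(state: State, action: int) -> Tuple[State, float, bool]:
    """Move the paddle, drop the ball one row, and score when it lands."""
    paddle, ball_x, ball_y = state
    paddle = min(WIDTH - 1, max(0, paddle + action))
    ball_y -= 1
    if ball_y == 0:
        reward = 1.0 if paddle == ball_x else -1.0
        return (paddle, ball_x, ball_y), reward, True
    return (paddle, ball_x, ball_y), 0.0, False


def heuristic_value(state: State) -> float:
    """Hypothetical placeholder for a learned state-value network."""
    paddle, ball_x, _ = state
    return 1.0 - 2.0 * abs(paddle - ball_x) / (WIDTH - 1)


def rollout_action(
    state: State,
    depth: int,
    n_rollouts: int,
    value_fn: Optional[Callable[[State], float]] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the action with the best mean return over depth-limited random rollouts.

    If a rollout is truncated before the episode ends and value_fn is given,
    the value estimate of the last state is added to the return.
    """
    if rng is None:
        rng = random.Random(0)
    best_action, best_score = ACTIONS[0], float("-inf")
    for action in ACTIONS:
        total = 0.0
        for _ in range(n_rollouts):
            s, ret, done = step(state, action)
            for _ in range(depth - 1):
                if done:
                    break
                s, r, done = step(s, rng.choice(ACTIONS))
                ret += r
            if not done and value_fn is not None:
                ret += value_fn(s)  # bootstrap from the value estimate
            total += ret
        score = total / n_rollouts
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

In this toy setting, a rollout that stops before the ball lands sees no reward at all, so plain rollouts cannot tell the actions apart and fall back to the first one. With `value_fn=heuristic_value` the truncated rollouts still receive a signal about which paddle positions look promising, which is the kind of benefit the abstract attributes to shallow searches.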

Subject/keywords

Neural networks, Q-learning, Monte Carlo tree search, Pong, Search algorithms, Reinforcement learning, Deep Q-learning
