Learning to Play Games from Multiple Imperfect Teachers
Date
Authors
Type
Master's thesis
Model builders
Abstract
This project evaluates the modularity of a recent Bayesian Inverse Reinforcement Learning approach [1] by inferring the sub-goals correlated with winning board games from observations of a set of agents. A feature-based architecture is proposed together with a method for generating the reward function space, making inference tractable in large state spaces and allowing the approach to be combined with models that approximate state-action values. Further, a policy prior is suggested that allows for least-squares policy evaluation using sample trajectories. The model is evaluated on randomly generated environments and on Tic-tac-toe, showing that a combination of the intentions inferred from all agents can generate strategies that outperform the corresponding strategies from each individual agent.
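The abstract does not spell out how the least-squares policy evaluation is set up, so the following is only a minimal sketch of the standard least-squares temporal difference (LSTD) estimator applied to sample trajectories, not the thesis's own implementation. The function names, the one-hot feature map, the toy trajectory, and the discount factor are all assumptions chosen for illustration.

```python
import numpy as np


def lstd_policy_evaluation(trajectories, featurize, n_features, gamma=0.9):
    """Estimate weights w such that V(s) ~ featurize(s) @ w for the sampled policy.

    trajectories: iterable of lists of (state, reward, next_state) tuples,
    where next_state is None when the episode terminates.
    (Hypothetical interface, not taken from the thesis.)
    """
    A = np.zeros((n_features, n_features))
    b = np.zeros(n_features)
    for trajectory in trajectories:
        for state, reward, next_state in trajectory:
            phi = featurize(state)
            # Terminal states have value zero, so their features are zero.
            if next_state is None:
                phi_next = np.zeros(n_features)
            else:
                phi_next = featurize(next_state)
            A += np.outer(phi, phi - gamma * phi_next)
            b += phi * reward
    # A least-squares solve tolerates a singular A (e.g. unvisited features).
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


def one_hot(state, n_states=2):
    """Toy feature map (assumption): one indicator feature per state."""
    phi = np.zeros(n_states)
    phi[state] = 1.0
    return phi


# Toy episode (assumption): state 0 -> state 1 -> terminal, reward 1 at the end.
toy_trajectory = [(0, 0.0, 1), (1, 1.0, None)]
weights = lstd_policy_evaluation([toy_trajectory], one_hot, n_features=2, gamma=0.9)
print(weights)
```

On this toy episode the estimated values are approximately 0.9 for state 0 and 1.0 for state 1, which matches the discounted return computed by hand. With richer features, such as board-game features, the same accumulation of `A` and `b` applies unchanged; only `featurize` differs.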
Keywords
Computer and Information Science