Learning to Play Games from Multiple Imperfect Teachers

Date

Type

Examensarbete för masterexamen
Master Thesis

Model builders

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

This project evaluates the modularity of a recent Bayesian Inverse Reinforcement Learning approach [1] by inferring the sub-goals correlated with winning board games from observations of a set of agents. A feature based architecture is proposed together with a method for generating the reward function space, making inference tractable in large state spaces and allowing for the combination with models that approximate stateaction values. Further, a policy prior is suggested that allows for least squares policy evaluation using sample trajectories. The model is evaluated on randomly generated environments and on Tic-tac-toe, showing that a combination of the intentions inferred from all agents can generate strategies that outperform the corresponding strategies from each individual agent.

Description

Keywords

Data- och informationsvetenskap, Computer and Information Science

Citation

Architect

Location

Type of building

Build Year

Model type

Scale

Material / technology

Index

Collections

Endorsement

Review

Supplemented By

Referenced By