Learning to Play Games from Multiple Imperfect Teachers
dc.contributor.author | Karlsson, John | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) | sv |
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) | en |
dc.date.accessioned | 2019-07-03T13:30:51Z | |
dc.date.available | 2019-07-03T13:30:51Z | |
dc.date.issued | 2014 | |
dc.description.abstract | This project evaluates the modularity of a recent Bayesian Inverse Reinforcement Learning approach [1] by inferring the sub-goals correlated with winning board games from observations of a set of agents. A feature-based architecture is proposed together with a method for generating the reward function space, making inference tractable in large state spaces and allowing it to be combined with models that approximate state-action values. Further, a policy prior is suggested that allows for least-squares policy evaluation using sample trajectories. The model is evaluated on randomly generated environments and on Tic-tac-toe, showing that a combination of the intentions inferred from all agents can generate strategies that outperform the corresponding strategies from each individual agent. | |
dc.identifier.uri | https://hdl.handle.net/20.500.12380/203067 | |
dc.language.iso | eng | |
dc.setspec.uppsok | Technology | |
dc.subject | Data- och informationsvetenskap | |
dc.subject | Computer and Information Science | |
dc.title | Learning to Play Games from Multiple Imperfect Teachers | |
dc.type.degree | Examensarbete för masterexamen | sv |
dc.type.degree | Master Thesis | en |
dc.type.uppsok | H | |
local.programme | Complex adaptive systems (MPCAS), MSc |
Download
Original bundle
- Name: 203067.pdf
- Size: 595.92 KB
- Format: Adobe Portable Document Format
- Description: Full text