Reinforcement learning A comparison of learning agents in environments with large discrete state spaces

Andersson, Johan; Kristiansson, Emil; Persson, Joakim; Toom, Daniel; Sandberg Eriksson, Adam; Widstam, Joppe

dc.contributor.author	Andersson, Johan
dc.contributor.author	Kristiansson, Emil
dc.contributor.author	Persson, Joakim
dc.contributor.author	Toom, Daniel
dc.contributor.author	Sandberg Eriksson, Adam
dc.contributor.author	Widstam, Joppe
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)	en
dc.date.accessioned	2019-07-03T13:30:55Z
dc.date.available	2019-07-03T13:30:55Z
dc.date.issued	2014
dc.description.abstract	For some real-world optimization problems where the best behavior is sought, it is infeasible to search for a solution by building a model of the problem and performing calculations on it. In such cases, good solutions can sometimes be found by trial and error. Reinforcement learning is a way of finding optimal behavior by systematic trial and error. This thesis compares and evaluates different reinforcement learning techniques. Two algorithms are evaluated: model-based interval estimation (MBIE) and Explicit Explore or Exploit using dynamic Bayesian networks (DBN-E3). To evaluate the techniques, learning agents were constructed using the algorithms and then simulated in the Invasive Species environment from the Reinforcement Learning Competition. The results show that an optimized version of DBN-E3 is better than MBIE at finding an optimal or near-optimal behavior policy in Invasive Species for a selection of environment parameters. Using a factored model such as a DBN shows certain advantages when operating in Invasive Species, which is a factored environment. For example, it achieves a near-optimal policy within fewer episodes than MBIE.
dc.identifier.uri	https://hdl.handle.net/20.500.12380/203119
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	Data- och informationsvetenskap
dc.subject	Computer and Information Science
dc.title	Reinforcement learning A comparison of learning agents in environments with large discrete state spaces
dc.type.degree	Examensarbete för kandidatexamen	sv
dc.type.degree	Bachelor Thesis	en
dc.type.uppsok	M2
local.programme	Computer Science and Engineering, 300 credits (Master of Science in Engineering)

Original bundle

Name: 203119.pdf
Size: 672.16 KB
Format: Adobe Portable Document Format
Description: Full text

Collections

Bachelor theses