Pathfinding med reinforcement learning i delvis observerbara miljöer

dc.contributor.author: Engström, Anne
dc.contributor.author: Lidin, Joel
dc.contributor.author: Molander, Gustav
dc.contributor.author: Onoszko, Noa
dc.contributor.author: Månsson, Olle
dc.contributor.author: Ölund, Hugo
dc.contributor.department: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper [sv]
dc.contributor.department: Chalmers University of Technology / Department of Mathematical Sciences [en]
dc.date.accessioned: 2019-07-05T12:03:09Z
dc.date.available: 2019-07-05T12:03:09Z
dc.date.issued: 2019
dc.description.abstract: Reinforcement learning algorithms can solve problems without explicit knowledge of the underlying model. Instead, they infer a strategy directly from observations and rewards acquired by interacting with their environment. This makes them suitable candidates for solving pathfinding problems in a partially observable setting, where the aim is to find a path through an environment with restricted vision. This report investigates how Markov decision processes and reinforcement learning can be used to model and solve partially observable pathfinding problems. Existing literature has been reviewed to give a theoretical background of the subject, before progressing to practical implementations. We have applied state-of-the-art algorithms from two subclasses of reinforcement learning methods: value-based algorithms and policy-based algorithms. We find that partially observable Markov decision processes can be used to model pathfinding problems, but not all reinforcement learning algorithms are suitable for solving them. In theory, value-based algorithms show potential, but when implemented they did not yield positive results. Conversely, the policy-based algorithm Proximal Policy Optimization is able to solve the problem convincingly. This algorithm also performs well in environments it was not trained in, thus displaying some ability to generalize its policy.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/257380
dc.language.iso: swe
dc.setspec.uppsok: PhysicsChemistryMaths
dc.subject: Grundläggande vetenskaper
dc.subject: Matematik
dc.subject: Basic Sciences
dc.subject: Mathematics
dc.title: Pathfinding med reinforcement learning i delvis observerbara miljöer
dc.type.degree: Examensarbete för kandidatexamen [sv]
dc.type.degree: Bachelor Thesis [en]
dc.type.uppsok: M2
Original bundle
Name: 257380.pdf
Size: 1.45 MB
Format: Adobe Portable Document Format
Description: Full text