Modulating Reinforcement-Learning Parameters Using Agent Emotions

dc.contributor.author: von Haugwitz, Rickard
dc.contributor.department: Chalmers tekniska högskola / Institutionen för tillämpad informationsteknologi (Chalmers) [sv]
dc.contributor.department: Chalmers University of Technology / Department of Applied Information Technology (Chalmers) [en]
dc.date.accessioned: 2019-07-03T13:07:30Z
dc.date.available: 2019-07-03T13:07:30Z
dc.date.issued: 2012
dc.description.abstract: When learning a strategy for social interaction in a multi-agent environment, it is often difficult to define clear goals satisfactorily, and it may not be clear what would constitute a “good” course of action in most situations. In such cases, using a computational model of emotion to provide an intrinsic reward function shifts the task to the optimisation of emotional feedback, allowing more high-level goals to be defined. While such a model is of most interest in a general, not necessarily competitive, social setting on a continuing task, it can be compared more readily with conventional reward functions on an episodic competitive task, where its benefit is less visible. A reinforcement-learning system based on the actor-critic model of temporal-difference learning was implemented using a fuzzy inference system functioning as a normalised radial-basis-function network, capable of dynamically allocating computational units as needed and of adapting its features to the observed input. While adding some computational overhead, such a system requires less manual tuning by the programmer and makes better use of existing resources. Tests were carried out on a small-scale multi-agent system with an initially hostile environment, both with fixed learning parameters and, separately, with modulated parameters that were allowed to deviate from their base values depending on the emotional state of the agent. The latter approach gave marginally better performance once the hostile elements were removed from the environment, indicating that emotion-modulated learning may lead to a somewhat closer approximation of the optimal policy in a difficult environment by focusing learning on more useful input and increasing exploration when needed.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/173825
dc.language.iso: eng
dc.relation.ispartofseries: Report - IT University of Göteborg, Chalmers University of Technology and the University of Göteborg
dc.setspec.uppsok: HumanitiesTheology
dc.subject: Informations- och kommunikationsteknik
dc.subject: Människa-datorinteraktion (interaktionsdesign)
dc.subject: Information & Communication Technology
dc.subject: Human Computer Interaction
dc.title: Modulating Reinforcement-Learning Parameters Using Agent Emotions
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
dc.type.uppsok: H
Name: 173825.pdf
Size: 1.57 MB
Format: Adobe Portable Document Format
Description: Full text