Shaping Rewards with Temporal Information to Guide Reinforcement Learning
Type
Master's Thesis
Abstract
Reinforcement learning (RL) methods that use pretrained Vision-Language Models (VLMs) to compute rewards typically rely on a single observation of the environment. This is problematic because any information arising from the sequential nature of RL, i.e. temporal information, is disregarded. This thesis explored how temporal information can be incorporated into the VLM reward computation, first by distinguishing between fixed and adaptive temporal information. With fixed temporal information, additional inputs describe the environment's progression through time but remain constant throughout each episode. In contrast, adaptive temporal methods take additional inputs that can change as the episode progresses. Positional and directional rewards were defined to exploit fixed and adaptive temporal information, respectively, along with new supervised finetuning methods for the directional reward functions. Evaluated with a sample-efficiency metric over six robotic manipulation tasks, the best new positional rewards performed 18.4% better than previous methods, while directional rewards performed 23.0% better. Combining positional and directional rewards yielded a 25.4% improvement, the best performance achieved by any method in this thesis.
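
As a rough illustration of the fixed/adaptive distinction described above (not code from the thesis), the sketch below shows one way a VLM-style reward could incorporate temporal information: a positional reward conditions the goal prompt on a fixed, per-step time fraction, while a directional reward scores the change in image-goal similarity between consecutive observations. The embed_image and embed_text functions are hypothetical stand-ins for a pretrained VLM encoder (e.g. a CLIP-like model); all names and prompts are illustrative assumptions.

import numpy as np

# Hypothetical stand-ins for a pretrained VLM encoder (e.g. CLIP-like).
# In practice these would return normalized image/text embeddings.
def embed_image(observation: np.ndarray) -> np.ndarray:
    rng = np.random.default_rng(int(observation.sum()) % 2**32)
    v = rng.normal(size=512)
    return v / np.linalg.norm(v)

def embed_text(prompt: str) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    v = rng.normal(size=512)
    return v / np.linalg.norm(v)

def positional_reward(observation, goal_prompt, step, horizon):
    # Fixed temporal information: the extra input is the normalized step
    # index, which follows the same fixed schedule in every episode.
    progress = step / horizon
    prompt = f"{goal_prompt}, at {progress:.0%} of the episode"
    return float(embed_image(observation) @ embed_text(prompt))

def directional_reward(prev_observation, observation, goal_prompt):
    # Adaptive temporal information: the extra input (the previous
    # observation) changes as the episode unfolds; the reward is the
    # change in image-goal similarity between consecutive steps.
    goal = embed_text(goal_prompt)
    return float(embed_image(observation) @ goal
                 - embed_image(prev_observation) @ goal)

# Example usage with dummy image observations.
obs_prev = np.zeros((64, 64, 3))
obs_curr = np.ones((64, 64, 3))
r_pos = positional_reward(obs_curr, "a robot arm lifting a red block",
                          step=10, horizon=100)
r_dir = directional_reward(obs_prev, obs_curr,
                           "a robot arm lifting a red block")
print(r_pos, r_dir)

A combined reward, as evaluated in the thesis, would simply mix the two signals (for example a weighted sum), though the specific weighting and finetuning procedure are not reproduced here.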
Keywords
VLM, reinforcement learning, machine learning, transfer learning, neural networks
