Learning to Navigate Over Stochastic Transport Networks Using Multi-Armed Bandits: A Contextual Approach for Efficient Online Learning in Road Network Graphs with Multi-Armed Bandits to Minimize Long- Term Travel Time

Nilsson, Hannes; Johansson, Rikard

Learning to Navigate Over Stochastic Transport Networks Using Multi-Armed Bandits: A Contextual Approach for Efficient Online Learning in Road Network Graphs with Multi-Armed Bandits to Minimize Long- Term Travel Time

Ladda ner

CSE 24-27 HN RJ.pdf (13.07 MB)

Publicerad

2024

Författare

Nilsson, Hannes

Johansson, Rikard

Typ

Examensarbete för masterexamen
Master's Thesis

Program

Data science and AI (MPDSC), MSc

Sammanfattning

As part of the ongoing process of phasing out fossil fuel vehicles, attempts have been made to extend the effective range and adoption rate of electric vehicles through navigation systems focused on energy consumption. One way to approach this problem is by viewing route selections as a multi-armed bandit problem. This allows the system to adapt and recommend better routes over time, to minimize energy consumption. For navigation systems to be useful in practice, guiding vehicles from one point to another in minimal time is crucial. Therefore, this project examines the effectiveness of multi-armed bandit algorithms for time-efficient navigation in complex real-world environments, without initial information. For this purpose, we adapt a previously studied online learning framework developed for energy efficiency, and extract road segment travel time distributions from the traffic simulation software SUMO. The framework is applied to the Luxembourg road network and our results demonstrate that contextual multi-armed bandits using tree ensembles are highly effective. More specifically, we show that TEUCB and TETS, which we implement using both XGBoost and random forest, outperform state-of-the-art contextual multi-armed bandits based on neural networks and linear models. Further, by additional comparison of TEUCB and TETS to other bandit algorithms based on tree models, we identify at least two properties to explain their high level of performance. First, tree ensemble methods appear to offer relatively accurate travel time predictions from the contextual information available in this problem. Second, the ability to generalize over different arms and infer the travel time on one road segment from observations gathered on other ones, based on similarities, seems highly advantageous for this problem.

Ämne/nyckelord

decision-making under uncertainty, online learning, machine learning, reinforcement learning, multi-armed bandits, navigation, shortest path problem

URI

http://hdl.handle.net/20.500.12380/308925

Samlingar

Examensarbeten för masterexamen

Visa fullständig post

Learning to Navigate Over Stochastic Transport Networks Using Multi-Armed Bandits: A Contextual Approach for Efficient Online Learning in Road Network Graphs with Multi-Armed Bandits to Minimize Long- Term Travel Time

Ladda ner

Publicerad

Författare

Typ

Program

Modellbyggare

Tidskriftstitel

ISSN

Volymtitel

Utgivare

Sammanfattning

Beskrivning

Ämne/nyckelord

Citation

Arkitekt (konstruktör)

Geografisk plats

Byggnad (typ)

Byggår

Modelltyp

Skala

Teknik / material

Index

URI

Samlingar

item.page.endorsement

item.page.review

item.page.supplemented

item.page.referenced