Comparison of Arm Selection Policies for the Multi-Armed Bandit Problem

dc.contributor.author: Johansson, Fifi
dc.contributor.author: Mchome, Miriam
dc.contributor.department (sv): Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)
dc.contributor.department (en): Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)
dc.date.accessioned: 2019-07-03T14:56:24Z
dc.date.available: 2019-07-03T14:56:24Z
dc.date.issued: 2018
dc.description.abstract: Web content optimization involves deciding what content to put on a web page and how to lay out and design it, all of which require selecting a few options among many. With the advent of personalization, many companies seek to make these decisions on a per-user basis in order to improve customer experience and satisfaction. The contextual multi-armed bandit framework provides several strategies for addressing this online decision-making problem at a lower experimental cost than traditional A/B testing. In this study, we compare three common contextual bandit strategies from the literature, namely E-greedy, LinUCB, and Thompson Sampling, and apply two of them, E-greedy and LinUCB, to three datasets. In doing so, we offer further empirical evidence on the performance of these strategies and insights for practitioners on which strategy might work for them. Our results suggest that both E-greedy and LinUCB are effective at improving click-through rate compared to a random policy. The more sophisticated approach, LinUCB, achieves better results on large datasets but performs quite unstably when the number of data points is small. We also find that it is more sensitive to parameter tuning and can produce significantly worse outcomes when its parameters are poorly chosen. Our study further finds that LinUCB can have higher data requirements when evaluated offline. Collectively, the varying performance of these approaches across datasets signals the need for better tools and procedures to help practitioners choose the appropriate approach.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/256336
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Data- och informationsvetenskap
dc.subject: Computer and Information Science
dc.title: Comparison of Arm Selection Policies for the Multi-Armed Bandit Problem
dc.type.degree (sv): Examensarbete för masterexamen
dc.type.degree (en): Master Thesis
dc.type.uppsok: H
local.programme: Software engineering and technology (MPSOF), MSc
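
For readers unfamiliar with the policies named in the abstract, the following is a minimal, illustrative sketch of two of them, E-greedy and disjoint LinUCB (after Li et al., 2010), written in Python with NumPy. The arm count, context dimension, epsilon, alpha, and the logistic reward model in the toy loop are assumptions made here for illustration only, not the settings or datasets used in the thesis.

import numpy as np

rng = np.random.default_rng(0)

class LinUCB:
    # Disjoint LinUCB: an independent ridge-regression reward model per arm.
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha                                # exploration strength
        self.A = [np.eye(dim) for _ in range(n_arms)]     # per-arm X^T X + I
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # per-arm X^T r

    def select(self, x):
        # Score each arm by its estimated reward plus an upper-confidence bonus.
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

class EGreedy:
    # E-greedy over the same per-arm linear reward estimates.
    def __init__(self, n_arms, dim, epsilon=0.1):
        self.epsilon = epsilon                            # exploration rate
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, x):
        if rng.random() < self.epsilon:                   # explore uniformly
            return int(rng.integers(len(self.A)))
        estimates = [np.linalg.solve(A, b) @ x            # exploit best estimate
                     for A, b in zip(self.A, self.b)]
        return int(np.argmax(estimates))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Toy usage: 3 arms, 5-dimensional contexts, clicks drawn from an assumed
# logistic reward model (purely illustrative; not the thesis datasets).
true_theta = rng.normal(size=(3, 5))
policy = LinUCB(n_arms=3, dim=5, alpha=1.0)
for t in range(1000):
    x = rng.normal(size=5)
    arm = policy.select(x)
    p_click = 1.0 / (1.0 + np.exp(-(true_theta[arm] @ x)))
    policy.update(arm, x, float(rng.random() < p_click))

Both policies share the same linear reward model; they differ only in how they trade exploration against exploitation, which is exactly the axis the thesis compares.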
Original bundle
Name: 256336.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format
Description: Fulltext