An Empirical Survey of Bandits in an Industrial Recommender System Setting

dc.contributor.author: Brandby, Johan
dc.contributor.author: Schwarz, Tobias
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering [en]
dc.contributor.examiner: Dubhashi, Devdatt
dc.contributor.supervisor: Jorge, Emilio
dc.date.accessioned: 2023-09-21T13:01:51Z
dc.date.available: 2023-09-21T13:01:51Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: This thesis investigates the effects of incorporating unstructured data, namely images in the wild, into contextual multi-armed bandits used in a recommender system for picture-based content suggestion. The idea is to employ image features extracted by a pre-trained convolutional neural network and to study how the bandits behave when this information is included in, or excluded from, the context, which is normally built from structured data sources such as metadata. The evaluation is carried out both online, through A/B testing enabled by the industrial partner YouPic AB, and offline, through a simulation pipeline that models the online counterpart. The results are compiled as a survey covering a selection of contextual bandit algorithms and highlighting the differences brought by the unstructured data. The offline results indicate that if the contextual bandit uses a joint or hybrid action-value function, with respect to the parameterization, instances with the image vectors can significantly outperform those without them; if a disjoint model is employed instead, no noticeable change is observed. The online trials can be interpreted as supporting the inclusion of convolutional features, but because of meager and unbalanced sample sizes, their outcomes are deemed inconclusive. In summary, although there is support for incorporating unstructured data when the action-value function is joint or hybrid, the online experiments gave too little evidence for trustworthy findings; the question therefore remains partially open.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307070
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: computer science
dc.subject: industrial application
dc.subject: machine learning
dc.subject: reinforcement learning
dc.subject: multi-armed bandits
dc.subject: MAB
dc.subject: contextual multi-armed bandits
dc.subject: survey
dc.subject: batch learning
dc.title: An Empirical Survey of Bandits in an Industrial Recommender System Setting
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master's Thesis [en]
dc.type.uppsok: H
local.programme: Data science and AI (MPDSC), MSc

Download

Original bundle

Name: CSE 23-08 JB TS.pdf
Size: 9.76 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission