Predictive Caching

Type

Master's thesis

Abstract

Although digitization has revolutionized the entertainment industry, it is streaming services such as Netflix and Spotify that have made content available to users on hand-held devices. These services require an active internet connection to deliver requested content to the user’s device, which consumes the user’s expensive mobile data subscription. The aim of this thesis project is to optimize mobile data usage by predicting the content a user is most likely to download, so that it can be pre-fetched while the user’s device is connected to a high-bandwidth, less expensive network. Three use cases were considered to identify the content that a user is most likely to download over a mobile data subscription. First, users are highly likely to download the personalized content recommended by these services. Hence, user behavior on personalized content was modeled using a Logistic Regression algorithm as a generic baseline approach. Second, users tend to stream content on multiple devices, and they are very likely to play the same content on different devices. This offers strong pre-caching potential, because content viewed or listened to on one device can be used to predict streaming behavior on the user’s other devices. Third, users prefer to play content from the playlists provided by streaming services; this use case exploited user behavior on playlists to predict the content a user is likely to download in the future. We employed a Gradient Boosting algorithm to model the device sync and playlist use cases. The results were evaluated using a generic evaluation metric defined specifically for this purpose, and the use cases were compared. The device sync model predicted 15% of the potential savings that were identified through data analysis, whereas the playlist model predicted 30%.
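The pre-fetch idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Candidate` schema, the probability threshold, the cache budget, and the `savings_captured` ratio are all hypothetical, and the probabilities stand in for the output of any trained model (for example, a logistic regression or gradient boosting classifier). The abstract does not define its evaluation metric, so the ratio below is only one plausible reading of "share of potential savings predicted".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One piece of content that could be pre-fetched (hypothetical schema)."""
    content_id: str
    size_mb: float
    p_download: float  # model-estimated probability of a mobile-data download


def select_prefetch(candidates, budget_mb, threshold=0.5):
    """Greedily pick likely downloads that fit in the cache budget.

    Candidates are ranked by predicted probability; anything below
    `threshold` is skipped. Both knobs are illustrative assumptions.
    """
    chosen = []
    used_mb = 0.0
    for c in sorted(candidates, key=lambda c: c.p_download, reverse=True):
        if c.p_download < threshold:
            break  # remaining candidates are even less likely
        if used_mb + c.size_mb <= budget_mb:
            chosen.append(c)
            used_mb += c.size_mb
    return chosen


def savings_captured(prefetched, mobile_downloads):
    """Share of mobile-data volume that pre-fetching would have avoided.

    `mobile_downloads` maps content_id -> size_mb for content the user
    actually fetched over mobile data. Returns a fraction in [0, 1].
    """
    potential = sum(mobile_downloads.values())
    if potential == 0:
        return 0.0
    cached_ids = {c.content_id for c in prefetched}
    saved = sum(size for cid, size in mobile_downloads.items() if cid in cached_ids)
    return saved / potential
```

In use, `select_prefetch` would run while the device is on the cheaper network, and `savings_captured` would be computed afterwards against the logged mobile-data downloads. A greedy ranking by probability is the simplest choice here; weighting by size or expected savings is an equally plausible alternative.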

Subject/keywords

Computer science, engineering, thesis, machine learning, predictive caching, user behavior, xgboost, logistic regression
