Predictive Caching

Type
Master's thesis
Published
2019
Authors
Lizbat Lawrence, Nickey
Sebastian, Renjith
Abstract
Although digitization has revolutionized the entertainment industry, it is streaming services such as Netflix and Spotify that made content available to users on hand-held devices. These services require an active internet connection to deliver the requested content to the user's device, which consumes the user's expensive mobile data subscription. The aim of this thesis project is to reduce mobile data usage by predicting the content a user is most likely to download, so that it can be pre-fetched while the user's device is connected to a high-bandwidth, less expensive network. Three use cases were considered to identify the content a user is most likely to download over a mobile data subscription. First, users are highly likely to download the personalized content recommended by these services. Hence, user behavior on personalized content was modeled using a Logistic Regression algorithm as a generic baseline approach. Second, users tend to stream content on multiple devices, and they are very likely to play the same content on different devices. This gives strong pre-caching potential, because content viewed or listened to on one device can be used to predict the likely streaming behavior on the user's other devices. Third, users prefer to play content from the playlists provided by streaming services. The third use case exploited user behavior on playlists to predict the content a user is likely to download in the future. We employed a Gradient Boosting algorithm to model the device sync and playlist use cases. The results were evaluated using a generic evaluation metric defined specifically for this purpose, and the use cases were compared. The device sync model predicted 15% of the potential savings identified through data analysis, whereas the playlist model predicted 30%.
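The pre-fetching idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the `Candidate` type, the greedy `select_prefetch` rule, the cache budget, the example data, and in particular `savings_fraction`, which is only one plausible way to express "share of potential savings" and is not the evaluation metric the authors defined. The scores stand in for the output of any model, such as the logistic regression or gradient boosting models mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    item_id: str
    score: float    # hypothetical model probability that the user downloads the item
    size_mb: float  # hypothetical download size


def select_prefetch(candidates, budget_mb):
    """Greedily pick the highest-scoring items that still fit the cache budget."""
    chosen = []
    used = 0.0
    for c in sorted(candidates, key=lambda c: c.score, reverse=True):
        if used + c.size_mb <= budget_mb:
            chosen.append(c.item_id)
            used += c.size_mb
    return chosen


def savings_fraction(prefetched, mobile_plays, sizes):
    """Share of mobile-data volume that the pre-fetched items would have avoided."""
    cached = set(prefetched)
    potential = sum(sizes[i] for i in mobile_plays)
    if potential == 0:
        return 0.0
    saved = sum(sizes[i] for i in mobile_plays if i in cached)
    return saved / potential


# Made-up example data: four candidate items and a 100 MB cache budget.
candidates = [
    Candidate("a", 0.9, 40.0),
    Candidate("b", 0.7, 80.0),
    Candidate("c", 0.4, 30.0),
    Candidate("d", 0.2, 50.0),
]
sizes = {c.item_id: c.size_mb for c in candidates}

prefetched = select_prefetch(candidates, budget_mb=100.0)
# Items the user later played over mobile data (also made up).
fraction = savings_fraction(prefetched, mobile_plays=["a", "b"], sizes=sizes)
print(prefetched, round(fraction, 3))
```

In this toy run the cache holds items `a` and `c`; only `a` is later played over mobile data, so a third of the potential mobile-data volume is saved. A greedy rule is used purely for brevity; the record does not state how the thesis selects items to pre-fetch.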
Subject/keywords
Computer science, engineering, thesis, machine learning, predictive caching, user behavior, xgboost, logistic regression