Predicting retention among application users with online ensemble learning models
Type
Master's thesis
Program
Engineering mathematics and computational science (MPENM), MSc
Published
2019
Author
EKBORG, OLOF
Abstract
Most service-providing companies regard customer retention as their most important
asset for improving profitability. Even for services and applications without paying
customers, user retention is essential, since retained users generate more advertisement
impressions and strengthen the reputation of the brand. The ability to foresee which
users will be retained and which are likely to churn is therefore highly valuable for
any expanding company.
Forza Football is one of the world’s most popular football live score applications,
with millions of weekly active users. New data on user activity in the application
arrives sequentially as a stream. To predict future user activity, a
model must be able to adapt to seasonal drifts in activity. The model must furthermore
remain scalable and time efficient when analyzing newly arrived instances,
given that each instance comprises several million observations. Motivated by
these requirements, this thesis uses a data stream of previous user activities
to predict the activities in upcoming instances. State-of-the-art ensemble classification
methods are adapted to an online learning environment to incorporate
both historical and current information at low computational cost.
Several predictive models are proposed that obtain accurate predictions while
remaining efficient in terms of storage and computation time. The models are stable in
detecting and adjusting to concept drift.
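
To make the idea of an online ensemble more concrete, the following is a minimal sketch of a chunk-based ensemble that is updated one instance (batch) at a time. It is not the method developed in the thesis: the decision-stump base learner, the accuracy-based weighting rule, the synthetic two-feature stream, and all names (`train_stump`, `ChunkEnsemble`, `make_batch`, `run_demo`) are assumptions made purely for illustration. Each new batch is first used to test the current ensemble and then to train one new member, so only a fixed number of small models is stored rather than the historical data.

```python
import random


def train_stump(batch):
    """Fit a one-feature threshold rule (decision stump) on a single batch."""
    best = None  # (training errors, feature index, threshold, label above threshold)
    for f in range(len(batch[0][0])):
        for x, _ in batch:
            t = x[f]
            for above in (0, 1):
                errors = sum(
                    (above if xi[f] > t else 1 - above) != yi for xi, yi in batch
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return lambda x: above if x[f] > t else 1 - above


def accuracy(model, batch):
    return sum(model(x) == y for x, y in batch) / len(batch)


class ChunkEnsemble:
    """Keeps at most max_members models, weighted by accuracy on the newest batch."""

    def __init__(self, max_members=5):
        self.max_members = max_members
        self.members = []  # list of [model, weight]

    def predict(self, x):
        # Weighted vote; an empty or all-zero-weight ensemble predicts class 0.
        score = sum(w * (1 if m(x) == 1 else -1) for m, w in self.members)
        return 1 if score > 0 else 0

    def update(self, batch):
        self.members.append([train_stump(batch), 0.0])
        for member in self.members:
            # Chance-level or worse models get weight 0 (illustrative rule).
            # The newest member is scored on its own training batch, which is optimistic.
            member[1] = max(0.0, 2 * accuracy(member[0], batch) - 1)
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members:]  # discard the weakest members


def make_batch(rng, n, flipped):
    """Synthetic batch: label depends on feature 0; flipped simulates concept drift."""
    batch = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] > 0.5)
        batch.append((x, 1 - y if flipped else y))
    return batch


def run_demo(n_batches=10, batch_size=200, seed=0):
    rng = random.Random(seed)
    ensemble = ChunkEnsemble()
    accuracies = []
    for i in range(n_batches):
        batch = make_batch(rng, batch_size, flipped=(i >= n_batches // 2))
        accuracies.append(accuracy(ensemble.predict, batch))  # test first
        ensemble.update(batch)  # then train
    return accuracies


if __name__ == "__main__":
    for i, acc in enumerate(run_demo(), start=1):
        print(f"batch {i}: accuracy before update = {acc:.2f}")
```

In this toy stream the labelling rule is inverted halfway through, so the test-then-train accuracy should drop sharply on the first batch after the change and recover once the outdated members have been down-weighted. A production setting with several million observations per instance would need a far more efficient base learner than this exhaustive stump search.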
Subject/keywords
Online learning, Data stream analysis, Concept drift, Random Forest, Decision tree ensembles, Retention prediction, Churn prevention