Anomaly Detection and Fault Localization: An Automated Process for Advertising Systems

Type
Master Thesis
Program
Computer science – algorithms, languages and logic (MPALG), MSc
Published
2018
Authors
Persson, Moa
Rudenius, Linnea
Abstract
The aim of this thesis is to develop an automated process for identifying anomalies in time series and narrowing down their possible root causes. The thesis is divided into three parts: forecasting, anomaly detection, and fault localization. In the forecasting part, several time series models commonly used for forecasting were evaluated, and an exponential smoothing state space model was found to be the best fit for the data used in the project. In the anomaly detection part, an anomaly was defined as a significant deviation from a forecasted value, and different methods for determining what counts as a significant deviation were explored. A threshold learning algorithm was found to be the best method for identifying anomalies. It uses input provided by operators together with an updating rule that increases or decreases the current threshold. In the last part, two fault localization algorithms were implemented, and their results were compared to see which found the largest number of correct root causes. The best-performing algorithm was a modified version of the Adtributor algorithm [3]; the modifications made the algorithm recursive and adjusted the criteria used to determine root cause candidates. The results of the forecasting and anomaly detection parts were varied. We believe this is due to the limited amount of labelled data available and the differing characteristics of the time series used. The results of the fault localization were, however, very promising, but they need to be evaluated on a larger test set. Combining these three components, we believe that the automated process has great potential for discovering anomalies and narrowing down their root causes in a real application.
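To make the forecast-then-threshold idea concrete, the following is a minimal sketch and not the thesis's implementation. It assumes simple exponential smoothing as a stand-in for the exponential smoothing state space model, a relative deviation measure, and a multiplicative updating rule; the names ses_forecasts and LearnedThreshold, and the values of alpha, threshold and step, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def ses_forecasts(series, alpha=0.5):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the prediction for series[t] made from series[:t];
    the first forecast is seeded with the first observation.
    (Assumption: a stand-in for the state space model in the abstract.)
    """
    level = series[0]
    forecasts = []
    for value in series:
        forecasts.append(level)
        level = alpha * value + (1 - alpha) * level
    return forecasts


@dataclass
class LearnedThreshold:
    threshold: float = 0.3  # hypothetical starting relative deviation
    step: float = 0.1       # hypothetical multiplicative adjustment

    def is_anomaly(self, actual, forecast):
        # Relative deviation of the observation from its forecast.
        deviation = abs(actual - forecast) / max(abs(forecast), 1e-9)
        return deviation > self.threshold

    def feedback(self, flagged, operator_says_anomaly):
        # Assumed updating rule: operator input moves the threshold.
        if flagged and not operator_says_anomaly:
            self.threshold *= 1 + self.step  # false alarm: less sensitive
        elif not flagged and operator_says_anomaly:
            self.threshold *= 1 - self.step  # missed anomaly: more sensitive


series = [100, 102, 98, 101, 160, 100]
forecasts = ses_forecasts(series, alpha=0.5)
detector = LearnedThreshold()
flags = [detector.is_anomaly(a, f) for a, f in zip(series, forecasts)]
```

In this toy series only the spike to 160 deviates from its forecast by more than the starting threshold. Calling detector.feedback(True, False) after an operator rejects a flag would raise the threshold, so similar deviations are less likely to be flagged later.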
Subject/keywords
Computer and Information Science