Critical Event Prediction in Logs at Customer Network

Hajizada, Elmar

Download

Master_s_Thesis_Elmar_Hajizada_2022.pdf (1.45 MB)

Published

2022

Author

Hajizada, Elmar

Type

Master's Thesis

Program

Data science and AI (MPDSC), MSc

Abstract

Effective maintenance prognosis for radio units at Ericsson can bring several benefits, including better system safety, improved operational reliability, longer equipment lifespan, and lower maintenance costs. By forecasting whether a radio unit will raise an alarm in the near future, preventive investigations and repairs at the hardware and software level can be carried out before the unit fails. The goal of this thesis was to use multiple logs taken from a radio unit to predict whether an alarm would occur within the next one to nine days. The log file contents were divided into chunks using different approaches (expanding window, independent chunks, and time interval chunks), and each chunk was labeled according to the timestamp of the alarm. Ericsson has used a combination of verdicts (features defined by subject matter experts) to extract the best features from the log files. This rule-based approach is inefficient because the script must be modified using expert knowledge whenever the hardware design changes. The thesis achieved its goal using data-driven NLP approaches, including log parsers and word embeddings. An independent chunks approach with the Drain log parser, using concatenated bag-of-words representations of each log file as input to an XGBoost model, outperformed the other combinations of log parsers and word embeddings. An LSTM model was used with one-day interval chunks to see whether a complex sequential model could achieve a sufficient score. Experiments with complex sequential models, such as the LSTM many-to-many model with doc2vec embeddings, showed that they can predict alerts before they occur. All tested models were evaluated using cross-validation.
The XGBoost model with the independent chunks approach, using the Drain log parser and BOW embedding, achieved an average F1-score of 0.873; the LSTM model with the time interval chunks approach, using doc2vec embedding, achieved an average F1-score of 0.853 across shifting time periods from one to nine days.
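
The chunk labeling and the concatenated bag-of-words representation mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis code: the labeling rule (a chunk is positive if an alarm occurs within the horizon after the chunk ends), the function names `label_chunk` and `concatenated_bow`, the log file names, and the Drain-style templates are all assumptions made for illustration. Fitting a classifier such as XGBoost on the resulting vectors is left out.

```python
from collections import Counter
from datetime import datetime, timedelta


def label_chunk(chunk_end, alarm_times, horizon_days):
    """Assumed rule: 1 if any alarm falls in (chunk_end, chunk_end + horizon], else 0."""
    deadline = chunk_end + timedelta(days=horizon_days)
    return int(any(chunk_end < t <= deadline for t in alarm_times))


def concatenated_bow(chunk_by_file, vocab_by_file):
    """Count template occurrences per log file, then concatenate the count vectors."""
    features = []
    for name in sorted(vocab_by_file):  # fixed file order keeps columns aligned
        counts = Counter(chunk_by_file.get(name, []))
        features.extend(counts[template] for template in vocab_by_file[name])
    return features


# Hypothetical alarm time and chunk boundary.
alarms = [datetime(2022, 3, 10, 12, 0)]
chunk_end = datetime(2022, 3, 8, 0, 0)
labels = {h: label_chunk(chunk_end, alarms, h) for h in (1, 3, 9)}

# Hypothetical per-file template vocabularies and one parsed chunk.
vocab = {"a.log": ["boot <*>", "temp <*> high"], "b.log": ["link down"]}
chunk = {"a.log": ["temp <*> high", "temp <*> high", "boot <*>"], "b.log": []}
vector = concatenated_bow(chunk, vocab)
```

In this sketch the same chunk is negative for a one-day horizon and positive for the three-day and nine-day horizons, which shows why each shifting time period needs its own set of labels.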

Subject/keywords

NLP, log, predictive maintenance, classification, machine learning, word embedding, LSTM, XGBoost, AWSOM-LP, Drain.

URI

http://hdl.handle.net/20.500.12380/306943

Collections

Master's Theses
