Real World Implementation of LLM-based Log Anomaly Detection
Published
Author
Type
Master's Thesis
Abstract
The complexity of modern systems has escalated to the point where automated techniques leveraging machine learning have become indispensable for log anomaly detection. In this project, carried out in collaboration with Ericsson, we explored the feasibility of training-free approaches. We implemented the RAPID method for log anomaly detection, which uses a small dataset of "normal" logs and a pre-trained DistilBERT model to classify unseen log lines by measuring distances between their representations, requiring no training or fine-tuning. The implementation was then adapted to a dataset of logs provided by Ericsson, achieving an F1 score of 0.94 and correctly classifying 49,991 out of 49,993 anomalies. Additionally, we attempted fine-tuning the pre-trained DistilBERT model on a separate dataset of normal log lines; however, this failed to yield significant improvements. The performance of RAPID was also compared to a baseline implementation that uses bag-of-words representations. While the baseline method performed extremely well on the BlueGene/L (BGL) dataset, it fell short on the Ericsson dataset, suffering a drastic loss of performance in detecting anomalies. The results of these experiments, together with the research conducted in the log anomaly detection space, highlight the importance of result replication in this field, the limitations of the F1 metric, the challenges and trade-offs of fine-tuning models, the effectiveness of simple statistical methods relative to LLMs, and the environmental and ethical concerns of using large models in machine learning.
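The training-free idea described above can be illustrated with a minimal sketch: embed a small core set of "normal" log lines, then score an unseen line by its distance to the nearest normal embedding. This is not the thesis implementation; a toy hashed bag-of-tokens vector stands in for the DistilBERT representation that RAPID actually uses, and the threshold choice is left open, but the distance logic is the same.

```python
import numpy as np

def embed(line, dim=64):
    """Toy embedding: L2-normalized hashed bag-of-tokens vector.
    Stand-in for the pooled DistilBERT representation used by RAPID."""
    v = np.zeros(dim)
    for tok in line.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def anomaly_score(line, normal_embeddings):
    """Distance to the nearest 'normal' log line: a large distance
    means the line is unlike anything seen during normal operation."""
    e = embed(line)
    return float(np.linalg.norm(normal_embeddings - e, axis=1).min())

# Small "normal" core set -- no training or fine-tuning involved.
normal_logs = [
    "connection established to node 12",
    "connection established to node 7",
    "heartbeat ok from node 12",
]
core = np.stack([embed(l) for l in normal_logs])

print(anomaly_score("connection established to node 9", core))      # small
print(anomaly_score("kernel panic: unable to mount root fs", core))  # larger
```

A line is flagged as anomalous when its score exceeds a threshold calibrated on the normal set; swapping the toy embedding for a real pre-trained encoder changes only `embed`, not the scoring logic.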
Description
Subject/keywords
Logs, Anomaly Detection, BERT, Representations, Fine-tuning
