Uncovering Anomalies using Isolation Forest – A Machine Learning Approach for Request Analysis
Type
Bachelor's thesis
Abstract
In an increasingly digital era, misconduct is becoming more prevalent as online social
networks enable the creation of bots that pose as normal users. Such
misconduct takes various forms, for example emails containing unwanted
advertisements, attempts at malware distribution, or the collection of sensitive user
information. Machine learning is a well-researched way to detect this behavior,
especially through analysis of the content of messages and online
posts. This project instead explores analyzing metadata from HTTP requests
to find patterns of anomalous behavior, with the end goal being a machine learning
module that can be integrated into a larger system for request analysis.
After reviewing different approaches suggested by previous research and theoretical
reasoning, the proposed system has been designed and implemented using the Isolation
Forest model. Feature engineering has been utilized to extract information
from sequences of input requests. The system consists of two different model instances
which operate on different sequence length intervals. The selected models were
chosen by evaluating differently trained Isolation Forest instances
using precision, recall, and the F1 score as metrics.
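
The pipeline outlined above (features from request sequences, a model chosen by sequence length, evaluation by precision, recall, and F1) could be sketched roughly as follows. This is only an illustration and not the thesis implementation: the request fields, the four features, and the length boundary `SHORT_MAX_LEN` are assumptions made for the example, and the Isolation Forest itself is left out (in practice its predictions would supply `y_pred`).

```python
from statistics import mean

# Hypothetical request record: (timestamp_seconds, path, status_code).
# The abstract does not name the metadata fields, so these are assumptions.


def extract_features(requests):
    """Summarize one non-empty sequence of requests as a fixed-length vector."""
    times = [t for t, _, _ in requests]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    n = len(requests)
    return [
        n,                                          # sequence length
        mean(gaps) if gaps else 0.0,                # mean gap between requests
        len({p for _, p, _ in requests}) / n,       # share of distinct paths
        sum(s >= 400 for _, _, s in requests) / n,  # share of error responses
    ]


SHORT_MAX_LEN = 10  # hypothetical boundary between the two length intervals


def select_model(sequence_length):
    """Pick which of the two model instances should score a sequence."""
    return "short" if sequence_length <= SHORT_MAX_LEN else "long"


def precision_recall_f1(y_true, y_pred):
    """Compute the three metrics; labels are 1 = anomalous, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Each sequence would be turned into a feature vector, routed with `select_model`, scored by the corresponding model instance, and the resulting predictions compared against labeled data with `precision_recall_f1`.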
Keywords
Machine learning, Isolation Forest, Unsupervised learning, Request analysis, Anomaly detection, Feature engineering
