Flow-Based Detection of Linux Backdoor Communication - A NetFlow Based ML-Approach to Backdoor Detection in Linux Environments

Espinosa, Naomi; Malki, Lenia

dc.contributor.author	Espinosa, Naomi
dc.contributor.author	Malki, Lenia
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering	en
dc.contributor.examiner	Sabelfeld, Andrei
dc.contributor.supervisor	Sabelfeld, Andrei
dc.date.accessioned	2025-01-08T10:32:11Z
dc.date.available	2025-01-08T10:32:11Z
dc.date.issued	2024
dc.date.submitted
dc.description.abstract	The increasing prevalence of Linux-based systems and their susceptibility to malware attacks call for effective mechanisms to detect backdoor communication. This thesis explores the application of machine learning (ML) models to detect backdoor communication in Linux environments using flow-based data, specifically NetFlow traffic data. The study aims to determine how effectively ML techniques identify malicious patterns associated with backdoor communication without inspecting the actual payload. Linux systems are underrepresented in existing benchmark datasets, which predominantly focus on Windows environments. To address this gap, our research trains models on flow data specific to Linux malware and environments. After data preprocessing steps including feature mapping, aggregation, scaling, and feature selection with methods such as the ANOVA F-test, models were trained and evaluated on both benign and malicious traffic datasets. The results indicate that ensemble models such as Random Forest (RF) and Extreme Gradient Boosting (XGBoost) can effectively distinguish between normal and anomalous traffic patterns, highlighting the potential of flow-based detection systems for enhancing network security. The Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance, which further improved detection performance, although the improvement was in terms of precision. We conclude that flow-based data is a valuable resource for training models to classify malicious traffic in Linux environments. Future work will focus on acquiring or creating higher-quality datasets of Linux malware traffic to improve the capabilities of detection systems.
dc.identifier.coursecode	DATX05
dc.identifier.uri	http://hdl.handle.net/20.500.12380/309054
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	backdoor detection
dc.subject	machine learning
dc.subject	NetFlow
dc.subject	Linux
dc.subject	malware
dc.subject	network security
dc.subject	anomaly detection
dc.subject	flow-based data
dc.subject	big data
dc.title	Flow-Based Detection of Linux Backdoor Communication - A NetFlow Based ML-Approach to Backdoor Detection in Linux Environments
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Computer systems and networks (MPCSN), MSc
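
The abstract names the ANOVA F-test as one of the feature selection methods. The following is a minimal sketch of how such a score could rank flow features, written in plain Python. The feature names (`duration_s`, `bytes`, `packets`), the toy flow records, and the helper functions are hypothetical and serve only as illustration; they are not taken from the thesis, and the thesis may have used a library implementation instead.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2:
        raise ValueError("need at least two classes")
    grand_mean = mean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(rows, labels, feature_names, k):
    """Score every feature column and return the k highest-scoring names."""
    scores = {
        name: anova_f_score([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical aggregated flow records: (duration_s, bytes, packets).
FEATURES = ["duration_s", "bytes", "packets"]
ROWS = [
    (1.0, 500, 5), (2.0, 700, 7), (1.5, 600, 6),       # benign flows
    (30.0, 520, 5), (28.0, 680, 8), (32.0, 610, 6),    # malicious flows
]
LABELS = [0, 0, 0, 1, 1, 1]  # 0 = benign, 1 = malicious

selected, scores = select_top_k(ROWS, LABELS, FEATURES, k=1)
print(selected, scores)
```

In this toy data only the flow duration differs clearly between the two classes, so it receives by far the highest F score. A feature whose class means are close relative to its within-class spread scores near zero and would be dropped.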

Original bundle

Name: CSE 24-48 LM NSE.pdf
Size: 1.8 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission

Collections

Master's theses