Gone Phishin’ An Investigation of Node Classification in Graphical Models for Domain Abuse Detection

Rosko, Joel; Truvé, William


dc.contributor.author	Rosko, Joel
dc.contributor.author	Truvé, William
dc.contributor.department	Chalmers tekniska högskola / Institutionen för fysik	sv
dc.contributor.department	Chalmers University of Technology / Department of Physics	en
dc.contributor.examiner	Granath, Mats
dc.contributor.supervisor	Hansson, Anders
dc.date.accessioned	2023-06-27T12:50:19Z
dc.date.available	2023-06-27T12:50:19Z
dc.date.issued	2023
dc.date.submitted	2023
dc.description.abstract	In today’s digital era, cyber attacks pose a constant threat as attackers attempt to access proprietary data and disrupt operations on a daily basis. Phishing remains their number one attack method, where users are tricked into entering sensitive information which attackers will later use or sell. The use of domain abuse detection algorithms restricts the range of attack possibilities. Furthermore, since an attack may begin as soon as a domain goes live, finding and evaluating domains quickly is of paramount importance when countering cyber threat actors. As of now, several feature-based classifiers exist and show good results in detecting domain abuse. However, their results depend on a large set of features, are complicated to interpret, and struggle to generalize as attack patterns change. In this thesis we compare feature-based classifiers with our implementation of belief propagation to evaluate whether the use of structural information and less domain-specific features can create a more interpretable and general solution. By constructing a bidirectional graph connecting autonomous system numbers, classless inter-domain routing blocks, IP addresses, domains, and tokens extracted from the URL string, a high connectivity between nodes to propagate inference is achieved. We experiment with various techniques when initiating the graph to find an appropriate setup for belief propagation. Our implementation of belief propagation achieves an accuracy of 91% on the entire dataset, which is worse than random forest with an accuracy of 94%, but with fewer false positives. With an AUC of 0.95 the classes are well distinguishable, and when optimizing thresholds and allowing nodes to be classified as “unknown”, the accuracy increases to 96%.
Overall, our findings demonstrate the potential to use belief propagation for accurately identifying suspicious domains at scale, providing a valuable tool in the fight against cyber threats.
dc.identifier.coursecode	TIFX05
dc.identifier.uri	http://hdl.handle.net/20.500.12380/306450
dc.language.iso	eng
dc.setspec.uppsok	PhysicsChemistryMaths
dc.subject	phishing, random forest, belief propagation, loopy belief propagation
dc.title	Gone Phishin’ An Investigation of Node Classification in Graphical Models for Domain Abuse Detection
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Complex adaptive systems (MPCAS), MSc
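
The approach summarized in the abstract (propagating beliefs over a graph of ASNs, CIDR blocks, IP addresses, domains, and URL tokens, then thresholding with an "unknown" band) can be illustrated with a minimal sketch. This is not the thesis implementation: the node names, priors, the homophily strength `eps`, and the thresholds `low` and `high` are all hypothetical values chosen for illustration, and the sum-product update shown is the textbook form of loopy belief propagation for two states.

```python
from collections import defaultdict

def loopy_bp(edges, priors, eps=0.1, iters=20):
    """Sum-product loopy BP with two states: 0 = benign, 1 = malicious.

    Returns the posterior belief P(malicious) for every node.
    eps is an assumed homophily strength (0 < eps < 0.5).
    """
    # Edge potential: neighbours are slightly more likely to share a state.
    psi = [[0.5 + eps, 0.5 - eps], [0.5 - eps, 0.5 + eps]]
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    # Nodes without a label start from an uninformative prior.
    phi = {n: priors.get(n, (0.5, 0.5)) for n in nbrs}
    msgs = {(u, v): (0.5, 0.5) for u in nbrs for v in nbrs[u]}
    for _ in range(iters):
        new = {}
        for (u, v) in msgs:
            out = []
            for xv in (0, 1):
                total = 0.0
                for xu in (0, 1):
                    prod = phi[u][xu] * psi[xu][xv]
                    # Combine messages from all neighbours of u except v.
                    for w in nbrs[u]:
                        if w != v:
                            prod *= msgs[(w, u)][xu]
                    total += prod
                out.append(total)
            z = out[0] + out[1]
            new[(u, v)] = (out[0] / z, out[1] / z)
        msgs = new  # synchronous update
    beliefs = {}
    for n in nbrs:
        b0, b1 = phi[n]
        for w in nbrs[n]:
            b0 *= msgs[(w, n)][0]
            b1 *= msgs[(w, n)][1]
        beliefs[n] = b1 / (b0 + b1)
    return beliefs

def classify(p, low=0.4, high=0.6):
    """Three-way decision; low and high are assumed thresholds."""
    if p >= high:
        return "malicious"
    if p <= low:
        return "benign"
    return "unknown"

# Hypothetical toy graph using documentation-reserved addresses.
edges = [
    ("domain:evil-login.example", "ip:203.0.113.7"),
    ("domain:new-login.example", "ip:203.0.113.7"),
    ("domain:evil-login.example", "token:login"),
    ("domain:new-login.example", "token:login"),
    ("ip:203.0.113.7", "cidr:203.0.113.0/24"),
    ("domain:shop.example", "ip:198.51.100.2"),
    ("domain:blog.example", "ip:198.51.100.2"),
]
priors = {
    "domain:evil-login.example": (0.05, 0.95),  # assumed known-bad label
    "domain:shop.example": (0.95, 0.05),        # assumed known-good label
}
beliefs = loopy_bp(edges, priors)
for node in sorted(beliefs):
    print(node, round(beliefs[node], 3), classify(beliefs[node]))
```

In this toy setup the unlabeled domain that shares an IP address and a token with the known-bad domain ends up with a belief above 0.5, while the unlabeled domain sharing an IP address with the known-good one ends up below 0.5. Whether either crosses into "malicious" or "benign" depends on the assumed thresholds.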


Original bundle


Name: Gone_Phishin_2023.pdf
Size: 3.12 MB
Format: Adobe Portable Document Format


License bundle


Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission


Collections

Master's theses