Evaluating Machine Learning Algorithms in Design Pattern Recognition - Exploring the Performance of Classification and Clustering Algorithms in Design Pattern Recognition Utilising Large Language Models

Andersson, Simon; Berggren, Viktor

Evaluating Machine Learning Algorithms in Design Pattern Recognition - Exploring the Performance of Classification and Clustering Algorithms in Design Pattern Recognition Utilising Large Language Models

dc.contributor.author	Andersson, Simon
dc.contributor.author	Berggren, Viktor
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.department	Chalmers University of Technology / Department of Computer Science and Engineering	en
dc.contributor.examiner	Heyn, Hans-Martin
dc.contributor.supervisor	Horkoff, Jennifer
dc.date.accessioned	2024-10-17T13:27:01Z
dc.date.available	2024-10-17T13:27:01Z
dc.date.issued	2024
dc.date.submitted
dc.description.abstract	Design Pattern Recognition (DPR) is an ongoing research challenge in the field of software engineering for increasing software maintainability in code. Recent work has utilised Large Language Models (LLMs) for extracting semantic information from code. This study follows up on previous research and investigates, explores, and evaluates the performance of multiple classification and clustering algorithms when applied to embeddings extracted from LLMs. Performance is explored between contexts using different LLMs, design patterns, and programming languages. Data for design pattern implementations was gathered for Java, Python, and C# via GitHub and the P-MARt repository. Each algorithm was run with tuned hyperparameters, and their average performance across multiple runs was compared. The results indicate variance for the individual performance of the algorithms, but the overall performance order between the algorithms remains the same. Classification algorithms outperformed clustering algorithms, and clustering algorithms had low performance in the measured metrics across all tests. The results also showed a difference in performance between behavioral, creational, and structural design patterns. This study shows further promise for the use of LLMs for DPR and recognises the need for larger studies utilising LLMs for DPR.
dc.identifier.coursecode	DATX05
dc.identifier.uri	http://hdl.handle.net/20.500.12380/308924
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	computer science
dc.subject	design patterns
dc.subject	machine learning
dc.subject	large language models
dc.subject	software engineering
dc.subject	design pattern recognition
dc.subject	DPR
dc.subject	LLM
dc.title	Evaluating Machine Learning Algorithms in Design Pattern Recognition - Exploring the Performance of Classification and Clustering Algorithms in Design Pattern Recognition Utilising Large Language Models
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Software engineering and technology (MPSOF), MSc

Ladda ner

Original bundle

Visar 1 - 1 av 1

Namn:: CSE 24-25 SA VB.pdf
Storlek:: 3.7 MB
Format:: Adobe Portable Document Format

Ladda ner

License bundle

Visar 1 - 1 av 1

Namn:: license.txt
Storlek:: 2.35 KB
Format:: Item-specific license agreed upon to submission
Beskrivning:

Ladda ner

Samlingar

Examensarbeten för masterexamen