Evaluating Machine Learning Algorithms in Design Pattern Recognition - Exploring the Performance of Classification and Clustering Algorithms in Design Pattern Recognition Utilising Large Language Models

Type

Master's Thesis

Abstract

Design Pattern Recognition (DPR) is an ongoing research challenge in software engineering, motivated by the goal of improving software maintainability. Recent work has utilised Large Language Models (LLMs) to extract semantic information from code. This study follows up on that research and evaluates the performance of multiple classification and clustering algorithms applied to embeddings extracted from LLMs. Performance is compared across contexts that differ in LLM, design pattern, and programming language. Design pattern implementations in Java, Python, and C# were gathered from GitHub and the P-MARt repository. Each algorithm was run with tuned hyperparameters, and average performance across multiple runs was compared. The results indicate that the performance of individual algorithms varies between contexts, but the overall performance ranking of the algorithms remains the same. Classification algorithms outperformed clustering algorithms, which performed poorly on the measured metrics across all tests. The results also showed a difference in performance between behavioural, creational, and structural design patterns. This study shows further promise for the use of LLMs for DPR and recognises the need for larger studies utilising LLMs for DPR.

Subject/keywords

computer science, design patterns, machine learning, large language models, software engineering, design pattern recognition, DPR, LLM
