LLMs for SDVs: Automated Software Vulnerability Detection and Repair
| dc.contributor.author | Gong, Wenkang | |
| dc.contributor.author | Yan, Jieman | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Gomes de Oliveira Neto, Francisco | |
| dc.contributor.supervisor | Sun, Simin | |
| dc.contributor.supervisor | Staron, Miroslaw | |
| dc.date.accessioned | 2026-07-02T12:48:51Z | |
| dc.date.issued | 2026 | |
| dc.date.submitted | ||
| dc.description.abstract | Software-Defined Vehicles (SDVs) increasingly rely on large-scale C/C++ software stacks to implement safety-critical functionalities. While these languages provide the deterministic performance and hardware control required in automotive systems, they are also susceptible to memory-safety vulnerabilities such as buffer overflows, out-of-bounds accesses, NULL pointer dereferences, and resource management errors. Existing vulnerability analysis approaches remain essential in industrial practice but face limitations in scalability, coverage, and manual remediation effort when applied to modern automotive-scale software systems. Recent advances in Large Language Models (LLMs) have motivated increasing research interest in automated vulnerability detection and repair. This thesis presents an experimental study of a two-stage detection and repair pipeline for function-level C/C++ memory-safety vulnerability detection and repair in software relevant to SDVs. For the detection stage, the study evaluates how classification strategy, pre-trained code model selection, and inference-time threshold selection affect detection performance for four vulnerability categories: Common Weakness Enumeration (CWE)-787, CWE-476, CWE-399, and CWE-125. Detection experiments compare CodeBERT, GraphCodeBERT, and UniXcoder across specialised binary classifiers and a shared multiclass classifier. For the repair stage, the study evaluates how detection-augmented prompting using vulnerability guidance affects LLM-based automated vulnerability repair performance. Repair experiments evaluate three prompting strategies with increasing levels of vulnerability guidance. Experiments on the BigVul and PrimeVul datasets show that the specialised binary classifiers outperform the multiclass classifier for all model-CWE combinations, with per-CWE F1-score improvements ranging from +0.13 to +0.35.
The results also show that no evaluated pre-trained code model is strongest across all four CWE types. Thresholds selected on validation F1 make the detector more permissive, increasing the rate at which the ground-truth CWE reaches the repair stage by 11.6 to 21.4 percentage points; UniXcoder achieves the highest detection rate of 85.9%. For vulnerability repair, detection-augmented prompting improves vulnerability repair performance, increasing the vulnerability pattern removal rate from 28.22% under the unguided baseline to 48.43% under the detailed guided prompting strategy, while maintaining high code quality. The results indicate that specialised binary classifiers are the strongest evaluated architecture, while model selection and threshold selection still affect how these classifiers perform within the pipeline. Moreover, incorporating detection results into repair prompts proves an effective strategy for improving vulnerability repair quality, though the improvement is bounded by upstream detection accuracy. | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12380/311810 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Software-Defined Vehicles, Vulnerability Detection, Automated Program Repair, Large Language Models, Pre-trained Code Models | |
| dc.title | LLMs for SDVs: Automated Software Vulnerability Detection and Repair | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Software engineering and technology (MPSOF), MSc |
