On Obfuscating JavaScript Code Using Large Language Models

dc.contributor.author: Heng, Siyu
dc.contributor.author: Cîrstoiu, Andreea-Ioana
dc.contributor.department [sv]: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.department [en]: Chalmers University of Technology / Department of Computer Science and Engineering
dc.contributor.examiner: Strüber, Daniel
dc.contributor.supervisor: Leitner, Philipp
dc.date.accessioned: 2025-10-07T12:39:56Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Large Language Models (LLMs) have become increasingly popular for their proven capabilities in code analysis and synthesis, which makes code obfuscation a natural task to explore. Generally, the purpose of obfuscation is to make a program difficult to understand. It is widely applied to JavaScript in particular, since JavaScript is a popular language for building client-side web applications, and one reason to obfuscate JavaScript code is to deter copying of proprietary work. Code obfuscation is widely applied and studied in the context of cybersecurity, particularly in relation to malware and intellectual property protection, and existing obfuscators implement obfuscation patterns of varying complexity. Research on code obfuscation using LLMs has emerged in recent years and is continually evolving. A potential gap in this research is whether LLMs can obfuscate code using patterns that current standard deobfuscators cannot reverse-engineer, and whether LLMs could replace existing obfuscators. To address this gap, it is essential to investigate whether and how LLMs can apply obfuscation transformations to code, as well as the impact that prompt engineering techniques have on the results. In this laboratory experiment, our goal is to determine the extent to which an LLM can obfuscate JavaScript code. We choose an open-weight LLM and craft prompts to obfuscate standalone, relatively simple code snippets. A key component of our work is a dedicated, free-to-use obfuscation tool that serves as our baseline for evaluating the LLM's results. We prompt the model iteratively and then analyze and interpret the results using data visualization and descriptive statistics. Our results show that the chosen LLM can obfuscate simple JavaScript code; however, the choice of prompt engineering technique is crucial. Some LLM-obfuscated code snippets differ significantly from the original code while maintaining the original behavior, but the LLM also produces obfuscated code that changes the original behavior, contains errors, or remains very similar to the original code.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310606
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: obfuscation
dc.subject: Large Language Model (LLM)
dc.subject: JavaScript
dc.subject: prompt engineering
dc.subject: software engineering
dc.title: On Obfuscating JavaScript Code Using Large Language Models
dc.type.degree [sv]: Examensarbete för masterexamen
dc.type.degree [en]: Master's Thesis
dc.type.uppsok: H
local.programme: Software engineering and technology (MPSOF), MSc
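
As a concrete illustration of the kind of obfuscation transformation discussed in the abstract, the short JavaScript sketch below shows an original snippet next to a hand-obfuscated variant that applies identifier renaming and string splitting while preserving behavior. This example is purely illustrative and assumed; it is not taken from the thesis and does not reproduce the prompts, model output, or baseline tool evaluated in the work.

    // Original snippet: short and readable.
    function greet(name) {
      return "Hello, " + name + "!";
    }

    // Illustrative obfuscated variant (assumed example, not from the thesis):
    // identifiers are renamed and the string literal is split into fragments,
    // so the code is harder to read but behaves the same.
    var _0x1 = ["Hel", "lo, ", "!"];
    function _0xa(_0xb) {
      var _0xc = _0x1[0] + _0x1[1];
      return [_0xc, _0xb, _0x1[2]].join("");
    }

    console.log(greet("world")); // "Hello, world!"
    console.log(_0xa("world"));  // "Hello, world!" (same behavior, harder to read)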

Download

Original bundle
Name: CSE 25-28 AC SH.pdf
Size: 1.42 MB
Format: Adobe Portable Document Format
