Assessing Privacy vs. Efficiency Tradeoffs in Open-Source Large Language Models
Published
Author
Type
Master's Thesis
Model builder
Journal title
ISSN
Volume title
Publisher
Abstract
LLMs are being actively deployed across various industries in applications ranging from customer service to code generation. With this rapid adoption, concerns surrounding data privacy have become increasingly urgent. While open-source LLMs are often seen as a more transparent and flexible alternative to proprietary models, the extent of their openness and privacy guarantees varies significantly, and research in this area remains limited. With regulatory pressure from the EU AI Act, many organizations must now navigate the trade-offs between transparency, privacy, and efficiency. This thesis investigates two key questions: "What are the actual privacy guarantees provided by open-source LLMs?" and "Does ensuring robust privacy safeguards in open-source LLMs necessarily compromise efficiency?". Through our evaluation process, we find no consistent link between a model's openness and its resistance to privacy attacks, nor do privacy safeguards necessarily reduce efficiency. These findings suggest that it is possible to develop or select open-source models that are both privacy-conscious and efficient.
Description
Subject/keywords
LLM, privacy, efficiency, benchmarking, open-source
