Real-time Relevance: RAG with Dynamic Context for Improved Natural Language Responses
Type
Master's Thesis
Program
Computer science – algorithms, languages and logic (MPALG), MSc
Data science and AI (MPDSC), MSc
Published
2024
Authors
Landgren, Malte
Giljegård, Oskar
Abstract
Today’s Retrieval-Augmented Generation (RAG) systems often struggle to answer questions that require complex multi-hop reasoning. In this thesis, we investigate an autoregressive Large Language Model (LLM) architecture that generates a dense search vector at every token-generation step, so that the retrieval query remains relevant in real time. To facilitate this, we also develop a synthetic data generation technique that produces search query vector labels at the token level and requires only a generating LLM and a document database. We investigate the quality of the synthetic data and provide an attention-based relabeling method that decreases hallucinations, improving the correctness of the labels by 67%. The architecture produces query vectors 27 times faster than a separate embedder, at the cost of retrieval accuracy. Finally, we train and employ the model in an active retrieval question-answering setting.
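As an illustration only, the idea of emitting a search vector at every generation step can be sketched as a toy decoding loop. This is not the implementation from the thesis: `toy_step`, `generate`, `retrieve`, the two-dimensional vectors and the two-document database are all hypothetical stand-ins, and cosine similarity is an assumed choice of retrieval metric.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical model interface: given the tokens so far and the last retrieved
# document, return the next token together with a dense query vector.
StepFn = Callable[[List[str], Optional[str]], Tuple[str, Vector]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: Vector, doc_vectors: List[Vector]) -> int:
    """Return the index of the document embedding most similar to the query."""
    return max(range(len(doc_vectors)), key=lambda i: cosine(query, doc_vectors[i]))


def generate(step_fn: StepFn, docs: List[str], doc_vectors: List[Vector],
             prompt: List[str], max_steps: int) -> Tuple[List[str], List[int]]:
    """Decode token by token, retrieving a document after every step."""
    tokens = list(prompt)
    retrieved: List[int] = []
    context_doc: Optional[str] = None
    for _ in range(max_steps):
        # One model call yields both the token and the query vector,
        # so no separate embedding model is called here.
        token, query = step_fn(tokens, context_doc)
        tokens.append(token)
        idx = retrieve(query, doc_vectors)
        context_doc = docs[idx]  # made available to the next step
        retrieved.append(idx)
    return tokens, retrieved


def toy_step(tokens: List[str], context_doc: Optional[str]) -> Tuple[str, Vector]:
    """Stand-in for an LLM with a query head: alternates between two queries."""
    step = len(tokens)
    query = (1.0, 0.0) if step % 2 == 0 else (0.0, 1.0)
    return f"tok{step}", query


docs = ["document about topic A", "document about topic B"]
doc_vectors = [(0.9, 0.1), (0.2, 0.8)]
tokens, retrieved = generate(toy_step, docs, doc_vectors, ["q"], max_steps=3)
```

In this sketch the retrieved document can change from one token to the next, which is the behaviour the abstract calls real-time relevance. A real system would replace `toy_step` with a trained model and the linear scan in `retrieve` with a vector index.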
Subject/keywords
LLM, RAG, active retrieval, synthetic data generation, master thesis