Exploring Image-to-Text Visual Search Using Open Source Models

Liu, Tommy

dc.contributor.author	Liu, Tommy
dc.contributor.department	Chalmers tekniska högskola / Institutionen för elektroteknik	sv
dc.contributor.examiner	Häggström, Ida
dc.contributor.supervisor	Häggström, Ida
dc.contributor.supervisor	Dahlin, Albert
dc.date.accessioned	2026-02-09T13:23:57Z
dc.date.issued	2026
dc.date.submitted
dc.description.abstract	Visual search refers to searching with visual data, typically images, rather than textual input. Most visual search implementations perform similarity search over image features: a user-submitted query image is compared against the features of all searchable entries, and sufficiently similar entries are returned. This thesis explores a different method that uses image descriptions generated by vision-language models instead of image features; the descriptions are converted into embeddings, which are then matched against the other search entries. Evaluation data indicate that the method can provide satisfactory retrieval performance while maintaining a low search query execution time, provided that an adequate vision-language model is employed and sufficient server capacity is available.
dc.identifier.coursecode	EENX30
dc.identifier.uri	http://hdl.handle.net/20.500.12380/310968
dc.language.iso	eng
dc.setspec.uppsok	Technology
dc.subject	visual search
dc.subject	machine learning
dc.subject	deep learning
dc.subject	embedding
dc.subject	vision-language model
dc.subject	transformer
dc.subject	e-commerce
dc.title	Exploring Image-to-Text Visual Search Using Open Source Models
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.degree	Master's Thesis	en
dc.type.uppsok	H
local.programme	Data science and AI (MPDSC), MSc
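
The retrieval approach summarized in the abstract (describe each image in text, embed the descriptions, match by embedding similarity) might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the catalogue descriptions are hard-coded stand-ins for vision-language model output, `embed` is a hypothetical bag-of-words function used in place of a real text-embedding model, and cosine similarity is assumed as the matching measure. All names here (`CATALOGUE`, `embed`, `cosine`, `search`) are illustrative.

```python
import math
from collections import Counter

# Hypothetical catalogue. In the approach described in the abstract, each
# description would be generated by a vision-language model; here they are
# hard-coded so the sketch runs without any model.
CATALOGUE = {
    "item-1": "red leather handbag with gold buckle",
    "item-2": "blue denim jacket with metal buttons",
    "item-3": "black running shoes with white sole",
}


def embed(text: str) -> Counter:
    """Toy stand-in for a text-embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_description: str, top_k: int = 2) -> list[tuple[str, float]]:
    """Rank catalogue entries by similarity to the query image's description."""
    query_vec = embed(query_description)
    scored = [
        (item_id, cosine(query_vec, embed(description)))
        for item_id, description in CATALOGUE.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# The query description would likewise be generated from the query image.
results = search("red handbag with a gold buckle")
```

In a realistic setup the catalogue embeddings would presumably be computed once and stored, so that only the query image needs to be described and embedded at search time.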

Original bundle

Name: Thesis_Final_Report-Tommy_Liu.pdf
Size: 2.14 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission

Collections

Master's Theses