Generalization abilities of scene text detection models

Gjesdal, Andreas F.T.

dc.contributor.author	Gjesdal, Andreas F.T.
dc.contributor.department	Chalmers tekniska högskola / Institutionen för data och informationsteknik	sv
dc.contributor.examiner	Johansson, Richard
dc.contributor.supervisor	Norlund, Tobias
dc.date.accessioned	2022-10-14T12:51:44Z
dc.date.available	2022-10-14T12:51:44Z
dc.date.issued	2022	sv
dc.date.submitted	2020
dc.description.abstract	The field of scene text detection has seen massive improvements in the last year with the introduction of models that are based on deep convolutional networks. State-of-the-art performance on certain benchmark datasets is getting close to human capabilities of detecting text. The question of how well these text detection models can generalize to detect text in images from different domains is of high interest for tasks that involve a wide variety of images. This thesis analyzes the generalization abilities of two scene text detection models, EAST and DBnet, by training various instances of both models on different combinations of benchmark datasets and evaluating their performance on several datasets, some used for training and some unseen during training. The results show that both models are able to generalize text detection to a certain degree, with instances of both models achieving an average F1-score >0.6 on a selection of benchmark datasets. Both models also achieved F1-scores >0.6 on a set of images collected from social media, provided by Recorded Future, which was not used for training. Results from some of the easier benchmark datasets are, however, not indicative of performance in a highly varied domain. Finally, it was shown that both models perform quite well as classifiers of whether an image contains text or not.	sv
dc.identifier.coursecode	DATX05	sv
dc.identifier.uri	https://hdl.handle.net/20.500.12380/305715
dc.language.iso	eng	sv
dc.setspec.uppsok	Technology
dc.subject	Scene text detection	sv
dc.subject	computer vision	sv
dc.subject	deep neural network	sv
dc.subject	convolutional network	sv
dc.title	Generalization abilities of scene text detection models	sv
dc.type.degree	Examensarbete för masterexamen	sv
dc.type.uppsok	H
local.programme	Computer systems and networks (MPCSN), MSc

Original bundle

Name: CSE 22-123 Gjesdal.pdf
Size: 2.89 MB
Format: Adobe Portable Document Format

Collections

Examensarbeten för masterexamen (Master's theses)