Generalization abilities of scene text detection models
dc.contributor.author | Gjesdal, Andreas F.T. | |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
dc.contributor.examiner | Johansson, Richard | |
dc.contributor.supervisor | Norlund, Tobias | |
dc.date.accessioned | 2022-10-14T12:51:44Z | |
dc.date.available | 2022-10-14T12:51:44Z | |
dc.date.issued | 2022 | sv |
dc.date.submitted | 2020 | |
dc.description.abstract | The field of scene text detection has seen massive improvements in the last year with the introduction of models based on deep convolutional networks. State-of-the-art performance on certain benchmark datasets is getting close to human capabilities of detecting text. The question of how well these text detection models generalize to images from different domains is of high interest for tasks that involve a wide variety of images. This thesis analyzes the generalization abilities of two scene text detection models, EAST and DBnet, by training various instances of both models on different combinations of benchmark datasets and evaluating their performance on several datasets, some used for training and some unseen during training. The results show that both models are able to generalize text detection to a certain degree, with instances of both models achieving an average F1-score >0.6 on a selection of benchmark datasets. Both models also achieved F1-scores >0.6 on a set of images collected from social media, provided by Recorded Future, which was not used for training. Results from some of the easier benchmark datasets are, however, not indicative of performance in a highly varied domain. Finally, it was shown that both models perform quite well as classifiers of whether an image contains text or not. | sv |
dc.identifier.coursecode | DATX05 | sv |
dc.identifier.uri | https://hdl.handle.net/20.500.12380/305715 | |
dc.language.iso | eng | sv |
dc.setspec.uppsok | Technology | |
dc.subject | Scene text detection | sv |
dc.subject | computer vision | sv |
dc.subject | deep neural network | sv |
dc.subject | convolutional network | sv |
dc.title | Generalization abilities of scene text detection models | sv |
dc.type.degree | Master's thesis | sv |
dc.type.uppsok | H | |
local.programme | Computer systems and networks (MPCSN), MSc |