Using language models to improve a speech recognition based maritime emergency call detection system
Type
Master's Thesis
Program
Complex adaptive systems (MPCAS), MSc
Published
2022
Author
Johansson, Eric
Abstract
Novel applications of the Transformer architecture, together with the availability of
pre-trained models, have drastically reduced the amount of data required to train
successful speech-to-text (STT) models. Using the Connectionist Temporal Classification
(CTC) algorithm simplifies the process further, as the training data does
not have to be pre-segmented. This work aims to improve the performance of such
a model, developed to detect maritime VHF radio emergency calls, by adding a language
model to the CTC decoding. We experiment with language models trained
on several different text corpora and apply language models both during decoding
and to the resulting transcripts. The results indicate the importance of large
amounts of domain-specific text. The results also show that a reduced Word Error
Rate (WER) does not necessarily lead to an improvement in contextual comprehension.
Finally, fine-tuning various pre-trained STT models on a curated dataset is shown
to yield relatively large improvements.
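
As an illustration of the WER metric mentioned above, the following is a minimal sketch, not the evaluation code used in the thesis. It assumes the standard definition of WER as the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function name `word_error_rate` and the example transcripts are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, invented for this example.
reference = "mayday this is vessel alpha"
keeps_keyword = "mayday this is vessel alfa"
loses_keyword = "payday this is vessel alpha"

print(word_error_rate(reference, keeps_keyword))
print(word_error_rate(reference, loses_keyword))
```

Both hypothetical transcripts contain a single substituted word and therefore receive the same WER, although only the first preserves the distress keyword. This is one way to see why a lower WER need not imply better contextual comprehension.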
Subject/keywords
Speech to text, automatic speech recognition, natural language processing, NLP, language model, wav2vec2.0, VHF, emergency call detection