A Computational Grammar and Lexicon for Maltese

Publicerad

Typ

Examensarbete för masterexamen
Master Thesis

Modellbyggare

Tidskriftstitel

ISSN

Volymtitel

Utgivare

Sammanfattning

Maltese is the national language of Malta and an official language of the European Union. While classified as Semitic, Maltese has been heavily influenced by the Romance languages and English, and features both root-and-pattern and concatenative morphologies. Despite its active use, the language is highly under-resourced in digital terms. This thesis contributes two computational resources for Maltese: a grammar and an online full-form lexicon. The first part of this thesis deals with a computational grammar for Maltese, which is implemented using the Grammatical Framework (GF). GF is a multilingual grammar formalism based on using abstract syntax trees as language-independent semantic representations. Its Resource Grammar Library (RGL) already covers the morphology and basic syntax of some 27 languages from around the world. Maltese is the 28th addition to the RGL, and the first Semitic language in the library to be completed. The smart paradigms implemented in the morphological part of grammar allow full inflection tables to be produced for any lexical unit, often requiring only a lemmatised form. This report looks at some of the more interesting implementational details of the grammar, discussing the compromises that had to be made along the way. The second part covers the collection of various Maltese lexical resources into a single searchable collection, using a schema-less database to accommodate partial data from heterogeneous sources. We then use the smart paradigms from the morphological part of the grammar to automatically produce some 4 million inflection forms and extend the collection into a full-form computational lexicon, which can be used in for morphological lookup and spell checking. All the software and resources described in this thesis are open-source and free to use for any purpose.

Beskrivning

Ämne/nyckelord

Data- och informationsvetenskap, Computer and Information Science

Citation

Arkitekt (konstruktör)

Geografisk plats

Byggnad (typ)

Byggår

Modelltyp

Skala

Teknik / material

Index

item.page.endorsement

item.page.review

item.page.supplemented

item.page.referenced