A Computational Grammar and Lexicon for Maltese

dc.contributor.author: Camilleri, John J.
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) [en]
dc.description.abstract: Maltese is the national language of Malta and an official language of the European Union. While classified as Semitic, Maltese has been heavily influenced by the Romance languages and English, and features both root-and-pattern and concatenative morphologies. Despite its active use, the language is highly under-resourced in digital terms. This thesis contributes two computational resources for Maltese: a grammar and an online full-form lexicon. The first part of this thesis deals with a computational grammar for Maltese, which is implemented using the Grammatical Framework (GF). GF is a multilingual grammar formalism that uses abstract syntax trees as language-independent semantic representations. Its Resource Grammar Library (RGL) already covers the morphology and basic syntax of some 27 languages from around the world. Maltese is the 28th addition to the RGL, and the first Semitic language in the library to be completed. The smart paradigms implemented in the morphological part of the grammar allow full inflection tables to be produced for any lexical unit, often requiring only a lemmatised form. This report looks at some of the more interesting implementation details of the grammar, discussing the compromises that had to be made along the way. The second part covers the collection of various Maltese lexical resources into a single searchable collection, using a schema-less database to accommodate partial data from heterogeneous sources. We then use the smart paradigms from the morphological part of the grammar to automatically produce some 4 million inflection forms and extend the collection into a full-form computational lexicon, which can be used for morphological lookup and spell checking. All the software and resources described in this thesis are open-source and free to use for any purpose.
dc.subject: Data- och informationsvetenskap
dc.subject: Computer and Information Science
dc.title: A Computational Grammar and Lexicon for Maltese
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc