A Computational Grammar and Lexicon for Maltese

Loading...
Thumbnail Image

Date

Type

Examensarbete för masterexamen
Master Thesis

Model builders

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Maltese is the national language of Malta and an official language of the European Union. While classified as Semitic, Maltese has been heavily influenced by the Romance languages and English, and features both root-and-pattern and concatenative morphologies. Despite its active use, the language is highly under-resourced in digital terms. This thesis contributes two computational resources for Maltese: a grammar and an online full-form lexicon. The first part of this thesis deals with a computational grammar for Maltese, which is implemented using the Grammatical Framework (GF). GF is a multilingual grammar formalism based on using abstract syntax trees as language-independent semantic representations. Its Resource Grammar Library (RGL) already covers the morphology and basic syntax of some 27 languages from around the world. Maltese is the 28th addition to the RGL, and the first Semitic language in the library to be completed. The smart paradigms implemented in the morphological part of grammar allow full inflection tables to be produced for any lexical unit, often requiring only a lemmatised form. This report looks at some of the more interesting implementational details of the grammar, discussing the compromises that had to be made along the way. The second part covers the collection of various Maltese lexical resources into a single searchable collection, using a schema-less database to accommodate partial data from heterogeneous sources. We then use the smart paradigms from the morphological part of the grammar to automatically produce some 4 million inflection forms and extend the collection into a full-form computational lexicon, which can be used in for morphological lookup and spell checking. All the software and resources described in this thesis are open-source and free to use for any purpose.

Description

Keywords

Data- och informationsvetenskap, Computer and Information Science

Citation

Architect

Location

Type of building

Build Year

Model type

Scale

Material / technology

Index

Endorsement

Review

Supplemented By

Referenced By