Unsupervised Disambiguation of Abstract

dc.contributor.author: Kalldal, Oscar
dc.contributor.author: Ludvigsson, Maximilian
dc.contributor.department [sv]: Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers)
dc.contributor.department [en]: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers)
dc.date.accessioned: 2019-07-03T14:46:07Z
dc.date.available: 2019-07-03T14:46:07Z
dc.date.issued: 2018
dc.description.abstract: Disambiguating natural text is the task of choosing the correct meaning among several possible interpretations. This thesis focuses on disambiguating parse trees created by Grammatical Framework, a formal language that represents the meaning of natural language sentences as abstract syntax trees for use in machine translation. Since each tree represents one meaning, an ambiguous sentence has several candidate trees, of which the most probable should be chosen. To achieve this, a language model on trees is defined and used to compare the candidate trees and select the one with the highest probability. Estimating the parameters of the model requires estimating the probability of each of the different meanings behind a word, which is done using the Expectation Maximization algorithm. Experiments are carried out on seven different languages to show that the method generalizes. Different smoothing techniques and different dictionaries are evaluated. A novel merged Wordnet is constructed in order to avoid sparseness. The method is evaluated by performing word sense disambiguation (a subtask of tree disambiguation) on standard data sets. The model is shown to be comparable to other unsupervised methods in SemEval 2015.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/255307
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject [sv]: Data- och informationsvetenskap
dc.subject [en]: Computer and Information Science
dc.title: Unsupervised Disambiguation of Abstract
dc.type.degree [sv]: Examensarbete för masterexamen
dc.type.degree [en]: Master Thesis
dc.type.uppsok: H
local.programme: Engineering Mathematics (300 hp)
Original bundle
Name: 255307.pdf
Size: 642.18 KB
Format: Adobe Portable Document Format
Description: Full text