Generation of Wikidata Descriptions with Grammatical Framework

dc.contributor.author: Bokun, Xiao
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik (sv)
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (en)
dc.contributor.examiner: Ranta, Aarne
dc.contributor.supervisor: Listenmaa, Inari
dc.date.accessioned: 2026-01-15T15:04:45Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: Wikidata is a collaborative, multilingual knowledge base that serves a wide range of purposes, such as supporting Wikipedia, and plays a significant role in ensuring equitable access to information worldwide. However, the quality and consistency of entity descriptions in different languages vary greatly, and many entities lack descriptions altogether. This thesis presents a workflow based on the Grammatical Framework (GF) for the automated generation of multilingual Wikidata entity descriptions. The system integrates property extraction, grammar design, and automatic linearization, enabling the systematic generation of multilingual descriptions while reducing, to some extent, the need for manual intervention and GF-specific expertise. Manual evaluation shows that, compared to human-written Wikidata descriptions and those generated by large language models, GF-generated descriptions achieve higher cross-linguistic consistency and factual accuracy. The workflow also supports efficient extension to additional languages, as demonstrated by the Bengali case by Mohammad Rakib Imtiaz. These results highlight the potential of GF, the GF Resource Grammar Library, and the approach introduced in this thesis for scalable, verifiable, and reliable multilingual description generation.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310891
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Wikidata
dc.subject: Grammatical Framework
dc.subject: Natural Language
dc.subject: computer science
dc.title: Generation of Wikidata Descriptions with Grammatical Framework
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc

Download

Original bundle

Name: CSE 25-127 BX.pdf
Size: 762.65 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission