Generation of Wikidata Descriptions with Grammatical Framework
| dc.contributor.author | Bokun, Xiao | |
| dc.contributor.department | Chalmers tekniska högskola / Institutionen för data och informationsteknik | sv |
| dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering | en |
| dc.contributor.examiner | Ranta, Aarne | |
| dc.contributor.supervisor | Listenmaa, Inari | |
| dc.date.accessioned | 2026-01-15T15:04:45Z | |
| dc.date.issued | 2025 | |
| dc.date.submitted | | |
| dc.description.abstract | Wikidata is a collaborative, multilingual knowledge base that serves a wide range of purposes, such as supporting Wikipedia, and plays a significant role in ensuring equitable access to information worldwide. However, the quality and consistency of entity descriptions in different languages vary greatly, and many entities lack descriptions altogether. This thesis presents a workflow based on the Grammatical Framework (GF) for the automated generation of multilingual Wikidata entity descriptions. The system integrates property extraction, grammar design, and automatic linearization, enabling the systematic generation of multilingual descriptions while reducing, to some extent, the need for manual intervention and GF-specific expertise. Manual evaluation shows that, compared to human-written Wikidata descriptions and those generated by large language models, GF-generated descriptions achieve higher cross-linguistic consistency and factual accuracy. The workflow also supports efficient extension to additional languages, as demonstrated by the Bengali case by Mohammad Rakib Imtiaz. These results highlight the potential of GF, the GF Resource Grammar Library, and the approach introduced in this thesis for scalable, verifiable, and reliable multilingual description generation. | |
| dc.identifier.coursecode | DATX05 | |
| dc.identifier.uri | http://hdl.handle.net/20.500.12380/310891 | |
| dc.language.iso | eng | |
| dc.setspec.uppsok | Technology | |
| dc.subject | Wikidata | |
| dc.subject | Grammatical Framework | |
| dc.subject | Natural Language | |
| dc.subject | computer science | |
| dc.title | Generation of Wikidata Descriptions with Grammatical Framework | |
| dc.type.degree | Examensarbete för masterexamen | sv |
| dc.type.degree | Master's Thesis | en |
| dc.type.uppsok | H | |
| local.programme | Computer science – algorithms, languages and logic (MPALG), MSc | |
