Analysis and Generation of Wikidata Descriptions Focusing on Bangla Language

dc.contributor.author: Rakib Imtiaz, Mohammad
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering [en]
dc.contributor.examiner: Ranta, Aarne
dc.contributor.supervisor: Listenmaa, Inari
dc.date.accessioned: 2026-01-23T14:51:07Z
dc.date.issued: 2025
dc.date.submitted:
dc.description.abstract: We present a Grammatical Framework (GF)-based resource grammar for Bangla designed to automatically generate structured natural language descriptions for Wikidata entities. The system covers multiple entity types including cities, universities, islands, lakes, and humans. Unlike statistical or black-box models, our approach uses a rule-based grammar that guarantees grammatical correctness and structural consistency. Evaluations on more than 76,000 entities demonstrate high coverage (over 99%) and strong alignment with source descriptions, as shown by multilingual embedding similarity. Our results show that the generated Bangla descriptions not only complement existing entries but often exceed them in semantic consistency. This work offers a practical solution for enhancing low-resource language content in multilingual knowledge bases.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/310944
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: grammatical framework
dc.subject: bangla
dc.subject: wikidata
dc.subject: natural language generation
dc.subject: computational linguistics
dc.subject: resource grammar library
dc.title: Analysis and Generation of Wikidata Descriptions Focusing on Bangla Language
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master's Thesis [en]
dc.type.uppsok: H
local.programme: Data science and AI (MPDSC), MSc

Download

Original bundle

Name: CSE 25-161 MRI.pdf
Size: 737.14 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.35 KB
Format: Item-specific license agreed upon to submission