Analysis and Generation of Wikidata Descriptions Focusing on Bangla Language
Date
Authors
Type
Master's Thesis
Programme
Model builders
Abstract
We present a Grammatical Framework (GF)-based resource grammar for Bangla designed to automatically generate structured natural language descriptions for Wikidata entities. The system covers multiple entity types including cities, universities, islands, lakes, and humans. Unlike statistical or black-box models, our approach uses a rule-based grammar that guarantees grammatical correctness and structural consistency. Evaluations on more than 76,000 entities demonstrate high coverage (over 99%) and strong alignment with source descriptions, as shown by multilingual embedding similarity. Our results show that the generated Bangla descriptions not only complement existing entries but often exceed them in semantic consistency. This work offers a practical solution for enhancing low-resource language content in multilingual knowledge bases.
Keywords
Grammatical Framework, Bangla, Wikidata, natural language generation, computational linguistics, Resource Grammar Library
