Analysis and Generation of Wikidata Descriptions Focusing on Bangla Language
Published
Author
Type
Master's Thesis
Abstract
We present a Grammatical Framework (GF)-based resource grammar for Bangla designed to automatically generate structured natural language descriptions for Wikidata entities. The system covers multiple entity types including cities, universities, islands, lakes, and humans. Unlike statistical or black-box models, our approach uses a rule-based grammar that guarantees grammatical correctness and structural consistency. Evaluations on more than 76,000 entities demonstrate high coverage (over 99%) and strong alignment with source descriptions, as shown by multilingual embedding similarity. Our results show that the generated Bangla descriptions not only complement existing entries but often exceed them in semantic consistency. This work offers a practical solution for enhancing low-resource language content in multilingual knowledge bases.
Subject/keywords
Grammatical Framework, Bangla, Wikidata, natural language generation, computational linguistics, Resource Grammar Library
