Analysis and Generation of Wikidata Descriptions Focusing on Bangla Language

Type

Master's Thesis

Abstract

We present a Grammatical Framework (GF)-based resource grammar for Bangla designed to automatically generate structured natural language descriptions for Wikidata entities. The system covers multiple entity types including cities, universities, islands, lakes, and humans. Unlike statistical or black-box models, our approach uses a rule-based grammar that guarantees grammatical correctness and structural consistency. Evaluations on more than 76,000 entities demonstrate high coverage (over 99%) and strong alignment with source descriptions, as shown by multilingual embedding similarity. Our results show that the generated Bangla descriptions not only complement existing entries but often exceed them in semantic consistency. This work offers a practical solution for enhancing low-resource language content in multilingual knowledge bases.

Subject/keywords

Grammatical Framework, Bangla, Wikidata, natural language generation, computational linguistics, Resource Grammar Library
