Quantum Models for Word-Sense Disambiguation
Master's thesis
Complex adaptive systems (MPCAS), MSc
In recent years, developments in machine learning have had a tremendous impact on Natural Language Processing (NLP). However, state-of-the-art language models contain billions of parameters that require vast computational resources to optimize, and they capture syntactic rules only from data, which does not allow an extensive analysis of the underlying logic of language. To reduce the parameter space of NLP models and close the gap between logic-based language models and statistical vector space models, Coecke, Sadrzadeh, and Clark introduced a compound framework called the Compositional Distributional Model of Meaning, based on Lambek's pregroup grammar and quantum theory. This thesis investigates applying the Compositional Distributional Model of Meaning to the word-sense disambiguation task by Kartsaklis, Sadrzadeh, and Pulman. Different quantum embeddings are evaluated in terms of their disambiguation power, given a matching context. One focus lies on the description of ambiguous words as mixed states. Mixed states are probabilistic quantum states, expressed as density matrices, which entail a lack of knowledge about the underlying system. Empirical data was gathered from experiments using quantum circuits and classical computations. We evaluate the performance and discuss the challenges and limitations of current quantum computing models. The results confirm the comprehensiveness of the Compositional Distributional Model of Meaning and show statistical indications that density matrices provide a richer representation of words.
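The description of an ambiguous word as a mixed state can be illustrated with a minimal NumPy sketch. It is not taken from the thesis: the word "bank", its two sense vectors, the equal sense probabilities, and the helper names `pure_state`, `mixed_state`, and `purity` are all assumptions chosen for illustration. The sketch builds the density matrix as a probability-weighted sum of pure sense states and uses the purity Tr(rho^2) to show the lack of knowledge about which sense is meant.

```python
import numpy as np

def pure_state(vec):
    """Density matrix |v><v| of a normalized state vector (hypothetical helper)."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

def mixed_state(senses, probs):
    """Probability-weighted mixture of pure sense states: sum_i p_i |s_i><s_i|."""
    probs = np.asarray(probs, dtype=float)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("sense probabilities must sum to 1")
    return sum(p * pure_state(s) for p, s in zip(probs, senses))

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state, less than 1 for a mixed state."""
    return float(np.real(np.trace(rho @ rho)))

# Toy assumption: "bank" has two orthogonal senses in a 2-dimensional space.
finance = [1.0, 0.0]
river = [0.0, 1.0]

bank = mixed_state([finance, river], [0.5, 0.5])

print(np.real(bank))                # density matrix of the ambiguous word
print(purity(pure_state(finance)))  # purity of an unambiguous sense
print(purity(bank))                 # purity of the ambiguous word
```

In this toy setting the unambiguous sense has purity 1, while the evenly mixed word has purity 0.5, the minimum for a two-dimensional system. A context that favours one sense would correspond to shifting the weights towards that sense, which raises the purity again.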
Quantum Natural Language Processing (QNLP), Compositional Distributional Model of Meaning, Word-Sense Disambiguation, Quantum Computing