Solving Problems, One Role at a Time
Master's thesis
Data science and AI (MPDSC), MSc
For large companies, leveraging internal knowledge and existing information within the organization has proved difficult for several reasons. This thesis, conducted in collaboration with Ericsson, attempts to facilitate the extraction of internal knowledge, specifically by matching new issues that employees face with existing, solved ones. The issues are represented by so-called ‘support tickets’, which partly consist of manually entered text in which the user describes the problem. The support process could be optimized by automatically identifying what kind of issue the user is experiencing. This study investigates whether it is possible to extract semantic information from the text contained in support tickets through semantic role labeling (SRL), and to leverage that information to match similar issues related to Ericsson’s cloud infrastructure branch. SRL is often used for information extraction and question answering, but not in a technical domain. Two pre-trained SRL models were tested: one based on FrameNet and the other based on PropBank. Eventually, the FrameNet model was used throughout the thesis. After initial preprocessing and standardization of technical jargon, pre-trained state-of-the-art (SOTA) models were used to extract semantic information. Visual analysis and overall statistics supported the idea that the models could identify relevant targets in sentences and populate frames with roles accordingly. The information yielded through SRL allowed for new ways of representing the support tickets. However, further experiments with topic modeling and classification indicated that the information produced by the FrameNet SRL model was not useful for grouping support tickets according to the categorizations provided by Ericsson. It is suggested that the FrameNet model may be too general for the specific context and that customization of the semantic framework may be a possible solution.
It is also noted that the categorizations used as similarity proxies for the support tickets may be based on information outside the text used to represent the tickets. Even though the semantic information yielded through SRL did not improve the ability to match similar support tickets in this case, we firmly believe that these features can be helpful. Since the semantic frames provide information not otherwise present in the text, they should be able to enrich the representation.
Semantic Role Labeling, Machine Learning, Transformers, Information Extraction, Issue Resolution, Sentence Analysis, Natural Language Processing