Similarity-Based Patent Selection using Natural Language Processing

dc.contributor.author: Aliyev, Elmar
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Myreen, Magnus
dc.contributor.supervisor: Seger, Carl-Johan
dc.date.accessioned: 2021-07-02T07:21:16Z
dc.date.available: 2021-07-02T07:21:16Z
dc.date.issued: 2021
dc.date.submitted: 2020
dc.description.abstract: Many companies spend considerable resources and effort on R&D activities to keep themselves informed of the latest advances in technology. As patent data is the world’s largest technology repository, it is frequently used by technology managers for this purpose. Patent analysis of this kind usually involves much manual work, for example collecting patents to represent technology fields. It has been observed that creating such patent sets is the most critical part, since poor patent selection leads to biased results no matter how well the analysis is performed. The manual nature of the process, however, makes the quality of the patent selection questionable. This thesis studied the subject and proposed a novel method (called “SBPS”) that assists users in building effective queries and, based on these queries, finds representative patents for technology fields. The proposed method is divided into three main stages: query building, similarity calculation, and threshold finding. The first stage offers synonyms for the terms in the user’s query through the use of trained word embeddings. The second stage employs a keyword extraction algorithm to calculate document vectors and the cosine similarity measure to rank documents by similarity to the query. The third stage requires adjusting the similarity threshold within the range 0 to 1. This manual step lets users define the degree of patent relatedness to the query. To evaluate the method, four technology battles were studied from the viewpoint of their development history and compared with the histograms and growth curves extracted for the corresponding technologies using the SBPS method. The comparative analysis showed significant agreement between the historical events and the graphs and demonstrated the potential of the proposed method.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/302930
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Patent
dc.subject: NLP
dc.subject: word2vec
dc.subject: similarity
dc.subject: technology
dc.subject: patent search
dc.subject: technology watch
dc.title: Similarity-Based Patent Selection using Natural Language Processing
dc.type.degree: Master's thesis
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
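
The second and third stages described in the abstract (ranking documents by cosine similarity to the query, then keeping those above a user-chosen threshold) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `cosine_similarity` and `select_patents`, the three-dimensional toy vectors, and the threshold value 0.5 are all assumptions made for illustration. The vectors are taken to be non-negative so that the scores fall in the 0 to 1 range the abstract mentions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def select_patents(query_vec, doc_vecs, threshold):
    """Rank documents by similarity to the query and keep those at or above the threshold.

    doc_vecs maps a (hypothetical) document id to its vector.
    """
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(doc_id, score) for doc_id, score in scored if score >= threshold]


# Toy data, invented for illustration only.
toy_docs = {
    "patent_a": [1.0, 0.0, 0.0],
    "patent_b": [1.0, 1.0, 0.0],
    "patent_c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

selected = select_patents(query, toy_docs, threshold=0.5)
for doc_id, score in selected:
    print(f"{doc_id}: {score:.3f}")
```

Raising the threshold toward 1 narrows the selection to documents most closely aligned with the query, while lowering it admits more loosely related ones; with these toy vectors, `patent_c` is excluded at any positive threshold because it shares no component with the query.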
Download
Original bundle
Name: CSE 21-73 Aliyev.pdf
Size: 4.83 MB
Format: Adobe Portable Document Format