Classification of Legal Documents: A Topic Modeling Approach

dc.contributor.author: Carlsson, Hanna
dc.contributor.author: Lindgren, Tobias
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik
dc.contributor.examiner: Angelov, Krasimir
dc.contributor.supervisor: Johansson, Moa
dc.contributor.supervisor: Heggemann, Olof
dc.date.accessioned: 2021-10-01T08:46:27Z
dc.date.available: 2021-10-01T08:46:27Z
dc.date.issued: 2021
dc.date.submitted: 2020
dc.description.abstract: Entering a civil dispute presents financial risks for all parties involved, and sometimes every party ends up losing money. Eperoto is a legaltech start-up in Gothenburg that aims to address this problem by providing a tool for risk analysis of the outcomes of civil disputes. The company wants to use information about previous cases to improve the tool and to analyze current disputes more accurately. The category of a dispute could play an essential role in the risks involved, and it could also be used to make more accurate predictions based on statistics from previous disputes of the same category. Manually annotating every case is time-consuming and costly. In this thesis, we develop and evaluate an unsupervised system based on topic modeling for classifying civil dispute judgments into categories. In terms of F1-score, the system achieves results similar to those of previous supervised systems, and it classified 67% of the tested documents correctly. Overall, the system performed well, especially considering that it is unsupervised. Automatically categorizing disputes with an accuracy of 67% significantly reduces the manual work needed and contributes to improving Eperoto's tool.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/304213
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: machine learning
dc.subject: topic modeling
dc.subject: LDA
dc.subject: text classification
dc.subject: unsupervised
dc.subject: multi-class classification
dc.subject: natural language processing
dc.subject: civil disputes
dc.title: Classification of Legal Documents: A Topic Modeling Approach
dc.type.degree: Master's thesis (Examensarbete för masterexamen)
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
Download
Original bundle
Name: CSE 21-131 Carlsson Lindgren.pdf
Size: 1000.21 KB
Format: Adobe Portable Document Format