Classification of Legal Documents: A Topic Modeling Approach

Type

Master's thesis

Abstract

Entering a civil dispute presents financial risks for all parties involved, and sometimes every party ends up losing money. Eperoto is a legaltech start-up in Gothenburg that aims to address this problem by providing a tool for risk analysis of the outcomes of civil disputes. The company wants to use information about previous cases to improve the tool and produce better analyses of current disputes. The category of a dispute may strongly influence the risks involved, and it could also be used to make more accurate predictions based on statistics from previous disputes of the same category. Manually annotating every case, however, is time-consuming and costly. In this thesis, we develop and evaluate an unsupervised system based on topic modeling for classifying civil dispute judgments into categories. In terms of F1-score, the system achieves results similar to those of comparable earlier supervised systems. It classified 67% of the tested documents correctly. Overall, the system performed well, especially considering that it is unsupervised. Automatically categorizing disputes with an accuracy of 67% significantly reduces the manual work needed to categorize them and contributes to improving Eperoto’s tool.

Keywords

machine learning, topic modeling, LDA, text classification, unsupervised, multi-class classification, natural language processing, civil disputes
