Machine learning based warning system for failed procurement classification documents
Published
Author
Type
Master's thesis
Programme
Model builders
Journal title
ISSN
Volume title
Publisher
Abstract
In machine learning, a warning system is a tool that generates a warning based on
a model's prediction results. This thesis develops such a system to identify
potentially problematic procurement classification documents. A dataset was created
from a company's database, and a feature analysis was carried out to investigate
which properties of a document can cause either a classification error or a
formatting error. The most challenging part of the research was the feature
engineering, since each feature had to be preprocessed differently depending on the
importance of the information it contained.
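The idea of preprocessing each feature in its own way can be sketched as a mapping from feature name to preprocessing function. This is only an illustrative sketch: the feature names (`title`, `amount`, `category_code`) and the three preprocessing rules are hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical preprocessing rules; the thesis does not specify these.
def normalize_text(value):
    """Lowercase a free-text field and collapse repeated whitespace."""
    return " ".join(str(value).lower().split())

def log_scale(value):
    """Compress a non-negative numeric field with log(1 + x)."""
    return math.log1p(float(value))

def is_missing(value):
    """Encode whether a field is empty as 1 (missing) or 0 (present)."""
    return 1 if value is None or str(value).strip() == "" else 0

# Hypothetical feature names, each paired with its own preprocessing step.
PREPROCESSORS = {
    "title": normalize_text,
    "amount": log_scale,
    "category_code": is_missing,
}

def preprocess(document):
    """Apply each feature's own preprocessing step to one document record."""
    return {name: fn(document.get(name)) for name, fn in PREPROCESSORS.items()}

example = preprocess({"title": "  Office   SUPPLIES ", "amount": 0, "category_code": ""})
```

Keeping the rules in one dictionary makes it easy to change how a single feature is treated without touching the others.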
In addition, several supervised machine learning methods were implemented and their
hyperparameters tuned using grid search. After evaluating and comparing the models,
the XGBoost classifier was found to be the most successful in terms of both
performance and computation time, achieving 90.5% accuracy. It is anticipated that
gathering more data, especially documents containing formatting errors, will further
improve the performance of the XGBoost-based warning system.
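As a rough illustration of grid search feeding a warning rule, the sketch below tries every combination in a small hyperparameter grid, scores each on a held-out set, and then flags a document when its predicted error probability reaches a threshold. To stay self-contained it substitutes a tiny k-nearest-neighbours model for XGBoost and a hand-written loop for a library grid search. The data, the grid values, the 0.5 threshold, and all function names are assumptions made for illustration, not the thesis's setup.

```python
from itertools import product

def knn_error_probability(train, point, k, p):
    """Fraction of the k nearest training points labelled 1 (problematic)."""
    def distance(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)
    nearest = sorted(train, key=lambda row: distance(row[0], point))[:k]
    return sum(label for _, label in nearest) / k

def accuracy(train, valid, k, p):
    """Share of validation points classified correctly at a 0.5 cutoff."""
    hits = sum(
        (knn_error_probability(train, x, k, p) >= 0.5) == bool(y)
        for x, y in valid
    )
    return hits / len(valid)

def grid_search(train, valid, grid):
    """Try every hyperparameter combination and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = accuracy(train, valid, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def should_warn(train, point, params, threshold=0.5):
    """Warn when the predicted error probability reaches the threshold."""
    return knn_error_probability(train, point, **params) >= threshold

# Synthetic toy data: (features, label), where label 1 marks a problematic document.
train = [((0.0, 0.1), 0), ((0.2, 0.0), 0), ((0.1, 0.3), 0),
         ((1.0, 0.9), 1), ((0.9, 1.1), 1), ((1.2, 1.0), 1)]
valid = [((0.1, 0.1), 0), ((1.0, 1.0), 1)]

best_params, best_score = grid_search(train, valid, {"k": [1, 3], "p": [1, 2]})
```

In practice the same pattern is usually delegated to a library routine with cross-validation rather than a single held-out split, and the warning threshold is chosen to balance missed errors against false alarms.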
Description
Subject/keywords
Warning system, supervised learning, machine learning, feature engineering, XGBoost Classifier