Learning to rank, a supervised approach for ranking of documents

Master Thesis

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12380/219663
Download file(s):
File | Description | Size | Format
219663.pdf | Fulltext | 999.09 kB | Adobe PDF
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tapper, Kristofer |
dc.contributor.department | Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) | sv
dc.contributor.department | Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) | en
dc.date.accessioned | 2019-07-03T13:45:17Z | -
dc.date.available | 2019-07-03T13:45:17Z | -
dc.date.issued | 2015 |
dc.identifier.uri | https://hdl.handle.net/20.500.12380/219663 | -
dc.description.abstract | As information becomes more accessible everywhere and new information is produced at a rapidly growing rate, the systems and models that retrieve this information deserve more attention. The purpose of this thesis is to investigate state-of-the-art machine learning methods for ranking, known as learning to rank. The goal is to explore whether learning to rank can be used in enterprise search, which involves less data and fewer document features than web-based search. Several state-of-the-art algorithms from RankLib (Dang, 2011) were compared on benchmark datasets. Further, Fidelity Loss Ranking (Tsai et al., 2007) was implemented and added to RankLib. The tests showed that the machine learning algorithms in RankLib performed similarly and that the size of the training sets and the number of features were crucial. Learning to rank is a possible alternative to the standard ranking models in enterprise search only if there are enough features and enough training data. Advice for an implementation of learning to rank in Apache Solr is given, which can be useful for future development. Such an implementation requires a thorough understanding of how the Lucene core works at a low level. |
dc.language.iso | eng |
dc.setspec.uppsok | Technology |
dc.subject | Informations- och kommunikationsteknik |
dc.subject | Data- och informationsvetenskap |
dc.subject | Information & Communication Technology |
dc.subject | Computer and Information Science |
dc.title | Learning to rank, a supervised approach for ranking of documents |
dc.type.degree | Examensarbete för masterexamen | sv
dc.type.degree | Master Thesis | en
dc.type.uppsok | H |
Collection: Examensarbeten för masterexamen // Master Theses



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.