Pattern extraction to define normal driving from driver-rated data using data mining techniques

Master's thesis (Examensarbete för masterexamen)

Download file(s):
File: 193476.pdf
Description: Fulltext
Size: 905.54 kB
Format: Adobe PDF
Full metadata record
DC Field: Value (Language)
dc.contributor.author: Karlsson, Tobias
dc.contributor.department: Chalmers tekniska högskola / Institutionen för matematiska vetenskaper (sv)
dc.contributor.department: Chalmers University of Technology / Department of Mathematical Sciences (en)
dc.description.abstract: In this thesis, driver-rated data is studied using data mining techniques. The rated data consists of roughly 72 hours of data from seven drivers. The ultimate goal is to identify patterns of high rating and match them against a reference database consisting of naturalistic driving data. Two segmentations of the drives are used: equal-length subsegments and steering operations. An alternative, morphed standardised rating scale is proposed. Two data mining approaches are applied. The first method uses an ensemble classifier on features derived from the CAN data to predict the rating of each segment of the data. The second method uses an outlier detection algorithm and a hierarchical clustering approach on a distance metric based on the angles between the principal variance components of the observations. With the ensemble classifier and general variables, a large proportion of the rating variance can be explained when driver and route factors are included. Large rating values can be identified well. For the standardised rating, the prediction of high values is worse, with many false positives. The matching of signals using the covariance structure works well. Using hierarchical clustering, clusters with a standardised rating well above average can be obtained. Outliers with high standardised rating are extracted and matched against a larger database. The matches are few but similar to the original situations, because the matching is strict. In conclusion, the ensemble classifier works well for predicting rating when driver and route factors are included. The covariance-based method performs well for situation matching, and clusters with high rating can be identified. It also has the potential to be used for extracting and matching more sophisticated patterns.
dc.subject: Grundläggande vetenskaper
dc.subject: Matematisk statistik
dc.subject: Basic Sciences
dc.subject: Mathematical statistics
dc.title: Pattern extraction to define normal driving from driver-rated data using data mining techniques
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master Thesis (en)
Collection: Examensarbeten för masterexamen // Master Theses
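
The abstract mentions a distance metric based on the angles between the principal variance components of the observations, but does not give its exact definition. The sketch below shows one common way such a distance could be computed, and is not necessarily the thesis's own formulation: take the top-k principal directions of each segment and measure the principal angles between the two resulting subspaces. The function names, the choice of k, and the use of the Euclidean norm of the angles are all assumptions made for illustration.

```python
import numpy as np


def principal_directions(segment, k):
    """Top-k principal directions (as columns) of a (samples x signals) array.

    Hypothetical helper; the thesis abstract does not specify this step.
    """
    centered = segment - segment.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def principal_angle_distance(seg_a, seg_b, k=2):
    """Euclidean norm of the principal angles between two k-dim PCA subspaces.

    Assumed distance definition, chosen only to illustrate the idea.
    """
    qa = principal_directions(seg_a, k)
    qb = principal_directions(seg_b, k)
    # Singular values of qa.T @ qb are the cosines of the principal angles.
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    return float(np.linalg.norm(angles))
```

Under these assumptions, two segments whose variance is concentrated in the same signal directions get a distance near zero, while segments dominated by unrelated signals approach the maximum of sqrt(k) * pi / 2. A pairwise matrix of such distances could then be passed to a hierarchical clustering routine.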
