A Cluster Analysis Framework using DiMaxL - Clustering and data reduction in the presence of noise

dc.contributor.author: Werner, Philip
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) [en]
dc.date.accessioned: 2019-07-03T12:23:29Z
dc.date.available: 2019-07-03T12:23:29Z
dc.date.issued: 2010
dc.description.abstract: The goal of this project was to develop a framework that can be used to make accurate predictions on large sampled data sets in which statistical outliers are present. A secondary aim was to develop a method for reducing the large amount of data that is sometimes available, not all of which is useful when making a prediction. The framework was developed using the DiMaxL algorithms and was tested on data sets taken from biology. These data sets consist of protein measurements containing a large number of statistical outliers. The results indicate that the method can accurately detect patterns, even in the presence of a large amount of noise, without excessive overfitting. In the case of data reduction, the accuracy of the method is more sensitive to the amount of available data, and a semi-automatic procedure is recommended. In conclusion, the framework is able to remove noise effectively while detecting the underlying pattern, even in complex correlations.
dc.identifier.uri: https://hdl.handle.net/20.500.12380/126759
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Datavetenskap (datalogi)
dc.subject: Computer Science
dc.title: A Cluster Analysis Framework using DiMaxL - Clustering and data reduction in the presence of noise
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
dc.type.uppsok: H

Download

Original bundle

Name: 126759.pdf
Size: 1.87 MB
Format: Adobe Portable Document Format
Description: Full text