Advanced Algorithms to Identify Performance Degradation

dc.contributor.author: Johansson, Annika
dc.contributor.author: Otterberg, Markus
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data- och informationsteknik (Chalmers) [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering (Chalmers) [en]
dc.date.accessioned: 2019-07-03T13:55:19Z
dc.date.available: 2019-07-03T13:55:19Z
dc.date.issued: 2016
dc.description.abstract: To analyse the performance of a system, data measuring the resources it consumes are gathered. The common goal, independent of what is measured, is to draw conclusions about the performance and to see whether an update has improved or degraded it. Performance analysis in computer science becomes increasingly important as software controls more and more complex processes and requires more and more accuracy in both precision and timing. As new data are rapidly generated, automating the analysis both saves money and gives a more reliable result than manual inspection. Machine learning is common for automated data analysis. In this thesis, methods from the machine learning field are applied to performance data with the aim of identifying performance degradation. Both the data and aggregated points are analysed with k-means and k-medoids clustering algorithms, and the results show points leading to degraded performance. The performance measurements analysed are the load and memory usage of the hardware, generated during testing of the actual hardware and software in a simulated environment. The data are generated from a number of different tests running different scenarios, which gives them a large internal spread in covariance. Due to this large spread, a threshold method is not exact enough to determine the performance of a single update. In order to analyse changes in the data, aggregated adaptations consisting of the change from one point in time to another are generated. The changes are clustered for each kind of measurement, and the clustering is quantitatively and qualitatively evaluated in order to determine its success. By using two-stage hierarchical clustering, where the first layer is used to remove outliers, most of the points leading to performance degradation within the dataset are singled out.
At each stage of the clustering, different distance metrics are evaluated, and the optimal k and the corresponding weights are found algorithmically for each metric. After each way of clustering has been evaluated, the top-performing ones are chosen based on quantitative and qualitative measures, such as V-measure and adjusted Rand index. The centroids of the chosen clustering method are labelled, and every point is then labelled according to the centroid of its respective cluster. The points labelled as performance degrading are used to locate updates which led to degraded performance. Finally, the method designed is compared to what is generally required of a fault detection system to determine whether it can be used as such.
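The pipeline described in the abstract (changes between consecutive measurements, a first clustering stage that removes outliers, a second stage whose centroids are labelled) can be illustrated with a minimal Python sketch. This is not the thesis implementation; every name and parameter below is hypothetical. It assumes scalar measurements, plain k-means with absolute difference as the distance, a deterministic quantile-based start, first-stage clusters with fewer than `min_size` members treated as outliers, and a simple `threshold` rule standing in for the labelling of centroids, which the abstract does not specify.

```python
from statistics import mean


def deltas(series):
    """Change from each measurement to the next one."""
    return [b - a for a, b in zip(series, series[1:])]


def assign(values, centroids):
    """Index of the nearest centroid for every value (absolute difference)."""
    return [min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values]


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on scalars with a deterministic quantile-based start."""
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(values, centroids), centroids


def label_changes(series, k1=3, k2=2, min_size=2, threshold=5):
    """Two-stage clustering of changes; returns one label per change."""
    changes = deltas(series)
    # Stage one (assumed rule): clusters smaller than min_size are outliers.
    first, _ = kmeans_1d(changes, k1)
    kept = [i for i, lab in enumerate(first) if first.count(lab) >= min_size]
    # Stage two: cluster the remaining changes, then label each centroid.
    second, centroids = kmeans_1d([changes[i] for i in kept], k2)
    labels = ["outlier"] * len(changes)
    for i, lab in zip(kept, second):
        # Assumed rule: a centroid above the threshold marks degradation.
        labels[i] = "degraded" if centroids[lab] > threshold else "ok"
    return labels


# Made-up memory readings, one per update; not data from the thesis.
memory = [100, 100, 101, 100, 110, 110, 111, 122, 121, 621, 101, 101, 110]
labels = label_changes(memory)
# Change i describes the step from reading i to reading i + 1.
degraded_updates = [i + 1 for i, lab in enumerate(labels) if lab == "degraded"]
```

Labelling whole clusters through their centroids, rather than thresholding each point, mirrors the step in the abstract where every point inherits the label of its cluster centroid. A comparison of distance metrics, the search for the optimal k, and the V-measure and adjusted Rand index evaluation are left out of this sketch.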
dc.identifier.uri: https://hdl.handle.net/20.500.12380/238227
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: Data- och informationsvetenskap
dc.subject: Computer and Information Science
dc.title: Advanced Algorithms to Identify Performance Degradation
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master Thesis [en]
dc.type.uppsok: H
local.programme: Computer science – algorithms, languages and logic (MPALG), MSc
Name: 238227.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format
Description: Fulltext