SSPOC: Smart Stream Processing Operator Classification

dc.contributor.author: Nilsson, Hampus
dc.contributor.author: Gustafsson, Victor
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik [sv]
dc.contributor.examiner: Papatriantafilou, Marina
dc.contributor.supervisor: Gulisano, Vincenzo
dc.contributor.supervisor: Bäckström, Karl
dc.date.accessioned: 2022-05-02T13:06:18Z
dc.date.available: 2022-05-02T13:06:18Z
dc.date.issued: 2019 [sv]
dc.date.submitted: 2020
dc.description.abstract: Stream processing is a rapidly growing field. Efficiently handling a stream processing query often requires knowing the type of each operator, since knowing its behaviour allows for tailored solutions. Today, each framework identifies operators in its own way, often using semantics and compile-time information. A more general method of classification could simplify the creation of such frameworks. Such a method requires moving away from semantic information, which differs between frameworks, towards more general information. We take a first step in this direction by using metrics available at runtime to classify a basic set of operators. In this thesis, we present a machine learning model for classifying stream processing operators. The model is a densely connected multi-layer feed-forward neural network. The operators classified are limited to a subset of the standard operators available in the stream processing framework Apache Flink. The training, validation and test datasets are also a contribution of this thesis; they were collected from public queries using our collection method. We also propose a set of features for our classifier that help differentiate operators, and we suggest that other machine-learning-based solutions can use them. The model is optimized for prediction accuracy while training on data collected from 9 different queries. It reaches a prediction accuracy of 97.51% on the validation dataset and 99.796% on the test dataset. [sv]
dc.identifier.coursecode: DATX05 [sv]
dc.identifier.uri: https://hdl.handle.net/20.500.12380/304571
dc.language.iso: eng [sv]
dc.setspec.uppsok: Technology
dc.subject: Computer [sv]
dc.subject: science [sv]
dc.subject: computer science [sv]
dc.subject: engineering [sv]
dc.subject: project [sv]
dc.subject: thesis [sv]
dc.subject: machine learning [sv]
dc.subject: neural networks [sv]
dc.subject: stream processing [sv]
dc.title: SSPOC: Smart Stream Processing Operator Classification [sv]
dc.type.degree: Master's thesis [sv]
dc.type.uppsok: H

Download

Original bundle

Name: CSE 19-80 Gustafsson Nilsson.pdf
Size: 1.36 MB
Format: Adobe Portable Document Format
Description: SSPOC: Smart Stream Processing Operator Classification