Quality Attributes of Data in Distributed Deep Learning Architectures

dc.contributor.author: PRADHAN, SHAMEER KUMAR
dc.contributor.author: TUNGAL, SAGAR
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik [sv]
dc.contributor.examiner: Berger, Christian
dc.contributor.supervisor: Knauss, Eric
dc.date.accessioned: 2021-09-20T07:12:39Z
dc.date.available: 2021-09-20T07:12:39Z
dc.date.issued: 2021 [sv]
dc.date.submitted: 2020
dc.description.abstract: Large volumes of data are generated by different systems. Intelligent systems such as autonomous driving use these large volumes of data to train their artificial intelligence models. However, good-quality data is one of the foremost needs of any system that is to function effectively and safely. Especially in critical systems such as those related to autonomous driving, data quality becomes essential, as a fault in such systems could result in fatal accidents. In this thesis, a Design Science Research study is conducted to identify challenges related to data quality in a distributed deep learning system. The challenges are identified by conducting interviews with five experts from the autonomous driving domain as well as through a literature review. The challenges and their severity are validated using a survey. After the challenges are identified, five artifact components are developed that relate to assessing and improving data quality. The artifact components are the Data Quality Workflow, List of Challenges, List of Data Quality Attributes, List of Data Quality Attribute Metrics, and Potential Solutions. The abstract artifact components and concrete implementations of those components are devised and validated using a second round of interviews. In the third iteration of this study, the final artifact components are validated through a focus group session with experts and a survey. Furthermore, the artifact also presents information on which challenges affect which data quality attributes. This association between challenges and attributes is also validated in the focus group session. The results show that most of the challenge-attribute associations presumed by the researchers of this thesis are valid. Similarly, the templates developed for the artifact components are regarded as appropriate.
A contribution of this thesis to the body of software engineering and requirements engineering research is the comprehensive and unified "Data Quality Assessment and Maintenance Framework", developed as a series of artifact components. This framework can be used by researchers and practitioners to improve processes related to data quality as well as to enhance the data quality of the systems they develop. [sv]
dc.identifier.coursecode: MPSOF [sv]
dc.identifier.uri: https://hdl.handle.net/20.500.12380/304143
dc.language.iso: eng [sv]
dc.setspec.uppsok: Technology
dc.subject: Data quality [sv]
dc.subject: Data [sv]
dc.subject: Data quality attributes [sv]
dc.subject: Data quality challenges [sv]
dc.subject: Data quality workflow [sv]
dc.subject: Data quality assessment [sv]
dc.subject: Data quality maintenance [sv]
dc.subject: Design science research [sv]
dc.subject: Artifacts [sv]
dc.subject: Template [sv]
dc.subject: Deep learning [sv]
dc.subject: Distributed architecture [sv]
dc.subject: Distributed deep learning architecture [sv]
dc.subject: Advanced driver assistance systems [sv]
dc.title: Quality Attributes of Data in Distributed Deep Learning Architectures [sv]
dc.type.degree: Master's thesis (Examensarbete för masterexamen) [sv]
dc.type.uppsok: H
Download
Original bundle
Name: CSE 21-136 Tungal Pradhan.pdf
Size: 7.82 MB
Format: Adobe Portable Document Format