Creating a reference dataset for neural network validation and evaluation: Determining key characteristics in vehicle images appropriate for binary classifier validation and evaluation
Master's thesis
Foughman Lind, Tobias
In the automotive industry, online services have recently made it possible to customize a car by personalizing its parts independently. A back-end service performs this customization by composing images of the individual parts into a fully configured vehicle. However, an image is sometimes rendered imperfectly, which may result in a defective image being shown directly to the client. Using neural networks to perform defect detection is one way of mitigating this problem. Defect detection using neural networks, the evaluation of neural networks, and the construction of test harnesses for machine learning have each been widely studied. However, there is a lack of research that bridges these topics. The purpose of this study is to investigate the procedures needed to construct a test harness for defect detection by characterizing, designing, and evaluating a reference dataset. Using the design science research methodology, we created and validated datasets containing images with different defects. These were then combined into a reference dataset and included in a test harness. The procedures used to create this reference dataset can be reused to create similar datasets for other domains. The test harness was then evaluated using three binary classifiers with known performance. Test Case Prioritization was the testing methodology used in the test harness to establish the correctness of the networks. The test results verified that the test harness is able to distinguish between adequate and unsuitable neural network-based binary classifiers. However, as only a limited number of defects were included in the test harness, the generalizability of the results may be limited. Furthermore, because the data used in the thesis is confidential, replication of the study by other researchers may be difficult.
Software Engineering, Computer Science, Machine Learning, Image Classification, Thesis