Segmented Classification of Traffic Environments Using RGB-D Data: Considering the effect of image resolution and the relevance of artificial data during training
Type
Master's thesis
Abstract
This thesis presents a method, based on previous approaches, for semantic segmentation
that uses color (RGB) and depth images together (RGB-D) in a Convolutional
Neural Network (CNN). To improve the accuracy of the prediction, a fusion module
is proposed that fuses RGB and depth features more efficiently.
Furthermore, it is shown that higher-resolution images improve the accuracy
of the segmentation, especially for thin structures that are far away. The drawback
of increasing the image resolution, however, is that the runtime increases.
The method is tested using both simulated and real-world data. It is concluded
that training the network on artificial data only and then evaluating it on
real-world data does not yield a good result, owing to differences in composition
between the two kinds of data. Thus, using only artificial data during training is
not sufficient. Although the artificial data can be used for pre-training the
network, it is concluded that doing so does not increase the accuracy compared to
training the network on real-world data only.
It is shown that the use of depth images improves the robustness of the segmentation
by a large margin. Finally, it is concluded that for this approach to reach
its full potential, high-accuracy depth images are required.
Subject/keywords
Semantic Segmentation, Convolutional Neural Networks, Deep Neural Networks, Deep Machine Learning, Computer Vision, RGB-D Data