Implementation of a Vision System for an Autonomous Railway Maintenance Vehicle: Track and Object Detection with YOLO, Neural Networks and Region Growing

Master's thesis
Engineering mathematics and computational science (MPENM), MSc
Warnicke, Albin
Jönsson, Jesper
Railway infrastructure is often expensive to maintain. To improve efficiency and lower these costs, the use of autonomous railway vehicles for such maintenance has begun to be explored. A railway vehicle requires several components to achieve complete automation, including systems for navigation and decision-making as well as sensors such as cameras. This project aims to develop the vision system used by an autonomous track trolley under development at Chalmers University of Technology. The proposed vision system detects railway tracks and switches using a region-growing algorithm based on the image intensity gradient. Object detection is performed by a YOLOv4-tiny neural network, developed to detect persons, vehicles, railway signs and signals, road crossings and catenary support poles. The signal and speed sign messages are further classified by additional convolutional neural networks. The vision system is implemented as a ROS node on a single-board computer, an NVIDIA Jetson Nano, and runs in real time at up to 15 FPS. The vision system is accurate and robust enough to be used as a prototype in simple environments. The track that the vehicle is traveling on was detected in 98.4 % of the evaluated video frames, with the sidetracks correctly identified 70-80 % of the time. Several of the considered objects, for example vehicles and road crossings, were detected with 90-100 % accuracy. Other objects, particularly railway switches and incoming tracks, were correctly recognized in only about 60 % of their occurrences. Signals and speed signs were detected with high accuracy. Some features can be improved or added before the vision system can be applied to a complete autonomous railway vehicle. The main limitation of the implemented object detection is the lack of large training datasets. With more available video data, datasets with an increased number of labeled objects and greater diversity could be created.
Utilizing the full capabilities of larger datasets would eventually require the use of more complex neural networks. The current hardware, however, limits the possible methods to simpler algorithms. The track detection algorithm can serve as a base for further improvement, since region growing based on image intensity gradients is not robust enough to handle large variations in lighting and environmental conditions. An approach based on semantic segmentation neural networks is suggested instead to achieve robust track detection.
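To make the idea of region growing on the image intensity gradient concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes a grayscale image stored as a list of rows, a central-difference gradient, 4-connectivity, and a fixed gradient threshold; the names `gradient_magnitude`, `grow_region` and `max_gradient` are hypothetical and chosen only for illustration.

```python
from collections import deque


def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2D list of intensities."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Indices are clamped at the border, which replicates edge pixels.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            grad[y][x] = (gx * gx + gy * gy) ** 0.5
    return grad


def grow_region(img, seed, max_gradient):
    """Grow a region from `seed` (row, col) over low-gradient pixels.

    A neighbor joins the region when its gradient magnitude does not
    exceed `max_gradient` (an assumed, hand-picked threshold), so strong
    intensity edges act as region boundaries. The seed is always included.
    """
    grad = gradient_magnitude(img)
    h, w = len(img), len(img[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        # Visit the 4-connected neighbors of the current pixel.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (
                0 <= ny < h
                and 0 <= nx < w
                and (ny, nx) not in region
                and grad[ny][nx] <= max_gradient
            ):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region


# Synthetic example: a bright vertical stripe on a dark background.
row = [10, 10, 200, 200, 200, 200, 200, 10, 10]
image = [row[:] for _ in range(4)]
stripe = grow_region(image, seed=(0, 4), max_gradient=50)
print(sorted(stripe))
```

In this synthetic example the region stays inside the bright stripe because the strong gradients at its edges block further growth. A fixed threshold of this kind is sensitive to lighting changes, which is consistent with the robustness limitation noted above.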
Autonomous vehicles, Railway, Object detection, Computer vision, Track detection, Machine learning, Artificial neural networks, Convolutional neural networks, YOLO, You Only Look Once, Region growing