Exploring the feasibility of using ultrasonic sensors and cameras for human gesture recognition to activate trunk opening in vehicles
Type
Master's Thesis
Abstract
The integration of new advanced technologies plays a crucial role in industry, and the automotive sector is no exception. With the introduction of ultrasonic parking sensors and high-resolution cameras in new vehicles, combined with high-performance onboard computing, it is possible to apply machine learning and classical methods to process sensor information in real time. This thesis focuses on recognizing human gestures using the combined information from ultrasonic sensors and visual camera data for functional actuation. In particular, the thesis serves as a feasibility study for using gesture recognition as an input for activating the automatic opening of the trunk. Several approaches to this problem were investigated through literature studies, and the most suitable method was determined to be a combination of neural networks and classical sensor fusion. Two machine learning models are implemented and analyzed for the visual input: one that classifies static images, and one that classifies a series of images to capture information from dynamic movement. A third model is built for the parking-sensor input, which similarly uses a series of measurements over time for classification. Together, these models form a pipeline in which the classical ultrasonic sensory input acts as an indicator for activating the models. The models are evaluated both for binary outputs (gesture or no gesture) and for multi-class outputs (several different gesture classes).
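The gated pipeline described above can be sketched as follows. This is an illustrative outline only, not the thesis implementation: the wake-up distance, class labels, fusion weight, and the stub classifiers (`vision_model`, `ultrasonic_model`) are all assumptions standing in for the trained networks.

```python
# Hypothetical sketch of the gated sensor-fusion pipeline: the ultrasonic
# channel acts as a cheap activation indicator, and the gesture classifiers
# only run once it fires. All constants and model stubs are assumptions.

PRESENCE_RANGE_M = 1.5  # assumed wake-up distance for the ultrasonic gate
CLASSES = ["no_gesture", "kick", "wave"]  # example multi-class labels

def ultrasonic_gate(distances_m):
    """Classical trigger: activate when any sensor sees a nearby object."""
    return min(distances_m) < PRESENCE_RANGE_M

def vision_model(frames):
    """Stand-in for the dynamic vision classifier (series of images).
    Returns one probability per class; a real model is a trained network."""
    return [0.1, 0.8, 0.1]

def ultrasonic_model(echo_series):
    """Stand-in for the time-series classifier on parking-sensor echoes."""
    return [0.3, 0.6, 0.1]

def fuse(p_vision, p_ultra, w_vision=0.7):
    """Simple late fusion: weighted average of the two class distributions."""
    return [w_vision * v + (1 - w_vision) * u
            for v, u in zip(p_vision, p_ultra)]

def classify(distances_m, frames, echo_series):
    """Full pipeline: gate first; fuse and classify only if the gate fires."""
    if not ultrasonic_gate(distances_m):
        return "no_gesture"  # classifiers never run, saving compute
    probs = fuse(vision_model(frames), ultrasonic_model(echo_series))
    return CLASSES[probs.index(max(probs))]
```

For the binary evaluation mentioned in the abstract, the same pipeline collapses to checking whether the predicted class is `"no_gesture"` or anything else.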
Separately, the vision models achieved close-to-perfect test accuracy for both the binary and the multi-class implementations, while the model for the ultrasonic sensors achieved a test accuracy of around 70%. Using sensor fusion, the combined model achieved perfect test accuracy for both the static and the dynamic implementations, demonstrating the feasibility of the proposed solution. However, one should note that these results are based on a small data pool collected during the thesis, and that the data lacks diversity; implementing the solution at a larger scale would likely change the results somewhat. In conclusion, it is possible to reliably use human gesture recognition from ultrasonic and visual data for functional actuation.
Keywords
Human gesture recognition, machine learning, neural networks, sensor fusion