3-D object tracking through the use of a single camera and the motion of a driverless car
Master's thesis
Complex adaptive systems (MPCAS), MSc
Interest in, and development of, partially and fully driverless cars has increased sharply in recent years. To function, these cars must be able to navigate to their destination while avoiding nearby objects. This can be done using simultaneous localisation and mapping (SLAM), the task of building a map of the surrounding objects while keeping track of the car’s position within that map. This thesis investigates the feasibility of using a single camera mounted on a driverless car to perform SLAM on cones detected by the real-time object detection system You Only Look Once (YOLO). Three methods were tested, all of which require a calibrated camera so that horizontal and vertical angles can be determined from pixel positions. The first, the ‘triangulation’ method, relies on the distance travelled and the rotation between two frames being known. The second, the ‘plane projection’ method, is an optimisation problem: it finds the variables that give the lowest error and from these determines the cone distances and the car’s speed. The map of the surrounding cones is moved according to the estimated velocity and rotation of the car so that the car is always at the origin, which allows multiple detections to be combined to improve accuracy. The third, the ‘distance from cone height’ method, uses the size of each cone detection to estimate the approximate distance to that cone, uses these distances to estimate the approximate angle of the camera, and then uses the median angle to make the final distance estimates. The triangulation method was shown to be completely unsuitable for mono-camera use. The plane projection method was shown to be unreliable, likely because of the relatively small number of visible cones and excessive noise in the YOLO detections. The distance from cone height method was the best of the tested methods, as it was simple, fast and quite reliable.
However, this method still had an error approximately 1.4 times larger than that advertised for commercial stereo camera systems.
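The three steps of the ‘distance from cone height’ method can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes an ideal pinhole camera, flat ground, a known cone height and a known camera mounting height, and it reads "angle of the camera" as the downward pitch of the optical axis. All function and parameter names (`rough_distance`, `estimate_distances`, `focal_px`, `principal_row_px`, `camera_height_m`) are hypothetical.

```python
import math
from statistics import median


def rough_distance(box_height_px, focal_px, cone_height_m):
    """Pinhole approximation: an object of known height H that spans
    h pixels lies at roughly f * H / h (f = focal length in pixels)."""
    return focal_px * cone_height_m / box_height_px


def estimate_distances(detections, focal_px, principal_row_px,
                       cone_height_m, camera_height_m):
    """detections: list of (top_row_px, bottom_row_px) bounding-box rows,
    with image rows increasing downwards.

    Returns (pitch_rad, distances_m). The pitch is the assumed downward
    tilt of the optical axis relative to the horizontal.
    """
    ray_angles = []
    pitch_samples = []
    for top_row, bottom_row in detections:
        # Step 1: rough distance from the apparent cone height.
        d = rough_distance(bottom_row - top_row, focal_px, cone_height_m)
        # Angle of the ray to the cone base, measured below the optical axis.
        ray = math.atan2(bottom_row - principal_row_px, focal_px)
        # Angle below the horizontal at which a ground point at distance d
        # would be seen from the camera height.
        ground_angle = math.atan2(camera_height_m, d)
        # Step 2: each cone gives one sample of the camera pitch.
        pitch_samples.append(ground_angle - ray)
        ray_angles.append(ray)

    # The median keeps a few badly sized boxes from skewing the pitch.
    pitch = median(pitch_samples)

    # Step 3: final distances from the ground-plane geometry alone.
    distances = [camera_height_m / math.tan(pitch + ray) for ray in ray_angles]
    return pitch, distances
```

In this sketch the bounding-box height is used only to obtain the pitch; the final distances depend on the row of each cone base, which is presumably less sensitive to noise in the detected box size. A cone whose base ray is at or above the horizon (`pitch + ray <= 0`) would need separate handling.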
SLAM, driverless, autonomous, depth vision, 3D, mono camera