Aerial Imagery Based Position and Heading Estimation
For maritime autonomous vessels in GNSS-challenged environments

Master's thesis in the Master Programme Systems, Control and Mechatronics

ALGOT BERGMAN
TEODOR LINDELL

Department of Mechanics and Maritime Sciences
CHALMERS UNIVERSITY OF TECHNOLOGY
Gothenburg, Sweden 2023
www.chalmers.se

Master's thesis 2023

Aerial Imagery Based Position and Heading Estimation
For maritime autonomous vessels in GNSS-challenged environments

ALGOT BERGMAN
TEODOR LINDELL

© ALGOT BERGMAN, TEODOR LINDELL, 2023.

Supervisor: Erik Lund, CPAC Systems AB
Supervisor: Valter Strömberg, CPAC Systems AB
Examiner: Peter Forsberg, Department of Mechanics and Maritime Sciences

Master's Thesis 2023
Department of Mechanics and Maritime Sciences
Division of Vehicle Engineering and Autonomous Systems
Chalmers University of Technology
SE-412 96 Gothenburg
Telephone +46 31 772 1000

Cover: Matlab visualization of LiDAR data projected on an aerial image in a maritime environment.

Typeset in LaTeX, template by Kyriaki Antoniadou-Plytaria
Printed by Chalmers Reproservice
Gothenburg, Sweden 2023

Abstract

One of the most important parts of an autonomous vehicle is the estimation of its position and orientation, which often comes from a GNSS sensor. However, the performance degrades when the GNSS signals are obstructed. A popular way to obtain a precise estimate is therefore to fuse information from an IMU, GNSS and LiDARs. The position and orientation can be derived from the LiDAR measurements in several different ways. This thesis presents a method that estimates position and orientation by matching LiDAR measurements to an aerial image. The method consists of projecting the LiDAR measurements onto the aerial image, filtering both the measurements and the image, and then calculating the cross-correlation between them. The orientation is estimated via stochastic optimization, which also finds the maximum correlation used to update the position. The result is fused in an EKF with information from an IMU and a GNSS sensor to obtain a more precise estimate. The thesis concludes that the method estimates the position and heading well in a GNSS-challenged environment.

Keywords: Extended Kalman filter, LiDAR, aerial imagery, particle swarm optimization, computer vision, point cloud, localization, cross-correlation.

Acknowledgements

We would like to thank all the people who have contributed to the completion of this thesis. Firstly, we would like to thank our examiner Peter Forsberg for all the valuable suggestions and constructive criticism. The weekly meetings kept us on the right path and ensured the successful completion of this thesis. We would also like to extend special thanks to our supervisors, Erik Lund and Valter Strömberg. Their guidance, support, and expertise have been very valuable.
Lastly, we would like to thank CPAC Systems AB for the opportunity to do this thesis and for providing us with the equipment to collect real-world data.

Algot Bergman & Teodor Lindell, Gothenburg, June 2023

Thesis advisor: Erik Lund, CPAC Systems AB
Thesis advisor: Valter Strömberg, CPAC Systems AB
Thesis examiner: Peter Forsberg, Department of Mechanics and Maritime Sciences

List of Acronyms

Below is the list of acronyms that have been used throughout this thesis, in alphabetical order:

2D Two dimensions
3D Three dimensions
EKF Extended Kalman Filter
FFT Fast Fourier Transform
FoV Field of View
GNSS Global Navigation Satellite System
GPS Global Positioning System
ICP Iterative Closest Point
IMU Inertial Measurement Unit
LiDAR Light Detection and Ranging
MEMS Micro-Electro-Mechanical Systems
NDT Normal Distributions Transform
PSO Particle Swarm Optimization
RANSAC Random Sample Consensus
RMSE Root Mean Square Error
SBAS Satellite-Based Augmentation Systems

Nomenclature

Below is the nomenclature of indices, parameters and variables that have been used throughout this thesis.

Indices

i, j Indices
m, n Indices for image width and height
k Index for discrete time step

Parameters

f Focal length
s Skew ratio
px Translation parameter in x direction
py Translation parameter in y direction
γ Aspect ratio
M, P Image pixel width
N, Q Image pixel height
Nest Number of estimations
δ Filter computation time step
g Gravity vector
δIMU IMU sampling frequency
α1, α2, α3 Tuning parameters
Nm Maximum number of LiDAR points

Variables

T Transformation matrix
Rrot Rotation matrix
Rrot,x Rotation matrix around the x-axis
Rrot,y Rotation matrix around the y-axis
Rrot,z Rotation matrix around the z-axis
t Translation vector
Pproj Projection matrix
X Set of 3D coordinates
X̃ Set of 3D homogeneous coordinates
X̃′ Set of 2D homogeneous coordinates
KImg Intrinsic camera matrix
f(t), g(t) Time-dependent functions
F Fourier transform
C, D 2D images
Bw Binary image
xk State at time step k
x0 Initial state
P0 Initial covariance
uk Control vector at time step k
f(xk−1, uk) Motion model at time step k
qk−1 Gaussian process noise at time step k − 1
yk Sensor measurements at time step k
rk Gaussian measurement noise at time step k
h(xk) Measurement model at time step k
Hk Jacobian of the measurement model at time step k
Fk Jacobian of the motion model at time step k
x̂k|k−1 Predicted mean at time step k
Pk|k−1 Predicted covariance at time step k
Qk Covariance of the process noise at time step k
Pk|k Covariance at time step k
x̂k|k Mean at time step k
RGNSS,k GNSS measurement covariance matrix at time step k
RLiDAR,k LiDAR measurement covariance matrix at time step k
RSensor,k Measurement covariance matrix at time step k
xest,i Estimated x position in 2D space at estimation i
yest,i Estimated y position in 2D space at estimation i
xtrue,i True x position in 2D space at estimation i
ytrue,i True y position in 2D space at estimation i
σGyro Standard deviation of the IMU gyroscope
σAcc Standard deviation of the IMU accelerometer
eGNSS,k RMSE of the GNSS measurements at time step k
σGNSS Latitude,k Standard deviation of the GNSS latitude measurement at time step k
σGNSS Longitude,k Standard deviation of the GNSS longitude measurement at time step k
σGNSS Heading,k Standard deviation of the GNSS heading measurement at time step k
XL1,k – XL5,k Individual LiDAR point clouds at time step k
Xk Collective LiDAR point cloud at time step k
X̃k Homogeneous collective LiDAR point cloud at time step k
Pproj,k Projection matrix at time step k
ψk Heading at time step k
xk Horizontal position at time step k
yk Vertical position at time step k
vsurge,k Velocity in the vessel's forward direction at time step k
vsway,k Velocity in the vessel's orthogonal direction at time step k
asurge,k Acceleration in the vessel's forward direction at time step k
asway,k Acceleration in the vessel's orthogonal direction at time step k
ψ̇k Vessel's yaw rate at time step k
ax IMU acceleration along its x-axis at time step k
ay IMU acceleration along its y-axis at time step k
ωz Angular velocity around the IMU's z-axis at time step k
Qa State transition matrix
H Jacobian of the measurement model
Nk Number of LiDAR points at time step k

Contents

List of Acronyms
Nomenclature
List of Figures
List of Tables
1 Introduction
1.1 Aim
1.2 Related work
1.2.1 Pose estimation in known environment
1.2.2 Pose estimation in unknown environment
1.2.3 Feature matching to aerial view
1.2.4 Localization with 2D vector maps
1.3 Research questions
1.4 Limitations
2 Theory
2.1 LiDAR properties
2.2 IMU properties
2.3 GNSS properties
2.4 Transformation matrices
2.4.1 3D transformations
2.4.2 2D projection
2.5 Canny edge detection
2.6 RANSAC
2.7 Cross-correlation
2.8 FFT-based cross-correlation
2.9 Particle swarm optimization
2.10 Extended Kalman filter
2.11 Evaluation
3 Methods
3.1 Data collection
3.1.1 Vessel
3.1.2 IMU
3.1.3 GNSS
3.1.4 LiDAR
3.1.5 Test environment
3.2 General approach
3.3 Design and implementation of positioning algorithm
3.3.1 Aerial images
3.3.2 Project LiDAR point cloud
3.3.3 Extract features from aerial image
3.3.4 Extract features from LiDAR point cloud
3.3.5 Point cloud aligning
3.4 Kalman filtering
3.4.1 Prediction step
3.4.2 Update step
4 Results
4.1 Performance
4.2 Positioning algorithm
5 Discussion
5.1 Estimated position and heading performance
5.1.1 Case 1
5.1.2 Case 2
5.1.3 Case 3
5.2 Supplementary validation of positioning algorithm
5.3 Extended Kalman filtering
5.4 Data collection
5.5 Real-time applications
6 Conclusion
6.1 Future work
A Appendix
B Appendix
C Appendix

List of Figures

2.1 Input image (left) and the result of the Canny edge detection algorithm (right).
3.1 Visualisation of LiDAR point cloud.
3.2 Aerial images of two locations at the port of Krossholmen. The first location is illustrated in (a) together with the trajectory of the two ground truth paths from the GNSS data. The blue trajectory represents case 1 and the red represents case 2. The second location is illustrated in (b) together with the ground truth path/location for case 3.
3.3 A simplified version of the general approach for the position and heading estimation.
3.4 Aerial image from noisy GNSS signal with projected LiDAR point cloud as blue dots.
3.5 The most frequently occurring coordinate pairs (yellow markers) selected with Algorithm 3 applied on all LiDAR points (blue markers).
3.6 LiDAR point cloud before RANSAC (yellow marks) and LiDAR point cloud after RANSAC (red marks). The dashed blue lines show each grid RANSAC is performed in.
3.7 The grey markers to the left illustrate the features from the aerial image Bw. The red markers to the right illustrate the reduced LiDAR point cloud Xreduced.
3.8 Visualisation of the vessel's degrees of freedom and coordinate frame.
4.1 True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 1. The GNSS data without noise is used as prior.
4.2 True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 1. The GNSS data without noise is used as prior.
4.3 True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 2. The GNSS data without noise is used as prior.
4.4 True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 2. The GNSS data without noise is used as prior.
4.5 True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 3. The GNSS data without noise is used as prior.
4.6 True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 3. The GNSS data without noise is used as prior.
4.7 An instance from case 1, the first subfigure (left) shows the new alignment of the point cloud (green) after finding the best transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO that has found the global best rotation angle with the objective function f(x) as the negative cross-correlation between extracted features and the rotated point cloud. Each particle, for the last iteration, is denoted by red marks, while the green mark indicates the global best position that has been found.
4.8 An instance from case 2, the first subfigure (left) shows the new alignment of the point cloud (green) after finding a transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO that has found the global best rotation angle (green mark) with the objective function f(x) as the negative cross-correlation between extracted features and the rotated point cloud.
4.9 Same instance as in Figure 4.8, the first subfigure (left) shows the new alignment of the point cloud (green) after finding another valid transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO that has found a local best rotation angle (green mark) with the objective function f(x) as the negative cross-correlation between extracted features and the rotated point cloud.
4.10 An instance from case 3, the first subfigure (left) shows the new alignment of the point cloud (green) after finding the best transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO that has found a global best rotation angle (green mark) with the objective function f(x) as the negative cross-correlation between extracted features and the rotated point cloud.
B.1 Visualisation of the vessel's coordinate system.
B.2 Visualisation of IMU position and its coordinate system.
B.3 Visualisation of GNSS position and its coordinate system.
B.4 Visualisation of LiDAR positions and coordinate systems; blue x-axis, red y-axis and yellow z-axis.

List of Tables

3.1 LiDAR specifications.
3.2 Specifications of the computer used for testing.
4.1 Position RMSE and max error for measurement signal in the first scenario.
4.2 Position RMSE and max error for measurement signal in the second scenario.
4.3 RMSE and max error for estimation from run in Figure 4.1.
4.4 RMSE and max error for estimation from run in Figure 4.2.
4.5 RMSE and max error for estimation from run in Figure 4.3.
4.6 RMSE and max error for estimation from run in Figure 4.4.
4.7 RMSE and max error for estimation from run in Figure 4.5.
4.8 RMSE and max error for estimation from run in Figure 4.6.

1 Introduction

An autonomous vehicle needs an accurate and reliable position and heading estimation for navigation, regardless of whether it operates on land or at sea. The most common way to obtain a position estimate is to use a Global Navigation Satellite System (GNSS) such as the Global Positioning System (GPS). GNSS offers high positioning accuracy worldwide and can, in certain applications, almost pinpoint the exact position. However, to achieve this precision, a clear view of the sky and preferably no large reflecting surfaces nearby are required. Surrounding objects often reflect the satellite signals, making the estimated pose erroneous and sporadic. One solution is to use a multitude of other positioning methods and fuse these into a more reliable and robust pose.

One positioning method is the use of Inertial Measurement Units (IMUs), which consist of accelerometers, gyroscopes and sometimes magnetometers. The device combines these sensors to measure and provide information about the acceleration and angular velocity of a body. However, IMUs do not provide an absolute position in space, as they rely on dead reckoning, which accumulates errors over time. Another method that is commonly used in positioning and mapping applications is Light Detection and Ranging (LiDAR). LiDAR is a type of remote-sensing technology that uses light to measure distances to objects and produce high-resolution maps of the environment, alternatively referred to as point clouds.

Map-based vehicle localization techniques have grown in connection with the development of autonomous vehicles and the need for precise positioning. Map-based refers to an already existing map of the environment that is used together with different sensor measurements to determine the precise location of a vehicle. This pre-existing map is commonly made up of a high-definition 3D point cloud. When it comes to the marine environment, however, satellite and aerial images with detailed features are abundant and easily accessible. For example, a dock in a harbor is often clearly visible in an aerial image and could be used for mapping. The development of camera technology, computer processing power, and Unmanned Aerial Systems (UAS) has increased the capabilities of airborne imagery. In situations where the GNSS signals become unavailable or weak, data from LiDARs and an IMU can be fused with information from aerial imagery to achieve a more reliable and robust pose estimation.

1.1 Aim

The aim of this thesis is to develop a reliable and robust position and heading estimation system for a vessel in GNSS-challenged marine environments. The estimation should utilize LiDAR, GNSS and IMU data together with aerial imagery.

1.2 Related work

In this section, related work on estimating the position and heading of vehicles in marine and terrestrial environments is presented. Different approaches to how the sensor measurements can be used are covered.
1.2.1 Pose estimation in known environment

Multi-sensor fusion is a common approach to obtain a more accurate and reliable estimate of a vehicle's position and heading. Most often, a combination of the sensors LiDAR, GNSS, IMU, and cameras is used. These are the sensors used in [1], together with a 3D map of the area of interest. The LiDAR and camera are used together to identify points and triangulate the position of the vehicle. This information is then fused with an Inertial Navigation System (INS) and GNSS in an Ensemble Kalman Filter (EnKF) to estimate the position of the vehicle. The system showed promising results with a mean translational error of about 0.4 m.

A different approach to fusing sensor readings is presented in [2], which does not use cameras but achieves a more accurate pose estimation with a mean error of 5-10 cm. The filtering is performed by IMU-based state equations in an error-state Kalman filter with LiDAR and GNSS as the update step. The LiDAR measures the position and heading angle of the vehicle by adaptively combining intensity and altitude cues and matching measurements to a pre-built LiDAR map, while the GNSS measures only the position.

Pose localization in a marine environment is performed in [3]. The sensors used in this work consisted of LiDAR, IMU and GNSS, which were shown to give low RMSE values. The state estimation uses boat modeling with a constant velocity model together with the velocity readings from the IMU. GNSS, LiDAR and a pre-built LiDAR map were used for the state measurements. For fusing the sensor outputs, the study compares three different approaches: a generic Particle Filter (PF), PF with Unscented Kalman Filter (UKF) and PF with Iterative Closest Point (ICP). The generic PF proved to be the fastest, while the PF with ICP was better suited for precision.

1.2.2 Pose estimation in unknown environment

One problem with the methods above is that they depend on a pre-built LiDAR map to match the LiDAR measurements against, which will not work in unknown environments. Simultaneous Localization and Mapping (SLAM) [4] can be used to map an unknown environment. SLAM can be suitable for online applications if the physical systems can handle the computational load. An implementation of SLAM in a marine environment was done in [5]. It aims to reduce vertical drift using hardware and software approaches while identifying the minimum conditions required for SLAM. The baseline performance was built on the LiDAR-only normal distributions transform (NDT) and ICP registration algorithms. The study uses LiDAR, IMU, and GNSS data to fuse 15 states in an Unscented Kalman Filter (UKF), which estimates the sensor position and thereby improves mapping. Both approaches significantly reduce deviation compared to LiDAR-only methods.

An alternative method involves Visual Odometry (VO), which does not incorporate the use of a map. VO is a technique used to estimate the motion of a vehicle by tracking features in sequential images captured by a camera. The drawback of VO is that the error grows exponentially if no measurement updates are given. In [6], VO is fused with GNSS and INS in an error-state Extended Kalman Filter (EKF). The results show that the system provides more accurate and continuous positioning than GNSS alone in GNSS-challenged environments.

A more suitable and reliable localization method for unknown environments should strive to find the absolute position.
This would require the use of a map. However, pre-built LiDAR maps will not be available for unknown environments, and SLAM approaches can become too computationally heavy for online implementations. Ideally, a localization method that matches measurements to one arbitrary map, working in all situations, would be desired.

1.2.3 Feature matching to aerial view

Aerial imagery can act as a map for localization. A lot of information can be extracted from an aerial image, making it usable in many different scenarios. In [7], vehicle-mounted cameras and an aerial view are used to match visual features to each other using ICP. The proposed algorithm uses lane markings as the visual features. The features were detected with a Canny edge detector in the camera images and with a neural network in the aerial view. The evaluation included both position and heading estimation, where the method produced results within 0.25 m of the reference position and 1° of the reference heading. In [8], a method is proposed for finding road intersections in real time by the use of LiDAR and aerial image matching. The LiDAR scans are used in an intersection detection algorithm. From the aerial image, an intersection template is extracted and used as the prior for the intersection detection algorithm. The method shows that intersections can effectively be detected at a low false positive rate.

1.2.4 Localization with 2D vector maps

A 2D vector map is a digital representation of geographic information as a collection of points, lines and polygon shapes in two-dimensional space. It provides a compact and structured representation of the environment. In [9], 2D building footprints are matched with LiDAR data to achieve an estimated pose. The proposed method uses the Normal Distributions Transform (NDT) to match a LiDAR scan with a 2D vector map. A fast filtering method is used to extract aligned points and filter out outliers. The performance shows promising results, with an obtained localization accuracy of 0.21 m. In [10] and [11], a new map structure called a multilayer 2D vector map is used for localization. The results show that the new map provides more features for map matching and has less uncertainty, leading to improved accuracy. However, a 2D vector map provides a simplified representation of objects. For example, the outline of a large building can be represented with multiple straight lines that could be used to match the LiDAR data, but a smaller object like a dock in a harbor may be represented by only a single straight line. This discards a lot of useful information about the dock, such as its width, length and shape, which makes localization with 2D vector maps less suitable for this thesis.

1.3 Research questions

The following research questions have been identified for further investigation:
• How does the integration of aerial imagery improve the robustness of a system in case of characteristic GNSS signal disturbance?
• Can LiDAR data be effectively matched with aerial imagery to estimate the position and heading of a marine vessel?
• How do objects that are only present in the aerial imagery affect the system?
• How can uncertainty and errors be handled and propagated for more robust and reliable results?

1.4 Limitations

The implementation will be done in the Matlab programming language on field data, collected by a vessel in the port of Krossholmen, Volvo Penta's marine leisure testing facility in Gothenburg, Sweden.
Therefore, the evaluation of an online implementation will be done by analyzing the computational time required by the offline implementation.

2 Theory

In this chapter, the theory essential to understand the methods used in this thesis is presented. First, the different sensors used in this work and their properties are introduced, followed by how the sensor data is transformed and which algorithms are used to yield an estimate of the position and heading. Finally, the filtering and evaluation processes are presented.

2.1 LiDAR properties

A LiDAR is a distance-measuring sensor [12]. The sensor calculates distance by emitting light signals and measuring the time it takes for each light signal to return. The distance is measured in a range of vertical and horizontal directions. Depending on what type of LiDAR is used, the maximum and minimum measurable distances differ. Generally, measured distances range from a few meters up to around 200 meters. By emitting several rays of light over a span of vertical and horizontal directions, referred to as the Field of View (FoV), a cluster of points is generated. This cluster of points is often referred to as a point cloud. Point clouds are generated at the sample rate of the LiDAR, which is usually between 10 Hz and 20 Hz.

LiDARs can be categorized into two main types, mechanical and solid-state. A mechanical LiDAR has moving parts and a wider horizontal and vertical FoV. A solid-state LiDAR has a higher resolution and no moving parts, which makes it less likely to break because of vibrations, but it has a lower FoV coverage.

2.2 IMU properties

Along three mutually perpendicular axes, the gyroscope of an IMU measures angular velocity, the accelerometer measures linear acceleration and the magnetometer measures magnetic fields [13]. There exist several different types of gyroscopes which measure angular velocity [14]. One type is the Micro-Electro-Mechanical-Systems (MEMS) gyroscope, which generally consists of springs, dampers and a resonating mass with a capacitance. As the resonating mass moves, the capacitance inside the gyroscope changes. This change is proportional to the displacement, which in turn is proportional to the angular velocity.

MEMS accelerometers work much like MEMS gyroscopes in that the measurement is computed from a change in capacitance [15]. The MEMS accelerometer consists of a movable mass and fixed electrodes. When a force is applied to the accelerometer, the mass is displaced and the capacitance changes proportionally to the displacement. The change in capacitance is used to calculate the acceleration. When an accelerometer is aligned with the gravity vector, −1 g will be measured. Some types of IMUs combine the measurements from the gyroscopes and accelerometers to yield the orientation of the IMU and filter out the gravity vector. They can also sometimes use the magnetometer for a better estimate of the orientation. For best accuracy, accelerometers should be placed in the center of rotation so that rotational motion is not measured as linear acceleration. In marine applications, this center of rotation is often referred to as the center of buoyancy.

2.3 GNSS properties

The GNSS sensor receives radio signals from satellites and, via triangulation, measures its latitude and longitude [16]-[17].
The sensor measures the time when a satellite signal is received and compares it with the time when it was transmitted. Since the transmitted signal travels at a constant speed, the distance to the satellite can be computed. When multiple satellites transmit signals to the sensor, multiple distances are known and the location can be computed. The orientation of the sensor can be estimated if the sensor is equipped with two signal-receiving antennas. The GNSS sensor also yields velocity, either by differentiating the distance with respect to the time lapse between two position measurements or by using Doppler measurements related to the user-satellite motion [18].

In certain areas around the globe, the precision of GNSS measurements is increased by the use of augmentation systems such as Satellite-Based Augmentation Systems (SBAS) [16]. SBAS consists of a network of ground reference stations that monitor the GNSS signals and generate correction data. The data is processed, a message is constructed, and the message is transmitted to geostationary satellites in the constellation, which broadcast it to the GNSS sensor. The sensor processes the message to get information about the signals and corrects its measurement accordingly. Some GNSS sensors also estimate the standard deviation of the measured heading and position in real time. The standard deviation is a good indicator of how well the position and heading have been measured.

2.4 Transformation matrices

The material presented in this section is based on [19]. Transformation matrices are used to describe a transformation between two coordinate frames. The theory covered includes transformations in 3D space and the transformation from 3D to 2D space, also known as projection.

2.4.1 3D transformations

A 3D transformation matrix consists of a translation vector t and a rotation matrix R_{rot} as

T = \begin{bmatrix} R_{rot} & t \\ 0^\top & 1 \end{bmatrix}. (2.1)

The rotation matrix for rotating one coordinate frame to another is composed of rotations about the coordinate frame's separate axes. A rotation about each axis with a given angle \Theta is defined by

R_{rot,x}(\Theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\Theta & -\sin\Theta \\ 0 & \sin\Theta & \cos\Theta \end{bmatrix}, (2.2)

R_{rot,y}(\Theta) = \begin{bmatrix} \cos\Theta & 0 & \sin\Theta \\ 0 & 1 & 0 \\ -\sin\Theta & 0 & \cos\Theta \end{bmatrix}, (2.3)

R_{rot,z}(\Theta) = \begin{bmatrix} \cos\Theta & -\sin\Theta & 0 \\ \sin\Theta & \cos\Theta & 0 \\ 0 & 0 & 1 \end{bmatrix}. (2.4)

The rotation matrix is then defined by

R_{rot} = R_{rot,x}(\Theta_1) R_{rot,y}(\Theta_2) R_{rot,z}(\Theta_3), (2.5)

where \Theta_1, \Theta_2 and \Theta_3 are the angles of rotation about each axis. The transformation of a set of 3D coordinates, denoted X_s, to a base frame begins by converting X_s to homogeneous coordinates as

\tilde{X}_s = \begin{bmatrix} X_s \\ 1 \end{bmatrix}. (2.6)

The transformation is then applied as shown in equation (2.7),

\tilde{X}_B = T^s_B \tilde{X}_s, (2.7)

where T^s_B is the transformation from the coordinate frame of X_s to the base frame and \tilde{X}_B is the homogeneous transformed coordinates. To convert \tilde{X}_B to Cartesian coordinates X_B, \tilde{X}_B is divided by its last row and then the last row is discarded.
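As a minimal Matlab sketch of equations (2.1)-(2.7), the following builds a rigid transformation from example angles and a translation (the numeric values are placeholders, not values used in this work) and applies it to a 3×N point set:

% Build a homogeneous transformation from rotation angles and a translation
% and apply it to a 3xN point set, equations (2.1)-(2.7).
theta = deg2rad([5 -2 30]);            % example rotation angles (assumed values)
t     = [1.5; -0.3; 2.0];              % example translation vector (assumed values)

Rx = [1 0 0; 0 cos(theta(1)) -sin(theta(1)); 0 sin(theta(1)) cos(theta(1))];
Ry = [cos(theta(2)) 0 sin(theta(2)); 0 1 0; -sin(theta(2)) 0 cos(theta(2))];
Rz = [cos(theta(3)) -sin(theta(3)) 0; sin(theta(3)) cos(theta(3)) 0; 0 0 1];
Rrot = Rx*Ry*Rz;                       % equation (2.5)

T = [Rrot t; 0 0 0 1];                 % equation (2.1)

Xs  = rand(3, 100);                    % placeholder 3xN point set
Xst = [Xs; ones(1, size(Xs, 2))];      % homogeneous coordinates, equation (2.6)
XBt = T * Xst;                         % transform to the base frame, equation (2.7)
XB  = XBt(1:3, :) ./ XBt(4, :);        % back to Cartesian (the division is redundant
                                       % for rigid transforms, where the last row stays 1)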
2.4.2 2D projection

To transform a view from 3D into 2D, the transformation can be done via a projection matrix [20]. Applying a projection onto a set of 3D coordinates such as X_s, in homogeneous form \tilde{X}_s, yields a homogeneous 2D subspace as

\tilde{X}'_s = P_{proj} \tilde{X}_s, (2.8)

where \tilde{X}'_s is the set of coordinates projected to 2D and P_{proj} is the projection matrix. P_{proj} is defined by the rotation matrix R_{rot}, the translation vector t and the camera matrix K_{Img} as equation (2.9) shows,

P_{proj} = K_{Img} \begin{bmatrix} R_{rot} & t \end{bmatrix}. (2.9)

The intrinsic camera matrix K_{Img} is defined by the specific properties of the second view. It is used when transforming a set of points onto images and is defined as

K_{Img} = \begin{bmatrix} \gamma f & s & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{bmatrix}, (2.10)

where f is referred to as the focal length, which corresponds to how the view should be scaled, s is the skew ratio, p_x and p_y are translation parameters, and \gamma is the aspect ratio. The parameters of the intrinsic camera matrix K_{Img} can be obtained from the resolution and size of the image.

2.5 Canny edge detection

The Canny edge detector is an edge detection method developed by John F. Canny in 1986 [21]. The purpose of edge detection is to reduce the amount of data to be processed while preserving useful structural information about object boundaries. The process of Canny edge detection involves smoothing the image, detecting areas with a gradient change and connecting those areas to form edges. An example of Canny edge detection on an aerial image can be seen in Figure 2.1.

Figure 2.1: Input image (left) and the result of the Canny edge detection algorithm (right).

2.6 RANSAC

Random Sample Consensus (RANSAC), first presented in [22], is an iterative algorithm that estimates the parameters of a mathematical model from a set of data points. The basic idea of RANSAC involves a random selection of a subset of points as a representation of the model, followed by an evaluation. The evaluation is done by selecting a certain predefined threshold and comparing the number of inliers and outliers among the remaining data points. The procedure is reiterated until a good approximation of the model, with sufficiently many data points classified as inliers, is obtained. RANSAC is commonly used to fit lines in two dimensions.

2.7 Cross-correlation

Cross-correlation is commonly used in image processing to detect similarities between two images. The 2D cross-correlation between two matrices D of size M \times N and G of size P \times Q is a matrix C of size (M+P-1) \times (N+Q-1). The cross-correlation C is given by

C(o, l) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} D(m, n)\, \bar{G}(m - o, n - l), (2.11)

where

-(P-1) \leq o \leq M-1, \quad -(Q-1) \leq l \leq N-1, (2.12)

and \bar{G} denotes the complex conjugate of G [23]. The mathematical expression can be described as the matrix G being slid across the matrix D while the correlation is calculated. The operation computes the sum of element-wise products between D and a shifted version of G at each position of D. This generates a measure of similarity between the two matrices at each position.

2.8 FFT-based cross-correlation

Calculating the cross-correlation directly is usually computationally demanding. However, multiplication of fast Fourier transformed (FFT) signals is a common and efficient way to solve this problem. First and foremost, the cross-correlation "⋆" of two functions f(t) and g(t) is equivalent to

f(t) \star g(t) = \overline{f(-t)} * g(t), (2.13)

where "*" is the convolution of the functions. Also, it is known that

\mathcal{F}\{\overline{f(-t)}\} = \overline{\mathcal{F}\{f(t)\}}, (2.14)

where \mathcal{F} is the Fourier transform. Utilizing this, equation (2.13) becomes

\mathcal{F}\{f(t) \star g(t)\} = \overline{\mathcal{F}\{f(t)\}} \cdot \mathcal{F}\{g(t)\}. (2.15)

To be able to find the peak value of the cross-correlation, the expression needs to be transformed back with the inverse Fourier transform,

f(t) \star g(t) = \mathcal{F}^{-1}\big\{\overline{\mathcal{F}\{f(t)\}} \cdot \mathcal{F}\{g(t)\}\big\}. (2.16)

Referring back to Section 2.7, by replacing the functions f(t) and g(t) with G and D, the cross-correlation C becomes

C(o, l) = \mathrm{FFT}^{-1}\big\{\overline{\mathrm{FFT}\{G\}} \odot \mathrm{FFT}\{D\}\big\}, (2.17)

where the equation is written in terms of fast Fourier transforms and \odot denotes element-wise multiplication. Equation (2.17) illustrates the cross-correlation theorem [24].
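A minimal Matlab sketch of equation (2.17) is given below; the matrices D and G are placeholders, and zero-padding to the full size (M+P−1) × (N+Q−1) avoids circular wrap-around. For real matrices, correlating with G equals convolving with G rotated 180 degrees, which sidesteps index bookkeeping for the conjugated spectrum. The result is checked against the direct spatial-domain routine xcorr2 from the Signal Processing Toolbox:

% FFT-based 2D cross-correlation, equation (2.17), for real-valued matrices.
D = rand(64, 64);                          % placeholder "image" matrix
G = rand(16, 16);                          % placeholder template matrix
[M, N] = size(D); [P, Q] = size(G);
rows = M + P - 1; cols = N + Q - 1;        % full size; zero-padding avoids wrap-around

C = real(ifft2(fft2(D, rows, cols) .* fft2(rot90(G, 2), rows, cols)));

% The peak location gives the translation that best aligns G with D.
[~, idx] = max(C(:));
[peakRow, peakCol] = ind2sub(size(C), idx);
rowShift = peakRow - P;                    % shift of G relative to D, in pixels
colShift = peakCol - Q;

% Sanity check against the direct (much slower) spatial-domain version:
Cref = xcorr2(D, G);
assert(max(abs(C(:) - Cref(:))) < 1e-8);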
2.9 Particle swarm optimization

Particle swarm optimization (PSO) is a stochastic optimization algorithm proposed by Kennedy and Eberhart in 1995 [25]. PSO captures the ability of a swarm to collectively and efficiently search a space for a specific goal. The swarm in PSO is built up of an arbitrary number of particles. Each particle is associated with both a position and a velocity, as well as a method for determining the changes in velocity depending on the performance of the particle itself and that of the other particles. The algorithm is presented below [26].

Algorithm 1 Basic particle swarm optimization algorithm
1. Initialize the positions x_i and velocities v_i of each particle p_i, i = 1, ..., N. The second subscript is j = 1, ..., n, where n is the dimensionality of the problem.
   1.1 x_ij = x_min + r(x_max − x_min)
   1.2 v_ij = (α/Δt)(−(x_max − x_min)/2 + r(x_max − x_min))
2. Evaluate each particle in the swarm by computing f(x_i).
3. Update the best position of each particle x_i^pb and the global best position x^sb.
   3.1 if f(x_i) < f(x_i^pb) then x_i^pb ← x_i
   3.2 if f(x_i) < f(x^sb) then x^sb ← x_i
4. Use the current values of x_i^pb and x^sb to update the particle velocities and positions.
   4.1 v_ij ← w v_ij + c_1 q (x_ij^pb − x_ij)/Δt + c_2 r (x_j^sb − x_ij)/Δt
   4.2 Restrict velocities to maintain the coherence of the swarm, |v_ij| < v_max
   4.3 x_ij ← x_ij + v_ij Δt
5. Return to step 2 until the termination criterion has been reached.

In Algorithm 1, q and r are uniform random numbers in the range [0, 1], α is a constant in the range [0, 1] and Δt is the time step length. The goal of the algorithm is to minimize the objective function f(x). The parameters c_1 and c_2 are often referred to as the cognitive component and the social component. c_1 determines how strongly each particle is drawn towards its individual best position, while c_2 guides particles to explore new areas by adjusting their velocity vectors based on the global best position found by any particle in the swarm. w is referred to as the inertia weight and determines how much the particle prioritizes exploration (w > 1) or exploitation (w < 1).
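A compact Matlab sketch of Algorithm 1 for the one-dimensional case (matching how PSO is later used over a single rotation angle) is given below. The objective function and all parameter values are placeholders, not the configuration used in this thesis:

% Minimal particle swarm optimization (Algorithm 1), one-dimensional case.
f     = @(x) (x - 1.7).^2 + 0.5*sin(5*x);   % placeholder objective to minimize
N     = 10;  iters = 6;                     % swarm size and number of iterations
xmin  = -5;  xmax  = 5;                     % search space
w     = 0.4; c1 = 2; c2 = 2;                % inertia, cognitive and social weights
dt    = 1;   alpha = 1; vmax = 1;

x  = xmin + rand(N,1)*(xmax - xmin);                          % step 1.1
v  = alpha/dt*(-(xmax - xmin)/2 + rand(N,1)*(xmax - xmin));   % step 1.2
xpb = x;  fpb = f(x);                       % personal bests
[fsb, i0] = min(fpb);  xsb = x(i0);         % global best

for it = 1:iters
    fx  = f(x);                             % step 2: evaluate the swarm
    upd = fx < fpb;                         % step 3.1: update personal bests
    xpb(upd) = x(upd);  fpb(upd) = fx(upd);
    [fmin, imin] = min(fpb);                % step 3.2: update global best
    if fmin < fsb, fsb = fmin; xsb = xpb(imin); end
    q = rand(N,1); r = rand(N,1);
    v = w*v + c1*q.*(xpb - x)/dt + c2*r.*(xsb - x)/dt;   % step 4.1
    v = max(min(v, vmax), -vmax);           % step 4.2: velocity clamp
    x = x + v*dt;                           % step 4.3: move particles
end
% xsb now holds the best solution found, fsb its objective value.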
2.10 Extended Kalman filter

To obtain a more precise estimate of position and heading, the measurements from the sensors can be fused and filtered via Bayesian filtering [27], a recursive algorithm. For linear systems with Gaussian measurements, Bayesian filtering yields the optimal filtering solution by computing the joint posterior distribution of the states given the current and previous measurements. The algorithm involves a prediction step followed by an update step. In the prediction step, an estimate is formed from the previous state and a motion model, along with their uncertainties. In the update step, the filter computes the likelihood of the next measurement given the predicted state. The likelihood is then combined with the predicted state to obtain an estimate. For linear problems, optimal filtering is implemented by the Kalman Filter (KF). For non-linear problems with non-Gaussian measurements, the Extended Kalman Filter (EKF) can be used. The EKF model can be written as

x_k = f(x_{k-1}, u_k) + q_{k-1}, (2.18)
y_k = h(x_k) + r_k, (2.19)

where x_k \in \mathbb{R}^n is the vessel's state, y_k \in \mathbb{R}^m is the sensor measurement, u_k is the control signal, q_{k-1} is the Gaussian process noise, r_k is the Gaussian measurement noise, f(x_{k-1}, u_k) is the motion model function and h(x_k) is the measurement model function. The subscript k denotes the current time instance. The idea of the EKF is to utilize Taylor series approximations to form Gaussian approximations of the filtering densities. The Taylor expansions of the motion and measurement models give the Jacobians

F_k = \frac{\partial f}{\partial x}\bigg|_{\hat{x}_{k-1|k-1}}, (2.20)
H_k = \frac{\partial h}{\partial x}\bigg|_{\hat{x}_{k|k-1}}. (2.21)

The predicted mean \hat{x}_{k|k-1} and the predicted covariance P_{k|k-1} are calculated by the prediction step of the EKF as

\hat{x}_{k|k-1} = f(\hat{x}_{k-1|k-1}, u_k), (2.22)
P_{k|k-1} = F_k P_{k-1|k-1} F_k^\top + Q_k, (2.23)

where Q_k is the covariance of the process noise. The covariance P_{k|k} and mean \hat{x}_{k|k} of the EKF are computed by the update step as

v_k = y_k - h(\hat{x}_{k|k-1}), (2.24)
S_k = H_k P_{k|k-1} H_k^\top + R_{Sensor,k}, (2.25)
K_k = P_{k|k-1} H_k^\top S_k^{-1}, (2.26)
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k v_k, (2.27)
P_{k|k} = P_{k|k-1} - K_k S_k K_k^\top, (2.28)

where R_{Sensor,k} is the covariance matrix of the measurement noise. If the sensor measurements are registered at asynchronous times, the update step of the Kalman filter can be constructed with a two-layer fusion structure as described in [28]. This means that the update step is performed separately for every set of sensor measurements.

2.11 Evaluation

The position estimation performance will be evaluated based on the RMSE and the maximum error. The position RMSE is defined as the RMS of the errors in 2D space [29],

\text{Position RMSE} = \sqrt{\frac{1}{N_{est}} \sum_{i=1}^{N_{est}} \big((x_{est,i} - x_{true,i})^2 + (y_{est,i} - y_{true,i})^2\big)}, (2.29)

where x_{est,i} and y_{est,i} are the estimated position, x_{true,i} and y_{true,i} are the corresponding true position of the vessel, and N_{est} is the total number of estimations. The maximum position error is defined as

\text{Max Position Error} = \max_{i=1,\dots,N_{est}} \sqrt{(x_{est,i} - x_{true,i})^2 + (y_{est,i} - y_{true,i})^2}. (2.30)

The heading estimation performance will also be evaluated by the RMSE,

\text{Heading RMSE} = \sqrt{\frac{1}{N_{est}} \sum_{i=1}^{N_{est}} (\psi_{est,i} - \psi_{true,i})^2}, (2.31)

where \psi_{est,i} is the estimated heading and \psi_{true,i} is the true heading of the vessel. For the heading, the maximum error is defined as

\text{Max Heading Error} = \max_{i=1,\dots,N_{est}} |\psi_{est,i} - \psi_{true,i}|. (2.32)
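To make the EKF recursion in Section 2.10 concrete, below is a minimal Matlab sketch of one prediction-update cycle. The motion and measurement models, and their Jacobians, are passed in as function handles; the function name and interface are illustrative, not the thesis implementation:

% One EKF cycle, equations (2.20)-(2.28). f and h are function handles for the
% motion and measurement models; Fjac and Hjac return their Jacobians.
function [xkk, Pkk] = ekf_step(xprev, Pprev, u, y, f, h, Fjac, Hjac, Q, R)
    % Prediction step
    Fk    = Fjac(xprev, u);               % (2.20), evaluated at the previous mean
    xpred = f(xprev, u);                  % (2.22) predicted mean
    Ppred = Fk*Pprev*Fk' + Q;             % (2.23) predicted covariance

    % Update step
    Hk  = Hjac(xpred);                    % (2.21), evaluated at the predicted mean
    v   = y - h(xpred);                   % (2.24) innovation
    S   = Hk*Ppred*Hk' + R;               % (2.25) innovation covariance
    K   = Ppred*Hk'/S;                    % (2.26) Kalman gain
    xkk = xpred + K*v;                    % (2.27) updated mean
    Pkk = Ppred - K*S*K';                 % (2.28) updated covariance
end

With asynchronous sensors, as described above, this update block would simply be run once per arriving measurement set, each with its own h, Hjac and R.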
3 Methods

In this chapter, the methods used to arrive at the results are presented. The primary objective of the method was to determine the position and heading of a vessel by using data from an IMU, LiDARs, a GNSS sensor and aerial images. The chapter starts by introducing the chosen equipment and the test environment. Then the process of how the position and heading were estimated is covered.

3.1 Data collection

To evaluate the performance of the position and heading algorithm, data was collected from a vessel driving along the coast in the port of Krossholmen. The data was collected with an IMU, five LiDAR sensors and a GNSS sensor.

3.1.1 Vessel

A Tiara 48 was chosen as the test vessel since it is well suited for a wide range of testing applications. The test vessel's coordinate frame can be seen in Figure B.1 in Appendix B and is referred to as the base frame.

3.1.2 IMU

The IMU of choice was developed by CPAC Systems AB [30]. The IMU consists of a MEMS gyroscope, a MEMS accelerometer and a magnetometer. The gyroscope measures the angular velocities around its three axes with a range of up to 200°/s, and the accelerometer can measure forces along its three axes up to ±16 g. The IMU's maximum sampling rate was 100 Hz, and it was mounted on the Tiara 48 as in Figure B.2 in Appendix B, placed as close to the GNSS sensor as possible. The noise on the angular velocity measurement of the gyroscope was 0.008/√δ_IMU °/s and on the accelerometer 37/√δ_IMU mg, where δ_IMU is the IMU's sampling frequency. The variances of the IMU measurements, σ²_Gyro and σ²_Acc, were calculated from

\sigma^2_{Gyro} = \left(\frac{0.008}{\sqrt{\delta_{IMU}}}\right)^2, (3.1)

\sigma^2_{Acc} = \left(\frac{37}{1000} \cdot \frac{9.81}{\sqrt{\delta_{IMU}}}\right)^2. (3.2)

3.1.3 GNSS

The GNSS sensor of choice was the V200s Vector™ GNSS Compass [31]. To maximize the accuracy of the V200s, it was mounted on the roof of the vessel with no blocking objects in the vicinity, as shown in Figure B.3 in Appendix B. The transformation matrix to the Tiara 48 base, T^{GNSS}_B, can be seen in Appendix A. The V200s sampling rate was 10 Hz, and it also supports SBAS. When the V200s is within reach of SBAS messages, the overall error characteristic is 0.3 m RMSE for the position accuracy and 1.5° RMSE for the heading accuracy. However, the performance of the V200s degrades when the GNSS signals are obscured, and the specified error characteristics then change as well. The V200s therefore estimates the RMSE of its position measurements in real time, denoted e_{GNSS,k}. This was used to estimate the variance of the V200s measurements from the RMSE as described in [32], see equations (3.3)-(3.5):

\sigma^2_{GNSS\,Latitude,k} = \left(\frac{e_{GNSS,k}}{1.414}\right)^2, (3.3)

\sigma^2_{GNSS\,Longitude,k} = \left(\frac{e_{GNSS,k}}{1.414}\right)^2, (3.4)

\sigma^2_{GNSS\,Heading,k} = \left(\frac{1.5}{1.414}\right)^2. (3.5)

In scenarios where the GNSS environment was too challenging for the V200s to yield a position measurement, the V200s also stopped providing information about the expected RMSE. In such instances, the RMSE was manually set to an arbitrarily large number.
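A small Matlab sketch of how equations (3.3)-(3.5) can be turned into the GNSS measurement covariance used later in the filter update; the division by 1.414 ≈ √2 reflects the equations above, and both the example RMSE value and the large fallback number are placeholders:

% GNSS measurement covariance from the receiver-reported RMSE, eqs. (3.3)-(3.5).
eGNSS = 0.8;                       % reported position RMSE in meters (example value)
if isnan(eGNSS)                    % no RMSE reported: treat GNSS as unavailable
    eGNSS = 1e6;                   % arbitrarily large fallback (placeholder choice)
end
sigmaLat2 = (eGNSS/1.414)^2;       % (3.3)
sigmaLon2 = (eGNSS/1.414)^2;       % (3.4)
sigmaHdg2 = (1.5/1.414)^2;         % (3.5), from the 1.5 degree heading RMSE spec
R_GNSS = diag([sigmaLat2, sigmaLon2, sigmaHdg2]);   % R_GNSS,k used in the EKF update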
3.1.4 LiDAR

The vessel was equipped with five LiDARs, mounted as in Figure B.4 in Appendix B. Transformations according to the theory in Section 2.4.1 were applied to the LiDAR measurements to yield a collective point cloud for all LiDARs. The transformation matrices were computed through calibration prior to this work. The calibration was done by scanning an arbitrary environment with several overlapping features within the FoV of multiple LiDARs. ICP was then used on the different point clouds to yield the transformations between them. In Figure B.4 in Appendix B, the position and orientation of each LiDAR mounted on the Tiara 48 can be seen via their coordinate systems. The x-axis of each LiDAR's coordinate system corresponds to the direction it was facing. L1, L2 and L3 were of type RS-BPearl [33], chosen for its wide FoV and placed on the vessel to get a wide coverage of the surrounding environment. Together with L4 and L5, which were of type RS-Helios [34], the vessel has almost total coverage of the environment it is located in. The relevant specifications for each LiDAR type can be seen in Table 3.1.

Table 3.1: LiDAR specifications.
                        RS-BPearl       RS-Helios
Horizontal FoV          360°            360°
Vertical FoV            87.19°          70°
Horizontal resolution   0.2°            0.2°
Range                   100 m           150 m
Range accuracy          ±0.03 m (3σ)    ±0.03 m (3σ)
Vertical lines          32              32
Type                    Mechanical      Mechanical

The transformation matrices used to transform each individual LiDAR point cloud into a collective point cloud can be seen in Appendix A. A visualization of a LiDAR point cloud can be seen in Figure 3.1.

Figure 3.1: Visualisation of LiDAR point cloud.

3.1.5 Test environment

As previously mentioned, the harbor of Krossholmen was selected as the test environment. During testing, the vessel was constrained to be within 60 meters of land to ensure the LiDARs captured features of the coastline. Three different route scenarios at two different locations were logged and used for testing. In the first session, case 1, the vessel enters the port with the dock in close proximity to its right side, as depicted with the blue trajectory in Figure 3.2a. In the second session, case 2, the vessel enters the port from the opposite direction and makes a direct turn to the left, as depicted with the red trajectory in Figure 3.2a. The third logging session, case 3, took place at a nearby location in Krossholmen, where the vessel stood still but swayed back and forth, creating a rocking motion. Case 3 is visualized in Figure 3.2b with a red trajectory. The time duration was around one minute for case 3 and around two minutes for cases 1 and 2.

Figure 3.2: Aerial images of two locations at the port of Krossholmen. The first location is illustrated in (a) together with the trajectory of the two ground truth paths from the GNSS data. The blue trajectory represents case 1 and the red represents case 2. The second location is illustrated in (b) together with the ground truth path/location for case 3.

To assess the algorithm's robustness, additional noise was added to the GNSS sensor measurements. The added noise was designed to have the characteristics of noise that occurs in a GNSS-challenged environment. According to [35] and [36], GNSS noise can be categorized into two parts: a noisy but unambiguous part and a precise but ambiguous part. This means the GNSS noise can be described as a white-like error with occasional jumps of inconsistent amplitude. The characteristic of the noisy but unambiguous part was obtained by placing the GNSS receiver stationary under a large metal crane tower for a long period of time, which caused the measured position to gradually deviate over time. To simulate the second, precise but ambiguous, part of the GNSS noise, a constant value was added to the latitude or longitude over a set duration. The GNSS signals without added noise were considered the vessel's ground truth.

The position and heading estimation was implemented in Matlab using the data collected from the scenarios. All tests and experiments were done on the same computer, with the specifications listed in Table 3.2.

Table 3.2: Specifications of the computer used for testing.
Make              Lenovo
Model             ThinkPad P15 Gen 2
Operating system  Windows 10, 64-bit
Processor         i7-11800H, 2.3 GHz, 8 cores
RAM               32 GB

3.2 General approach

In order to estimate the position and heading, all the different sensor measurements were fused in an EKF, as shown in Figure 3.3. The IMU measured linear accelerations and angular velocities, while the GNSS sensor measured position and heading; these were direct inputs to the filter. The LiDARs, however, generated a point cloud that needed to be processed together with the aerial image and heading to yield a refined position and heading reading. This was done by a self-constructed positioning algorithm. A detailed framework outlining the specific processing steps is provided in the following sections.

Figure 3.3: A simplified version of the general approach for the position and heading estimation.
3.3 Design and implementation of positioning algorithm

The projection of a point cloud onto an aerial image captured from a position with a noisy GNSS signal does not align correctly, as illustrated in Figure 3.4. To address this problem, it is necessary to determine the offset that shifts the point cloud to the correct position. In this case, by adjusting the point cloud slightly upwards and to the left, the resulting projection onto the aerial image would improve significantly, with features and edges more accurately aligned with their corresponding counterparts in the image. The idea behind the positioning algorithm is to find the offset that compensates for the interference caused by the noisy GNSS signal, allowing for more accurate positioning. An overview of how the algorithm works is summarized in Algorithm 2.

Algorithm 2 Positioning algorithm
1. Read aerial image at the estimated position
2. Project LiDAR point cloud onto the aerial image
3. Extract features from the aerial image
4. Extract features from the LiDAR point cloud
5. Match features from both sources and find the transformation between them

Figure 3.4: Aerial image from noisy GNSS signal with projected LiDAR point cloud as blue dots.

3.3.1 Aerial images

Aerial images were provided by the Matlab Mapping Toolbox [37]. The toolbox supports a complete workflow for managing geographic data, making it suitable for obtaining an overhead view of the environment at any location worldwide. It can be used for mapping high-resolution satellite or aerial imagery with a Web Mercator projected coordinate reference system. The Mercator projection preserves angles and is suitable for small regions. To retrieve an aerial image of a specific position from the Mapping Toolbox, the latitude and longitude of the position must be known. The dimensions of the aerial images were set to provide coverage of 150 × 150 meters around the vessel, and the aerial image resolution was 938 × 940 pixels with no skew. This gives the parameters

\gamma = \frac{938}{940}, (3.6)
f = \frac{940}{150}, (3.7)
p_x = \frac{938}{2}, (3.8)
p_y = \frac{940}{2}, (3.9)
s = 0, (3.10)

which were used to yield the intrinsic camera matrix K_{Img} from equation (2.10) as

K_{Img} = \begin{bmatrix} 6.2533 & 0 & 469 \\ 0 & 6.2667 & 470 \\ 0 & 0 & 1 \end{bmatrix}. (3.11)

A visualization of the aerial images from the Mapping Toolbox can be seen in Figure 3.4.

3.3.2 Project LiDAR point cloud

The collective LiDAR point cloud X_k was generated via transformation matrices to the base of the vessel, as described in Section 2.4.1. The transformation matrices can be seen in Appendix A. Equation (2.7) was utilized to convert the point clouds X_{L1,k}, X_{L2,k}, X_{L3,k}, X_{L4,k} and X_{L5,k}, generated by each respective LiDAR, into homogeneous transformed coordinates. The homogeneous collective point cloud \tilde{X}_{k,B} was obtained by

\tilde{X}_{k,B} = \begin{bmatrix} T^{L1}_B \tilde{X}_{L1,k} & T^{L2}_B \tilde{X}_{L2,k} & T^{L3}_B \tilde{X}_{L3,k} & T^{L4}_B \tilde{X}_{L4,k} & T^{L5}_B \tilde{X}_{L5,k} \end{bmatrix}. (3.12)

X_k was then produced by converting the homogeneous coordinates \tilde{X}_{k,B} to Cartesian coordinates. The projection matrix P_{proj,k} was then applied to project the LiDAR point cloud, in homogeneous form \tilde{X}_k, onto an aerial image, as described in Section 2.4.2,

\tilde{X}'_k = P_{proj,k} \tilde{X}_k. (3.13)

P_{proj,k} was defined by the 2D rotation

R_{rot}(\psi_k) = \begin{bmatrix} \cos\psi_k & -\sin\psi_k & 0 \\ \sin\psi_k & \cos\psi_k & 0 \\ 0 & 0 & 0 \end{bmatrix}, (3.14)

the translation vector

t = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}^\top, (3.15)

and the intrinsic camera matrix K_{Img} from equation (3.11). \psi_k is the vessel's heading at time step k. The last step consists of converting the projected homogeneous coordinates \tilde{X}'_k to Cartesian coordinates X'_k.
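The following Matlab sketch mirrors equations (3.6)-(3.15): it builds K_Img from the image size and ground coverage, assembles the projection matrix for a given heading, and projects homogeneous points into pixel coordinates. The variable names, the example heading and the placeholder point cloud are illustrative:

% Intrinsic camera matrix from image resolution and ground coverage,
% equations (3.6)-(3.11).
W = 938; H = 940;            % image resolution in pixels
coverage = 150;              % covered ground area, 150 x 150 meters
f   = H/coverage;            % focal length (pixels per meter), (3.7)
gam = W/H;                   % aspect ratio, (3.6)
K   = [gam*f 0 W/2;          % (2.10) with skew s = 0
       0     f H/2;
       0     0 1];

% Projection matrix for heading psi, equations (3.14)-(3.15) and (2.9).
psi   = deg2rad(25);         % example heading (assumed value)
Rrot  = [cos(psi) -sin(psi) 0; sin(psi) cos(psi) 0; 0 0 0];
t     = [0; 0; 1];
Pproj = K*[Rrot t];          % 3x4 projection matrix

% Project a homogeneous point cloud and convert back to Cartesian 2D.
X  = rand(3, 500)*100 - 50;          % placeholder point cloud in the base frame
Xh = [X; ones(1, size(X, 2))];       % homogeneous coordinates
Xp = Pproj*Xh;
Xpix = Xp(1:2, :) ./ Xp(3, :);       % pixel coordinates on the aerial image

Note that with the zeroed third row of R_rot and t = [0 0 1]^T, the last homogeneous row stays 1, so the final division is a formality of the projection convention.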
3.3.3 Extract features from aerial image

Canny edge detection, introduced in Section 2.5, returns a binary image, also referred to as Bw (black and white), containing 1's where the function finds edges in the grayscale image and 0's elsewhere [38].

3.3.4 Extract features from LiDAR point cloud

The next part of the positioning algorithm consisted of extracting the important features from the LiDAR point cloud. However, the LiDARs produce a large point cloud, containing a lot of information in each frame, so the point cloud first needed to be reduced. A large number of points did not fulfill any important purpose in this situation and were therefore discarded directly. The list below states the criteria for which points were discarded:
• Points outside the aerial image.
• Points with a height value less than 0.3 m above the water surface.
• Points within the polygon shape of the vessel.

As presented in Section 2.5, Canny edge detection identifies pixels that have a large gradient, indicating a change in intensity. Because of this, clear edges were likely to be identified in the harbor between the blue water and the land. Hence, it was important to keep features like walls and other sharp edges of the docks and piers in the point cloud. When a 3D point cloud is projected into 2D, walls and other sharp edges are typically represented as a series of stacked points with high density (the same x and y coordinates in 2D). Algorithm 3 selects these points by finding the indices of the most frequently occurring coordinate pairs, so that the remaining points can be discarded. A visualization of the behavior of the algorithm can be seen in Figure 3.5.

Algorithm 3 Select most frequently occurring coordinate pairs
coords ← [round(x), round(y)]
nrPoints ← 3000
uniqueCoords, ic ← Find the unique coordinate pairs in coords and the index mapping of rows in coords to rows in uniqueCoords, such that coords = uniqueCoords(ic)
count ← Count the number of times each element in uniqueCoords appears in coords
sortedInd ← Sort count in descending order and return the sorted indices
topInd ← Select the indices of the first nrPoints elements of sortedInd
topCoords ← Extract the rows in uniqueCoords corresponding to topInd
ind ← Find the indices of rows in coords that match rows in topCoords

Figure 3.5: The most frequently occurring coordinate pairs (yellow markers) selected with Algorithm 3 applied on all LiDAR points (blue markers).
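A Matlab sketch of Algorithm 3, assuming the projected points are given as column vectors x and y; it uses unique together with accumarray to count coordinate occurrences. The placeholder point data and names are illustrative:

% Algorithm 3: keep the points whose rounded 2D coordinates occur most often,
% since vertically stacked points (walls, dock edges) collapse onto the same pixel.
x = randn(20000,1)*30;  y = randn(20000,1)*30;   % placeholder projected points
coords   = [round(x), round(y)];
nrPoints = 3000;

[uniqueCoords, ~, ic] = unique(coords, 'rows');  % coords = uniqueCoords(ic,:)
count = accumarray(ic, 1);                       % occurrences of each unique pair
[~, sortedInd] = sort(count, 'descend');
topInd    = sortedInd(1:min(nrPoints, numel(sortedInd)));
topCoords = uniqueCoords(topInd, :);
ind = ismember(coords, topCoords, 'rows');       % logical index into the original points

xKept = x(ind);  yKept = y(ind);                 % reduced point set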
An example of how the grid-based RANSAC works is illustrated in Figure 3.6, where the final point cloud, also referred to as X_reduced, is shown with red dots.

Algorithm 4 RANSAC-based line fitting for grid segmentation of image
  Initialize: Divide the binary image into i grids, where 1's represent LiDAR points and 0's represent the absence of LiDAR points
  threshold ← 15
  for j ← 1 to i do
    if number of points in grid_j > threshold then
      line ← RANSAC(points)
      grid_j ← 1's at the positions of line and 0's elsewhere
    else
      grid_j ← 0's in all positions
    end if
  end for

Figure 3.6: LiDAR point cloud before RANSAC (yellow marks) and after RANSAC (red marks). The dashed blue lines show the grids in which RANSAC is performed.

3.3.5 Point cloud aligning

The final stage of Algorithm 2 involves finding the transformation between the features in the reduced point cloud, X_reduced, and those extracted from the aerial image, B_w. An example of how the two can look is given in Figure 3.7.

Figure 3.7: The grey markers (left) illustrate the features extracted from the aerial image B_w. The red markers (right) illustrate the reduced LiDAR point cloud X_reduced.

The sought transformation could be found with an FFT-based cross-correlation, described in Section 2.8. Cross-correlation is an effective method for aligning point clouds with high accuracy. However, cross-correlation by itself cannot handle rotations. This limitation means that if the point cloud and the aerial image were misaligned because of a slight inaccuracy in the point cloud's rotation, the computed cross-correlation would be inaccurate. The relationship between the rotation angle and the cross-correlation is generally nonlinear. As a result, the stochastic optimization method PSO, introduced in Section 2.9, was chosen as a well-suited method for finding the best rotation angle between the matrices. The search space of interest was set to [−5°, 5°] around the initial pose. Because of the small search space, a good and fast configuration of the PSO was found to be 10 particles and 6 iterations, where each particle represents a rotation angle. The objective function, equation (3.16), was formulated as the negative cross-correlation between the fixed matrix B_w and the rotated matrix X′_reduced,i for the i-th particle, where i = 1, …, 10:

f(X'_{reduced,i}, B_w) = -\max\left( \mathrm{FFT}^{-1}\left\{ \mathrm{FFT}\{X'_{reduced,i}\} \ast \mathrm{FFT}\{B_w\} \right\} \right) \quad (3.16)

In conclusion, the PSO finds the rotation angle that maximizes the cross-correlation between B_w and X_reduced, thereby achieving the desired transformation. The remaining PSO parameter values were set to v_max = α = Δt = 1, c_1 = c_2 = 2 and w = 0.4.
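The objective in equation (3.16) is straightforward to prototype with fft2 and ifft2. The sketch below uses Matlab's particleswarm (Global Optimization Toolbox) and imrotate as stand-ins for the PSO of Section 2.9 and the rotation step; these toolbox calls and the variable names Bw and Xred are assumptions for illustration, not the implementation used in this work. Note that particleswarm has its own inertia and acceleration parameterization, so the v_max, c_1, c_2 and w values above do not carry over directly.

```matlab
% Hedged sketch of the heading search and position update; names assumed.
FB    = conj(fft2(double(Bw)));     % fixed spectrum; conj gives correlation
XredD = double(Xred);               % binary point-cloud image as double

% Negative cross-correlation peak as a function of the rotation angle [deg].
negCorr = @(dpsi) -max(real(ifft2( ...
    fft2(imrotate(XredD, dpsi, 'nearest', 'crop')) .* FB)), [], 'all');

opts     = optimoptions('particleswarm', 'SwarmSize', 10, 'MaxIterations', 6);
dpsiBest = particleswarm(negCorr, 1, -5, 5, opts);  % search space [-5, 5] deg

% Position update: location of the correlation peak at the best rotation.
c = real(ifft2(fft2(imrotate(XredD, dpsiBest, 'nearest', 'crop')) .* FB));
[~, peak] = max(c(:));
[rowShift, colShift] = ind2sub(size(c), peak);      % circular shift in pixels
```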
Pseudocode for the complete positioning algorithm is given in Algorithm 5.

Algorithm 5 Update position and heading measurement from 3D point cloud
  Initialize: x_est,k, y_est,k, ψ_est,k as the current estimated position and heading
  for 3D point cloud at time k do
    X_k ← LiDAR measurement ▷ Update point cloud
    I_k ← Aerial image at x_est,k and y_est,k ▷ Load aerial image
    B_w,k ← Canny-edge(I_k) ▷ Detect edges of aerial image
    P_proj,k ← K_Img [R_rot(ψ_est,k) t] ▷ Update projection matrix
    X′_k ← P_proj,k X̃_k ▷ Project and translate points
    X′_o,k ← overlapping points of X′_k ▷ Filter points with Algorithm 3
    X_reduced,k ← RANSAC(X′_o,k) ▷ Reduce point cloud with Algorithm 4
    ψ_update,k ← PSO(X_reduced,k, B_w,k) ▷ Update heading with PSO
    X′_reduced,k ← R_rot(ψ_update,k) X_reduced,k ▷ Rotate point cloud with heading
    x_update,k, y_update,k ← Cross-correlation(X′_reduced,k, B_w,k) ▷ Update position
    Return: x_update,k, y_update,k and ψ_update,k
  end for

3.4 Kalman filtering

The Extended Kalman Filter was implemented to yield a more precise estimation of position and heading based on the sensor data. The EKF uses the data from the IMU and the GNSS sensor as well as the outputs of Algorithm 5.

3.4.1 Prediction step

To predict where the vessel will be one time step ahead, the motion model was first defined. The nonlinear motion model was defined by the linear forces and moments applied in the vessel's coordinate system, which can be seen in Figure 3.8. The linear forces act in the directions of the vessel's base coordinate axes x_b, y_b and z_b, referred to as surge, sway and heave respectively. The moments act around these axes, and the orientations about them are denoted roll, pitch and yaw; the yaw is also the vessel's heading.

Figure 3.8: Visualisation of the vessel's degrees of freedom and coordinate frame.

Changes in the heave distance, as well as in pitch and roll, yield only small changes in the position and heading of the vessel. Therefore, the vessel was assumed to move in a 2D plane, with the heave distance and the pitch and roll angles constant. Position and heading were then calculated from the linear acceleration and angular velocity measurements as

x_k = x_{k-1} + \delta(\cos(\psi_{k-1})v_{surge,k-1} - \sin(\psi_{k-1})v_{sway,k-1}), \quad (3.17)
y_k = y_{k-1} + \delta(\sin(\psi_{k-1})v_{surge,k-1} + \cos(\psi_{k-1})v_{sway,k-1}), \quad (3.18)
v_{sway,k} = v_{sway,k-1} + \delta a_{sway,k-1}, \quad (3.19)
v_{surge,k} = v_{surge,k-1} + \delta a_{surge,k-1}, \quad (3.20)
\psi_k = \psi_{k-1} + \delta\dot{\psi}_{k-1}. \quad (3.21)

The state vector was set as

\mathbf{x}_k = \begin{bmatrix} x_k & y_k & v_{surge,k} & v_{sway,k} & \psi_k \end{bmatrix}^\top. \quad (3.22)

x_k is the horizontal global position and y_k is the vertical global position. ψ_k is the yaw as well as the heading of the vessel, v_surge,k is the vessel's velocity in its facing direction, v_sway,k is the velocity sideways (orthogonal to v_surge,k), and δ is the filter's time step, given by the computation frequency of 20 Hz. ψ̇_{k−1} is the yaw rate, a_surge,k−1 is the acceleration in the vessel's facing direction and a_sway,k−1 is the acceleration sideways. To acquire ψ̇_{k−1}, a_sway,k−1 and a_surge,k−1, the IMU measurements were transformed to align with the vessel coordinate system,

\begin{bmatrix} a_{surge,k} \\ a_{sway,k} \\ \dot{\psi}_k \end{bmatrix} = R_{rot,B}^{IMU} \begin{bmatrix} a_{x,k} \\ a_{y,k} \\ \omega_{z,k} \end{bmatrix}, \quad (3.23)

where R_{rot,B}^{IMU} is the rotation matrix from the IMU to the vessel base, given in Appendix A. a_x,k is the linear acceleration measured along the IMU's x-axis, a_y,k is the linear acceleration measured along the y-axis, and ω_z,k is the angular velocity measured around the z-axis. The coordinate system of the IMU is shown in Figure B.2 in Appendix B. The transformed IMU measurements were set as the Kalman filter control vector u_k, defined in equation (3.24) below.
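As an illustration of equations (3.17)–(3.21) together with the IMU transform (3.23), one prediction step can be sketched in Matlab as below; the function and variable names are assumed for illustration, not taken from the implementation used in this work.

```matlab
% One state prediction step; x = [x; y; vSurge; vSway; psi],
% imu = [ax; ay; wz] in the IMU frame, dt = 1/20 s.
function x = predictState(x, imu, R_imu2base, dt)
    u   = R_imu2base * imu;      % [aSurge; aSway; psiDot], eq. (3.23)
    psi = x(5);
    x(1) = x(1) + dt * (cos(psi)*x(3) - sin(psi)*x(4));   % eq. (3.17)
    x(2) = x(2) + dt * (sin(psi)*x(3) + cos(psi)*x(4));   % eq. (3.18)
    x(3) = x(3) + dt * u(1);                              % eq. (3.20)
    x(4) = x(4) + dt * u(2);                              % eq. (3.19)
    x(5) = x(5) + dt * u(3);                              % eq. (3.21)
end
```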
The IMU measurements are used as a control input because they influence the state prediction step but are not themselves estimated states.

\mathbf{u}_k = \begin{bmatrix} 0 & 0 & a_{surge,k} & a_{sway,k} & \dot{\psi}_k \end{bmatrix}^\top \quad (3.24)

From the control vector u_k, the state vector x_k and the state equations (3.17)–(3.21), the motion model f(x_{k−1}, u_k) becomes

f(\mathbf{x}_{k-1}, \mathbf{u}_k) =
\begin{bmatrix}
1 & 0 & \delta\cos(\psi_{k-1}) & -\delta\sin(\psi_{k-1}) & 0 \\
0 & 1 & \delta\sin(\psi_{k-1}) & \delta\cos(\psi_{k-1}) & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
x_{k-1} \\ y_{k-1} \\ v_{surge,k-1} \\ v_{sway,k-1} \\ \psi_{k-1}
\end{bmatrix}
+ \delta\mathbf{u}_k. \quad (3.25)

The Jacobian of the motion model, F_k, therefore became (neglecting the dependence of the first two rows on ψ_{k−1} in the linearization)

F_k = \left.\frac{\partial f}{\partial \mathbf{x}}\right|_{\mathbf{x}_{k-1}} =
\begin{bmatrix}
1 & 0 & \delta\cos(\psi_{k-1}) & -\delta\sin(\psi_{k-1}) & 0 \\
0 & 1 & \delta\sin(\psi_{k-1}) & \delta\cos(\psi_{k-1}) & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}. \quad (3.26)

The presence of noise in the control vector was used to model the process noise q_{k−1} from the IMU variances introduced in Section 3.1.2. The variances were calculated using equations (3.1)–(3.2), with the IMU's sampling frequency δ_IMU set equal to the filter's computation frequency of 20 Hz.

\mathbf{q}_{k-1} = \delta
\begin{bmatrix}
0 \\ 0 \\ \mathcal{N}(0, \sigma^2_{Acc}) \\ \mathcal{N}(0, \sigma^2_{Acc}) \\ \mathcal{N}(0, \sigma^2_{Gyro})
\end{bmatrix} \quad (3.27)

As described in [39], the process noise matrix Q_k can be derived using the noise matrix Q_a from equation (3.28).

Q_a =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & \delta\sigma^2_{Acc} & 0 & 0 \\
0 & 0 & 0 & \delta\sigma^2_{Acc} & 0 \\
0 & 0 & 0 & 0 & \delta\sigma^2_{Gyro}
\end{bmatrix} \quad (3.28)

The matrix Q_a and the motion model Jacobian F_k yielded the process noise matrix Q_k as

Q_k = \alpha_1 F_k Q_a F_k^\top, \quad (3.29)

which can be seen in Appendix C. The tuning constant α_1 addressed uncertainties that arose from assuming the heave distance, pitch and roll to be constant, as well as uncertainties related to the placement of the IMU. The process covariance matrix Q_k, together with the motion model f(x_{k−1}, u_k) and the motion model Jacobian F_k, was then used to calculate the posterior mean x̂_{k|k−1} and covariance P_{k|k−1} as in equations (2.22)–(2.23).

3.4.2 Update step

As mentioned in Section 2.10, the update step in the EKF was performed separately for every set of sensor measurements. In this work, there were two different sensor measurements: one from the GNSS sensor and one from the LiDAR-based positioning algorithm. Both sensors give a measurement of the vessel's position and heading, corresponding to the states x_k, y_k and ψ_k. Therefore, the Jacobian of the measurement model became

H = \left.\frac{\partial h}{\partial \mathbf{x}}\right|_{\mathbf{x}_{k-1}} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}. \quad (3.30)

The measurement covariance R_GNSS,k for the GNSS sensor was calculated using the measured standard deviations. During weak GNSS signals, the GNSS sensor did not accurately measure the standard deviation of its measurements. The measurement covariance was therefore multiplied by the tuning parameter α_2 to compensate for the uncertain standard deviations.

R_{GNSS,k} = \alpha_2
\begin{bmatrix}
\sigma^2_{GNSS\,Latitude,k} & 0 & 0 \\
0 & \sigma^2_{GNSS\,Longitude,k} & 0 \\
0 & 0 & \sigma^2_{GNSS\,Heading,k}
\end{bmatrix}. \quad (3.31)

The LiDAR-based positioning algorithm from Section 3.3 is designed to match features. Therefore, the algorithm's performance was expected to improve when many features were available. The number of available features was assumed to correlate with the number of points in the LiDAR point cloud. The number of points at time instance k is denoted N_k, and the maximum number of points the LiDARs can capture is denoted by the constant N_m. As described in Section 3.1.4, there were five LiDARs on the vessel. Using the LiDARs' FoV and resolution together with the number of vertical lines, the maximum number of points could be calculated; see equation (3.32).
N_m = 5 \cdot \left( \frac{360}{0.2} \cdot 32 \right) = 288\,000 \quad (3.32)

The measurement covariance of the LiDAR-based update was then set, with the tuning constant α_3, as

R_{LiDAR,k} = \alpha_3
\begin{bmatrix}
\left(\frac{N_m}{10N_k+1}\right)^2 & 0 & 0 \\
0 & \left(\frac{N_m}{10N_k+1}\right)^2 & 0 \\
0 & 0 & 10
\end{bmatrix}. \quad (3.33)

With the two measurement covariance matrices and the measurement model defined, the update step was performed via equations (2.24)–(2.28). The update step with LiDAR measurements was set to run at 1 Hz, since the PSO and cross-correlation steps in Algorithm 5 were computationally heavy. To further decrease the computational load of Algorithm 5, the algorithm was adjusted to load a new aerial image only when the vessel was estimated to have moved 5 meters from the position where the previous image was loaded. The update step with GNSS sensor measurements was performed at 10 Hz. The Kalman filtering process is summarized in Algorithm 6. The if statement in the LiDAR update part of the algorithm was introduced because the stochastic optimization method PSO did not always converge to the global optimum. To overcome this limitation, a maximum allowable distance was used to assess the update distance before incorporating measurements into the update step of the EKF. The maximum distance was set to 2 meters. An unusually large update distance indicates that the PSO may have found a local optimum, in which case the measurement should not be used as an update. These measurements are referred to as false measurements. To simulate the vessel entering a GNSS-challenged environment from an environment with accurate GNSS signals, the Kalman filter prior x_0 was set to the true position and heading values during the initialization phase of the filtering.

Algorithm 6 Kalman filtering process
  Initialize: x_0 as the prior ▷ Initialize state vector
  P_0 ← covariance matrix of x_0 ▷ Initialize covariance matrix
  Q_k ← covariance matrix of IMU noise ▷ Process covariance matrix
  R_GNSS,k ← covariance of GNSS noise ▷ Measurement covariance matrix
  R_LiDAR,k ← covariance of LiDAR noise ▷ Measurement covariance matrix
  for k = 1, 2, 3, … do
    Predict:
      u_k ← IMU measurement ▷ Update control vector
      x̂_{k|k−1} ← f(x̂_{k−1|k−1}, u_k) ▷ Predicted state estimate
      P_{k|k−1} ← F_k P_{k−1} F_k^⊤ + Q_k ▷ Predicted covariance estimate
    Update:
      if GNSS measurement available then ▷ GNSS update
        y_k ← GNSS measurement
        R_GNSS,k ← GNSS measurement covariance, equation (3.31)
        v_k ← y_k − H x̂_{k|k−1}
        S_k ← H P_{k|k−1} H^⊤ + R_GNSS,k
        K_k ← P_{k|k−1} H^⊤ S_k^{−1}
        x̂_{k|k} ← x̂_{k|k−1} + K_k v_k
        P_{k|k} ← P_{k|k−1} − K_k S_k K_k^⊤
      end if
      if LiDAR measurement available then ▷ LiDAR update
        X_k ← LiDAR measurement
        y_k ← Algorithm 5(X_k) ▷ Measurement from 3D point cloud
        if update distance < max distance then
          N_k ← #X_k ▷ Update number of LiDAR points
          R_LiDAR,k ← equation (3.33) with N_k ▷ Update covariance matrix
          v_k ← y_k − H x̂_{k|k−1}
          S_k ← H P_{k|k−1} H^⊤ + R_LiDAR,k
          K_k ← P_{k|k−1} H^⊤ S_k^{−1}
          x̂_{k|k} ← x̂_{k|k−1} + K_k v_k
          P_{k|k} ← P_{k|k−1} − K_k S_k K_k^⊤
        end if
      end if
      if no new measurement available then
        x̂_{k|k} ← x̂_{k|k−1}
        P_{k|k} ← P_{k|k−1}
      end if
  end for

4 Results

This chapter presents the results of the system for the three different cases introduced in Section 3.1.5. Two different noise scenarios were tested in each case. Some different outcomes of the positioning algorithm are also presented.

4.1 Performance

The following tuning parameters were used for all tests in this section: α_1 = 10 (Q_k), α_2 = 10 (R_GNSS,k) and α_3 = 0.5 (R_LiDAR,k). The figures below show the true positions and the estimated positions together with the measurement signal.
The measurement consists of the GNSS data with added noise, leading to signal drift. Two different scenarios were tested in all three cases. The first scenario consists of noise that has gradually drifted over time, resulting in an offset of a few meters from the true position; see Figures 4.1, 4.3 and 4.5. The position RMSE and max error for the measurement in the first scenario are listed in Table 4.1.

Table 4.1: Position RMSE and max error for the measurement signal in the first scenario.

  Position [m]           RMSE     Max error
  Measurement (Case 1)   1.3567   1.7044
  Measurement (Case 2)   1.3658   1.7044
  Measurement (Case 3)   1.2847   1.5985

The second scenario consists of slightly reduced drift noise, but a sudden jump is introduced, representing the GNSS signal becoming temporarily lost; see Figures 4.2, 4.4 and 4.6. During the signal-loss period, the tuning parameter for the measurement noise covariance matrix R_GNSS,k was increased to α_2 = 1000. The position RMSE and max error for the measurement in the second scenario are listed in Table 4.2.

Table 4.2: Position RMSE and max error for the measurement signal in the second scenario.

  Position [m]           RMSE     Max error
  Measurement (Case 1)   2.4596   14.5443
  Measurement (Case 2)   2.4428   14.5115
  Measurement (Case 3)   3.6406   9.9748

Figure 4.1: True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 1. The GNSS data without noise is used as prior.

Table 4.3: RMSE and max error for the estimation in Figure 4.1.

  State          RMSE     Max error
  Position [m]   0.5893   1.6668
  Heading [°]    1.7332   4.8879

Figure 4.2: True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 1. The GNSS data without noise is used as prior.

Table 4.4: RMSE and max error for the estimation in Figure 4.2.

  State          RMSE     Max error
  Position [m]   0.7241   1.8233
  Heading [°]    1.7118   4.9983

Figure 4.3: True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 2. The GNSS data without noise is used as prior.

Table 4.5: RMSE and max error for the estimation in Figure 4.3.

  State          RMSE     Max error
  Position [m]   1.7394   5.5664
  Heading [°]    1.6366   3.8959

Figure 4.4: True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 2. The GNSS data without noise is used as prior.

Table 4.6: RMSE and max error for the estimation in Figure 4.4.

  State          RMSE     Max error
  Position [m]   1.2505   2.8422
  Heading [°]    1.5556   3.9108

Figure 4.5: True trajectory of the vessel, measurements, and the result of the Kalman estimation for case 3. The GNSS data without noise is used as prior.

Table 4.7: RMSE and max error for the estimation in Figure 4.5.
  State          RMSE     Max error
  Position [m]   0.2901   0.9984
  Heading [°]    2.3374   5.0670

Figure 4.6: True trajectory of the vessel, measurements with a sudden jump, and the result of the Kalman estimation for case 3. The GNSS data without noise is used as prior.

Table 4.8: RMSE and max error for the estimation in Figure 4.6.

  State          RMSE     Max error
  Position [m]   0.3500   1.1214
  Heading [°]    1.8218   3.2051

4.2 Positioning algorithm

Some different outcomes of the positioning algorithm (Algorithm 5, presented in Section 3.3) can be seen below. The performance of the algorithm was very dependent on the environment and the LiDAR data. Figure 4.7 showcases good performance, while Figures 4.8 and 4.9 showcase a more challenging scenario where it performed worse. Figure 4.10 shows an instance from case 3, which is a relatively uncomplicated environment with good conditions; however, the global best transformation that was found is incorrect. The time requirement of the algorithm was 4.80 seconds when a new aerial image was downloaded and 3.59 seconds when the previously loaded aerial image was reused.

Figure 4.7: An instance from case 1. The first subfigure (left) shows the new alignment of the point cloud (green) after finding the best transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO having found the global best rotation angle, with the objective function f(x) as the negative cross-correlation between the extracted features and the rotated point cloud. Each particle in the last iteration is denoted by a red mark, while the green mark indicates the global best position that has been found.

Figure 4.8: An instance from case 2. The first subfigure (left) shows the new alignment of the point cloud (green) after finding a transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO having found the global best rotation angle (green mark), with the objective function f(x) as the negative cross-correlation between the extracted features and the rotated point cloud.
Figure 4.9: Same instance as in Figure 4.8. The first subfigure (left) shows the new alignment of the point cloud (green) after finding another valid transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO having found a local best rotation angle (green mark), with the objective function f(x) as the negative cross-correlation between the extracted features and the rotated point cloud.

Figure 4.10: An instance from case 3. The first subfigure (left) shows the new alignment of the point cloud (green) after finding the best transformation of the input point cloud (red) with the extracted features from the aerial image (grey). The second subfigure (right) shows the PSO having found a global best rotation angle (green mark), with the objective function f(x) as the negative cross-correlation between the extracted features and the rotated point cloud.

5 Discussion

In this chapter, the collected results are discussed based on the research questions. The methods used to achieve the presented results are also discussed and their performance is evaluated.

5.1 Estimated position and heading performance

As stated in the first research question, this thesis has investigated how integrating aerial imagery improves a system's robustness against characteristic GNSS signal disturbances. The presented results show that while the proposed system enhances robustness in certain scenarios, it does not have the same effect in others. To get a better understanding of the system, each case is discussed individually in the subsections below. The discussion also covers how the LiDAR data is matched with the aerial imagery for the estimation of position and heading, which is the second research question of this thesis.

5.1.1 Case 1

The two scenarios tested in case 1 both demonstrated that the estimation improved the position and heading. It can be seen in Figures 4.1 and 4.2 that the estimation is close to the ground truth. The values in Tables 4.3 and 4.4 also indicate this. The position RMSE goes from 1.36 m down to 0.59 m in scenario 1, and from 2.46 m down to 0.72 m in scenario 2. The position max error was also reduced, especially in scenario 2, where the sudden jump was imposed. Figure 4.7 showcases why the system works well for this case. There are many lines, both vertical and horizontal, for the algorithm to use and match with the features from the aerial image. Therefore, a distinct global maximum exists for the rotation angle that produces a high cross-correlation similarity.

5.1.2 Case 2

The performance of the system in a less favorable environment can be seen in Figures 4.3 and 4.4.
The results concern test case 2, where the values in Table 4.5 show that the estimation worsened the position and heading. The position RMSE increased from 1.37 m to 1.74 m, and the max error from 1.70 m to 5.57 m, in scenario 1. The second scenario, Table 4.6, shows a slight improvement in the estimation, but not as large as for case 1. The reason behind the less accurate estimation in this case is that the system encounters difficulties during the initial one-third of the path. Figures 4.8 and 4.9 show two possible results of the positioning algorithm during the initial phase of the path. This is a more challenging scenario where there are no clear vertical lines left after the point cloud has been processed, making it difficult to fix the new alignment in the horizontal direction. This is also noticeable in the PSO subfigure, where there are two local maxima. This makes the LiDAR update very sporadic and unreliable. It is therefore often classified as a false measurement and is barely used during the initial phase of the path, contributing to the inaccurate estimation.

In Table 3.1, the LiDAR range is specified as 100 m. However, upon examining Figures 4.8 and 4.9, it becomes apparent that the LiDARs have not detected any objects on the left side of the image, even though the dock on that side is only about 50 m away from the vessel. This is because the LiDAR points on the left side are classified as below water level and are subsequently filtered out. In general, when a vessel makes a sharp turn, it tends to lean inward toward the center of the turn. In case 2 the vessel is turning left, making the horizontal plane, which should be tangent to the water surface, instead point upwards in the direction of the dock. Therefore, the algorithm perceives the LiDAR points in that area as below the water surface and removes them. To compensate for this and ensure that the LiDAR points are retained, it would have been beneficial to introduce roll as a state in the EKF. In this way, more vertical lines would exist in the LiDAR point cloud, improving the performance of the system for case 2.

5.1.3 Case 3

Case 3 is more unique since the vessel stays at the same location and the recording duration is shorter than for cases 1 and 2. It is the short duration together with the good performance that makes the RMSE values in Tables 4.7 and 4.8 so low compared to the other cases. Figure 4.10 depicts an instance from case 3, where the point cloud consists of a good variety of lines. However, it is important to note that a false measurement is observed in this particular instance, which is discussed in Section 5.2. The incorporation of roll as a state, in addition to the variety of lines, could have further improved the estimation performance. More information about the motion of the vessel would have let the filter better adapt to the presence of "waves" and adjust the estimated position and heading.

5.2 Supplementary validation of positioning algorithm

As shown in Section 4.2, and discussed above, the positioning algorithm worked well in some cases and worse in others. Regarding research question three, objects that are only present in the aerial image had a minimal to negligible impact on the system in the majority of situations. This can for example be seen in Figure 4.7, where a few docked vessels are visible as extracted features in the aerial image. However,
in the LiDAR point cloud only the dock is visible, and the algorithm still accomplished the task of identifying the correct update. Nonetheless, there exist specific circumstances in which this has an effect on the system. Figure 4.10 showcases a false measurement where the updated LiDAR point cloud aligns with the wrong side of the dock (the left side aligns with the right side). Usually when this happens, other parts of the LiDAR point cloud do not align with anything, thereby decreasing the cross-correlation. In this instance, however, instead of not aligning with anything, the other end of the dock happens to align very well with the vessel in the aerial image.

The configuration of the PSO, as described in Section 3.3.5, appears to be a good balance between speed and performance. The chosen search space of [−5°, 5°] also proved suitable, given that the maximum heading error observed during testing was 5.07°.

5.3 Extended Kalman filtering

The fusing of measurements proved to work well for producing a state estimation, and the EKF can be used to answer the fourth research question. When the EKF was provided with errors in the GNSS measurements, the measurement covariance R_GNSS,k increased simultaneously with the GNSS sensor's measured RMSE. The EKF then performed state estimation with more weight on the IMU and LiDAR measurements, resulting in a trajectory close to the ground truth. The filtering worked well when GNSS measurement errors were handled and propagated, but it worked worse for false LiDAR measurements.

The trajectories of the estimated position can be seen taking sudden sharp jumps. These sharp jumps were produced by the EKF's update step with LiDAR measurements, which indicates that the LiDAR measurement covariance R_LiDAR,k was modeled with low variances compared to the other sensor measurements. This proved to work well for the estimation in cases 1 and 3 but worse in case 2. When Algorithm 5 provided false measurements, as in case 2, the estimation can be seen deviating to the left of the true position in Figure 4.3. The GNSS and IMU measurements addressed this, so the estimated trajectory returned to align with the true position.

Since the estimated trajectory deviated when the EKF was provided with false measurements, the modeling of the LiDAR measurement covariance R_LiDAR,k can be assumed not to work as intended. In Section 3.4.2, the assumption that "the number of matchable features depends on the size of the point cloud" can still be assumed to hold. However, since it was discovered that false measurements occur independently of the size of the point cloud, the accuracy of the LiDAR measurement should not have been assumed to depend only on the number of features. An implementation that could address the inaccurate modeling of the LiDAR measurement covariance would be to set the variance values dependent on the measured cross-correlation similarity instead, equation (3.16). The measured cross-correlation similarities can be seen in Figures 4.7–4.10. In cases 1 and 3, higher similarity values were reached by the PSO than in case 2. Since the EKF's LiDAR update step performed better in cases 1 and 3, incorporating the measured cross-correlation similarity into the LiDAR measurement covariance could handle and propagate errors and uncertainties for more reliable results.
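As an illustration only, one possible form of such a similarity-dependent covariance is sketched below; the inverse-square scaling and the reference similarity s0 are assumptions, not a model derived in this work.

```matlab
% Hedged sketch of a similarity-dependent LiDAR measurement covariance.
% sK: cross-correlation peak for this update; s0: assumed reference peak.
ratio   = s0 / max(sK, eps);                 % > 1 when the match is weak
R_LiDAR = alpha3 * diag([ratio^2, ratio^2, 10]);
```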
Another implementation that could increase the performance of the LiDAR update step would be to model the measurement covariance R_LiDAR,k with respect to the number of horizontal and vertical lines. As discussed in Sections 5.1.1–5.1.2, the performance of the cross-correlation position update and the PSO heading update decreased when there was a shortage of either horizontal or vertical lines, as in case 2. The measurement covariance could therefore be modeled so that the uncertainty is increased when mostly one type of line is present, as in case 2.

The extended Kalman filtering was restricted to a 2D motion model. This was shown to work for improving the position and heading estimation of the vessel in GNSS-challenged environments. However, a more accurate and correct way of implementing the EKF would have been to model it in 3D. As discussed in Sections 5.1.2–5.1.3, implementing roll in the state estimation could have improved the performance of the system. If more states were used in the EKF, even better results could have been achieved, as in the implementation of [5], covered in Section 1.2.

5.4 Data collection

There is a limitation in terms of the amount of data collected for this thesis. Only three different data sets were used for evaluation, and each data set contributed valuable insight into how the system works. The limited data impacted the ability to fully evaluate the system; had additional testing been possible, greater insights could potentially have been obtained.

As presented earlier, the GNSS signal without the added noise was used as the ground truth. First of all, it would generally be more appropriate to use an additional, independent GNSS sensor when evaluating the system. Secondly, the GNSS measurements themselves have error characteristics. In Section 3.1.3 it is stated that the position and heading RMSE accuracies are 0.3 m and 1.5° respectively for the GNSS sensor used. Therefore, the designed system could output the correct position and heading but still produce a high RMSE, since the ground truth itself could be incorrect. One solution would have been to equip the vessel with an additional GNSS source, for instance one using real-time kinematic positioning (RTK). This would ensure that the reference data is obtained from a highly accurate and precise positioning system, and the obtained results would consequently better reflect the actual performance of the system.

The LiDAR sensors used, RS-Bpearl and RS-Helios, worked well in capturing the surrounding environment of the vessel. The IMU also performed well in measuring forces and angular velocities. For better accuracy of the IMU measurements, the IMU could have been placed in the vessel's center of rotation.

5.5 Real-time applications

A challenge when implementing a real-time localization system using IMU, GNSS and LiDAR sensors is the computational load of processing the LiDAR data into a state measurement. The positioning algorithm, which produces state measurements from the LiDAR data, was set to a computing frequency of 1 Hz. The required computation time exceeded this period, so the state measurements were not computed fast enough. The time duration of Algorithm 5 can be lowered with the use of high-performing programming languages and code optimization. However, the computer specifications in real-time applications would be less powerful.
Filtering and reducing the number of points in the LiDAR data would lower the computational load of the algorithm, but it could also discard important features. The PSO in the algorithm could be optimized for a lower computational load by using fewer particles and fewer iterations. This would make the algorithm faster but less accurate, since it would be more likely to get stuck in local maxima, which had a higher probability of yielding false state estimations. A better approach for making the algorithm more suitable for real-time application would be to keep the PSO optimized for accuracy rather than computation speed and have the EKF update with respect to LiDAR measurements at a lower frequency. Another implementation that could make the PSO yield higher accuracy, without increasing the number of particles or iterations, would be to use an adaptive search space of interest. As the EKF computes the posterior covariance P_{k|k−1}, this uncertainty could be used to define the search space of the PSO. For low values of the posterior covariance, the PSO could be used to fine-tune the heading estimation, and for high values, the PSO would be used to approximately estimate the heading.

6 Conclusion

In this thesis, a system that estimates position and heading with the use of aerial imagery, LiDAR sensors, an IMU sensor and a GNSS sensor was constructed. The method by which the presented system uses LiDAR data to estimate position and heading can be described as follows. First, an aerial image of the surrounding environment is transformed via Canny edge detection. Second, the LiDAR data is projected to a top view and filtered so that overlapping points are extracted and used for RANSAC line estimation. The estimated lines and the Canny-edge-transformed aerial image are then matched using cross-correlation, with PSO used to optimize the heading that maximizes the cross-correlation similarity. The transformation found in this solution is used to update the position estimate. Finally, the position and heading estimates are fused with measurements from an IMU sensor and a GNSS sensor in an EKF to obtain a more precise estimation. The system's performance was then tested on the sensor data from six different scenarios. The testing showed that the system could estimate the position with higher precision than the GNSS sensor in a GNSS-challenged environment, and that it performed well in estimating the heading. The proposed system showed potential to act as a reliable source of position and heading estimates if the implementations described in future work are made.

6.1 Future work

The modeling of the motion model used in the EKF should be revised, since there were indications that the performance of the positioning algorithm degraded when the vessel was oriented with increased roll magnitude. Implementing an estimation of roll in the state vector would therefore be an improvement. Other states can also be implemented, such as pitch and acceleration noise. Another implementation that can improve the system's precision and computational load would be to use the EKF's estimation of the heading covariance to define the search space of the PSO.
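A minimal sketch of how the heading covariance could define the PSO search space is given below, reusing negCorr and opts from the sketch in Section 3.3.5; the 3σ scaling and the clamp to [1°, 15°] are illustrative assumptions rather than a design from this work.

```matlab
% Hedged sketch: adaptive PSO search interval from the predicted
% heading variance Ppred(5,5).
sigmaPsi  = sqrt(Ppred(5, 5));                        % heading std [rad]
halfRange = min(max(3 * rad2deg(sigmaPsi), 1), 15);   % clamp to [1, 15] deg
dpsiBest  = particleswarm(negCorr, 1, -halfRange, halfRange, opts);
```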
As the system relies on parameters that can be tuned, such as the RANSAC grid size, the overlapping-point threshold, the PSO parameters and the noise covariances, the system could be optimized for better precision. A tuning parameter that should be revised is the LiDAR covariance R_LiDAR,k(N_k). A better approach than the presented method of modeling the LiDAR covariance would be to make it dependent on the ratio between 2D lines orthogonal to each other. The algorithm's parameters could also be tuned for computational efficiency by lowering the number of PSO particles and iterations, as well as by down-sampling the LiDAR point cloud. The computational load could also be lowered by programming the system in a high-performing programming language. However, as the aerial images were provided by Matlab, another map-database source would then have to be used.

References

[1] T. Hoang and V. P. Berntsson, "Localisation using lidar and camera," Master's thesis, Chalmers University of Technology, 2017.
[2] G. Wan, X. Yang, R. Cai, H. Li, Y. Zhou, H. Wang, and S. Song, "Robust and precise vehicle localization based on multi-sensor fusion in diverse city scenes," in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018.
[3] N. Dalhaug, "Lidar-based localization for autonomous ferry," Master's thesis, Norwegian University of Science and Technology, 2019.
[4] F. Moosmann and C. Stiller, "Velodyne SLAM," in 2011 IEEE Intelligent Vehicles Symposium (IV), 2011, pp. 393–398.
[5] A. Engström and D. Geiseler, "Lidar-based simultaneous localization and mapping in marine vehicles," Master's thesis, Chalmers University of Technology, 2022.
[6] T. Liu, B. Li, L. Yang, G. Chen, L. He, and J. He, "Tightly coupled GNSS, INS and visual odometry for accurate and robust vehicle positioning," in 2022 5th International Symposium on Autonomous Systems (ISAS), 2022.
[7] O. Pink, "Visual map matching and localization using a global feature map," in 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008.
[8] Z. Zhao, H. Fu, R. Ren, and Z. Sun, "Real-time intersection detection based on satellite image and 3D lidar point cloud," in Proceedings of 2021 International Conference on Autonomous Unmanned Systems (ICAUS 2021), M. Wu, Y. Niu, M. Gu, and J. Cheng, Eds. Springer Singapore, 2022, pp. 2868–2878.
[9] M. Escourrou, J. A. H