Multimodal Data Fusion for BEV Perception


Type

Master's Thesis (Examensarbete för masterexamen)

Abstract

In autonomous driving, sensors are mounted across different parts of the vehicle to capture information about the surrounding environment, allowing autonomous vehicles to address various driving-related tasks such as object detection, semantic segmentation, and path planning. Among the diverse approaches to perception, bird's-eye-view (BEV) perception has progressed impressively in recent years. In contrast to front-view or perspective-view modalities, BEV provides a comprehensive representation of the vehicle's surrounding environment that is fusion-friendly and convenient for downstream applications. Because vehicle cameras are oriented outward and parallel to the ground, the captured images are in a perspective view that is perpendicular to the BEV. Consequently, a crucial part of BEV perception is the transformation of multi-sensor data from perspective view (PV) to BEV, and the quality and efficiency of this transformation strongly influence the performance of subsequent tasks. This thesis project studies comprehensive multimodal data fusion solutions for PV-to-BEV transformation. We analyzed the common and unique characteristics of existing approaches and assessed their performance on a selected downstream perception task, focusing on object detection within a short distance. Additionally, we implemented two modules, Global Position Encoding (GPE) and Information Enhanced Decoder (IED), to enhance the performance of the multimodal data fusion model.
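To make the notion of a BEV representation concrete, the following is a minimal sketch (not the method developed in this thesis; the function name, grid extent, and resolution are illustrative assumptions) of rasterizing ego-frame 3D points, such as a LiDAR sweep, into a top-down occupancy grid:

```python
import numpy as np

def points_to_bev_grid(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0),
                       resolution=0.5):
    """Rasterize 3D points into a binary BEV occupancy grid.

    points: (N, 3) array of (x, y, z) in the ego frame, in metres
            (x forward, y left). Illustrative convention only.
    Returns an (H, W) uint8 grid: rows span y_range, columns span x_range.
    """
    h = int((y_range[1] - y_range[0]) / resolution)
    w = int((x_range[1] - x_range[0]) / resolution)
    grid = np.zeros((h, w), dtype=np.uint8)

    # Keep only points inside the BEV extent.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Convert metric coordinates to grid indices; the z coordinate is
    # dropped, which is exactly the "flattening" a BEV view performs.
    cols = ((pts[:, 0] - x_range[0]) / resolution).astype(int)
    rows = ((pts[:, 1] - y_range[0]) / resolution).astype(int)
    grid[rows, cols] = 1
    return grid
```

Camera features require the harder PV-to-BEV step the abstract describes (lifting 2D pixels with depth estimates or attention before placing them in such a grid), whereas LiDAR points already carry 3D coordinates and can be dropped in directly, which is one reason BEV is a natural fusion space.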

Keywords

Multi Modality, Sensor Fusion, BEV Perception, LiDAR, Camera, Transformer, Deep learning, 3D Object Detection, thesis