Diffusion models for novel view synthesis in autonomous driving
Type
Master's Thesis
Program
Complex adaptive systems (MPCAS), MSc
Published
2024
Authors
Gasparyan, Artur
Qiu, Ruiqi
Abstract
Novel View Synthesis (NVS) generates target images from new camera poses using source images and their corresponding poses. It has gained prominence in the field of autonomous driving (AD) as a tool for generating synthetic data to improve perception systems. Current NVS implementations, such as Neural Radiance Fields (NeRFs), excel at constructing 3D scenes from sensor inputs but struggle to accurately render sparsely observed or unseen views. This thesis addresses these limitations by integrating Diffusion Models (DMs) into the NVS pipeline to enhance reconstruction quality in such cases. We propose a pipeline inspired by ReconFusion, training NeuRAD, a NeRF-based NVS method designed for dynamic AD data, on additional poses not present in the original training set. A pretrained, open-source DM, Stable Diffusion, provides supervision by refining NeuRAD’s outputs for these unseen views. To improve the DM’s performance on AD scenes, we finetune it using Low-Rank Adaptation (LoRA), enabling efficient adaptation to small datasets. ControlNet is incorporated to extend the diffusion model with additional conditioning signals, ensuring better alignment with scene-specific characteristics. Despite these enhancements, our experiments yield mixed results. While some metrics show improvement, others remain inconsistent, particularly in challenging scenarios. We identify weak conditioning signals and limited LoRA rank as potential limitations. Future research should explore incorporating more robust conditioning signals, such as depth or temporal information, and training on diverse scenes to improve generalization and stability. These directions offer promising avenues for advancing NVS in AD applications.
Subject/Keywords
Scene Reconstruction, Novel View Synthesis, Neural Radiance Fields, Autonomous Driving, Deep Learning, Generative Models, Diffusion Models, Latent Diffusion Models, Closed Loop Simulation