Cross-modal image feature matching between infrared and visual images. Adapting intra-modal feature matching models for cross-modal matching

Type

Master's Thesis

Abstract

Image feature matching is an essential part of various computer vision applications, and many modern solutions apply machine learning techniques to achieve state-of-the-art results. A less-studied problem is matching image features between images of different modalities. This thesis investigates this problem for the visual–LWIR (long-wave infrared) case by utilizing the matching capabilities of the pre-trained intra-modal models SuperPoint and SuperGlue. This is done by adding interfacing models and additional layers to mitigate problems such as catastrophic forgetting and data biasing in the pre-trained models. These techniques prove only marginally successful compared to the pre-trained models themselves. For training these models, a method for generating sparse pseudo ground truth point correspondences is proposed, and evaluation is done via pose estimation. This thesis provides insight into specific transfer-learning methods for the SuperPoint and SuperGlue models and methods for ground truth estimation, and discusses the difficulties this problem presents. Further study of this problem may yield improved models for LWIR–visual matching, which would enable more reliable methods for cross-modal camera calibration and registration, localization, and image retrieval, with numerous applications in the automotive, defense, and healthcare industries.
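The idea of sparse pseudo ground truth correspondence can be illustrated with a minimal sketch. Assume a known homography H relating the visual and LWIR image planes (a simplifying assumption for illustration; the thesis's actual estimation method is not reproduced here): visual keypoints are warped into the LWIR frame and paired with mutual nearest neighbours within a reprojection threshold. The function name `pseudo_gt_matches` and the threshold default are hypothetical.

```python
import numpy as np

def pseudo_gt_matches(kpts_vis, kpts_ir, H, thresh=3.0):
    """Pair visual keypoints with LWIR keypoints by warping through a
    known homography H and accepting mutual nearest neighbours within
    `thresh` pixels. Illustrative sketch, not the thesis's exact method."""
    # Warp visual keypoints into the LWIR frame via homogeneous coordinates.
    pts = np.hstack([kpts_vis, np.ones((len(kpts_vis), 1))])
    warped = (H @ pts.T).T
    warped = warped[:, :2] / warped[:, 2:3]
    # Pairwise Euclidean distances between warped visual and LWIR keypoints.
    d = np.linalg.norm(warped[:, None, :] - kpts_ir[None, :, :], axis=2)
    matches = []
    for i in range(len(kpts_vis)):
        j = int(np.argmin(d[i]))
        # Accept only mutual nearest neighbours below the reprojection threshold.
        if d[i, j] < thresh and int(np.argmin(d[:, j])) == i:
            matches.append((i, j))
    return matches
```

Such pairs can then serve as sparse supervision when fine-tuning a matcher, with the reprojection threshold trading off label density against label noise.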

Keywords: feature matching, deep learning, computer vision, pose estimation, multimodal, infrared imaging, graph neural networks.
