[slides] Multimodal image registration techniques: a comprehensive survey

This paper provides a comprehensive review of state-of-the-art techniques for multimodal image registration, focusing on scenarios where images from different modalities must be precisely aligned in the same reference system. The review covers classical and modern approaches, including deep learning-based methods, and highlights the specific requirements of each stage of the registration pipeline for multimodal images. Medical images are excluded because of their unique characteristics, such as the use of active and passive sensors or the non-rigid nature of the body in the image.

The introduction discusses applications where multimodal image registration is essential, such as video surveillance, building insulation inspection, environmental monitoring, image filtering and fusion, crop inspection, and driving assistance systems. These applications benefit from advances in hardware technology, including smartphones with multiple cameras. Image registration aligns images of the same scene taken at different times or from different viewpoints so that they can be combined or compared. The challenge increases when the images come from different modalities, because the same features may appear differently depending on the nature of the source. Traditional methods use hand-crafted feature descriptors such as EOH, SIFT, and SURF, while recent deep learning approaches use Convolutional Neural Networks (CNNs) to learn features for alignment, even when the images are noisy or come from different modalities. The survey also presents a general image registration framework detailing each step in the registration process.

Multimodal image registration techniques: a comprehensive survey

6 January 2024 | Henry O. Velesaca, Gisel Bastidas, Mohammad Rouhani, Angel D. Sappa