Remote Sensing Image Change Detection with Transformers

11 Jul 2021 | Hao Chen, Zipeng Qi and Zhenwei Shi*
This paper proposes a bitemporal image transformer (BIT) for remote sensing image change detection. The core idea is to express the high-level semantic concepts of the change of interest as a small set of visual words (semantic tokens). Each input image is converted into a few semantic tokens, a transformer encoder models context in this compact token-based space-time, and the resulting context-rich tokens are projected back to pixel space, where a transformer decoder refines the original features. The BIT is incorporated into a deep feature differencing-based change detection framework.

The key contributions are: (1) an efficient transformer-based method for remote sensing image change detection; (2) expressing the input images as a few visual words (tokens) and modeling context in the compact token-based space-time; and (3) extensive experiments on three change detection (CD) datasets, LEVIR-CD, WHU-CD, and DSIFN-CD, validating the effectiveness and efficiency of the proposed method.

The model is implemented in PyTorch and trained on a single NVIDIA Tesla V100 GPU. Evaluation uses F1-score, precision, recall, intersection over union (IoU), and overall accuracy (OA). Across all three datasets, the BIT-based model achieves significant accuracy improvements over the purely convolutional baseline while requiring about three times fewer computations and model parameters. It also surpasses several state-of-the-art CD methods, including three purely convolutional methods and four recent attention-based methods, in both accuracy and efficiency. These gains come from modeling global semantic relations in space-time, which strengthens the feature representation of the change of interest. The code is available at https://github.com/justchenhao/BIT_CD.
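To make the pipeline concrete, below is a minimal PyTorch sketch of a BIT-style change detector following the description above: a shared convolutional backbone, a semantic tokenizer, a transformer encoder over the concatenated bitemporal tokens, a transformer decoder that projects the tokens back to pixel space, and a feature-differencing prediction head. This is not the authors' implementation (see the GitHub link); the backbone, layer sizes, token count, and prediction head are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticTokenizer(nn.Module):
    """Pools a feature map into a few semantic tokens via spatial attention."""

    def __init__(self, channels: int, num_tokens: int = 4):
        super().__init__()
        # A 1x1 conv predicts one spatial attention map per token.
        self.attention = nn.Conv2d(channels, num_tokens, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        attn = self.attention(feats).flatten(2).softmax(dim=-1)        # (B, L, HW)
        tokens = torch.einsum("bln,bcn->blc", attn, feats.flatten(2))  # (B, L, C)
        return tokens


class BitemporalImageTransformer(nn.Module):
    """Illustrative BIT-style change detector (all sizes are assumptions)."""

    def __init__(self, channels: int = 32, num_tokens: int = 4, num_classes: int = 2):
        super().__init__()
        # Shared (Siamese) convolutional backbone applied to both temporal images.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.tokenizer = SemanticTokenizer(channels, num_tokens)
        # Transformer encoder models context among the bitemporal tokens.
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=channels, nhead=4,
                                       dim_feedforward=2 * channels, batch_first=True),
            num_layers=1,
        )
        # Transformer decoder projects context-rich tokens back to pixel space:
        # pixel features act as queries, tokens as keys/values.
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=channels, nhead=4,
                                       dim_feedforward=2 * channels, batch_first=True),
            num_layers=1,
        )
        self.classifier = nn.Conv2d(channels, num_classes, kernel_size=1)

    def refine(self, feats: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feats.shape
        queries = feats.flatten(2).transpose(1, 2)   # (B, HW, C)
        refined = self.decoder(queries, tokens)      # (B, HW, C)
        return refined.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
        f1, f2 = self.backbone(img1), self.backbone(img2)
        # Concatenate both token sets so the encoder models space-time context.
        tokens = self.encoder(torch.cat([self.tokenizer(f1), self.tokenizer(f2)], dim=1))
        t1, t2 = tokens.chunk(2, dim=1)
        r1, r2 = self.refine(f1, t1), self.refine(f2, t2)
        # Deep feature differencing followed by a pixel-wise change classifier.
        logits = self.classifier(torch.abs(r1 - r2))
        return F.interpolate(logits, size=img1.shape[-2:], mode="bilinear", align_corners=False)


if __name__ == "__main__":
    model = BitemporalImageTransformer()
    x1, x2 = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
    print(model(x1, x2).shape)  # torch.Size([2, 2, 64, 64])
```

The sketch keeps the attention in the small token space (a handful of tokens per image) rather than over every pixel pair, which is what makes the approach cheaper than dense attention-based alternatives.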
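The reported metrics (precision, recall, F1-score, IoU, and OA) can all be derived from the binary confusion matrix of the change class. The helper below is a small sketch of one common way to compute them; it assumes binary maps with 1 = change and 0 = no change, and is not taken from the authors' evaluation code.

```python
import torch


def change_detection_metrics(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> dict:
    """Precision, recall, F1, IoU, and overall accuracy for the change (positive) class."""
    pred, target = pred.bool(), target.bool()
    tp = (pred & target).sum().float()    # changed pixels correctly detected
    fp = (pred & ~target).sum().float()   # false alarms
    fn = (~pred & target).sum().float()   # missed changes
    tn = (~pred & ~target).sum().float()  # unchanged pixels correctly kept

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    oa = (tp + tn) / (tp + tn + fp + fn + eps)
    return {"precision": precision.item(), "recall": recall.item(),
            "f1": f1.item(), "iou": iou.item(), "oa": oa.item()}


# Example usage with random binary maps in place of real predictions and labels.
print(change_detection_metrics(torch.randint(0, 2, (256, 256)),
                               torch.randint(0, 2, (256, 256))))
```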