Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis

16 Jul 2024 | Chenyang Liu, Keyan Chen, Haotian Zhang, Zipeng Qi, Zhengxia Zou, Member, IEEE, and Zhenwei Shi*, Senior Member, IEEE
The article introduces Change-Agent, an interactive agent for comprehensive remote sensing change interpretation and analysis. The system pairs a multi-level change interpretation (MCI) model, which serves as its "eyes," with a large language model (LLM), which serves as its "brain."
The MCI model has two branches: pixel-level change detection and semantic-level change captioning. A novel BI-temporal Iterative Interaction (BI3) layer, built from Local Perception Enhancement (LPE) and Global Difference Fusion Attention (GDFA) modules, helps the model capture discriminative features. The MCI model is trained on LEVIR-MCI, a large dataset of bi-temporal images with change detection masks and descriptive captions. Users can interactively request change interpretations from the Change-Agent, including change detection, change captioning, object counting, and change cause analysis. The LLM handles natural language understanding and interaction: it processes user instructions and returns tailored outputs. Experiments show that the MCI model achieves state-of-the-art results in both change detection and change captioning, offering a new approach to intelligent remote sensing applications and highlighting the potential of the Change-Agent for comprehensive surface change analysis. Because it provides both pixel-level and semantic-level interpretations, the system is a valuable tool for environmental monitoring, urban planning, and resource management. The dataset and codebase are publicly available for further research and development.
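To make the object counting capability concrete, here is a minimal sketch of how a count could be derived from a pixel-level change mask. The summary does not say how the Change-Agent counts objects, so the approach is an assumption: each 4-connected region of changed pixels is treated as one object. The function name `count_changed_objects` and the toy mask are hypothetical and are not taken from the released codebase.

```python
from collections import deque


def count_changed_objects(mask):
    """Count 4-connected regions of nonzero pixels in a 2D binary mask.

    Assumption: `mask` is a list of equal-length rows, where a nonzero
    value marks a changed pixel. Each connected region counts as one object.
    """
    if not mask:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New region found: flood-fill it with a breadth-first search.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Hypothetical toy mask: a 2x2 block, a vertical pair, and a single pixel.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
print(count_changed_objects(toy_mask))
```

On a real mask, touching buildings would merge into one region and noisy pixels would each count as an object, so a practical tool would likely need filtering by area or a per-class mask. Those steps are left out of this sketch.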