Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

20 Jan 2024 | Tao Chen, Yazhou Yao, Xingguo Huang, Zechao Li, Liqiang Nie and Jinhui Tang
This paper proposes spatial structure constraints (SSC) for weakly supervised semantic segmentation to address object over-activation, a side effect of expanding class activation maps (CAMs). The main idea is to constrain the activation to the object area so that it does not intrude into the background region. The approach has two key modules: a CAM-driven reconstruction module and an activation self-modulation module. The reconstruction module reconstructs the input image directly from CAM features, which preserves the coarse spatial structure of the image content. The activation self-modulation module refines the CAMs with finer spatial structure details by enhancing regional consistency. The approach is trained jointly with the classification network and can be plugged directly into existing networks. Evaluated on two widely used datasets, it achieves 72.7% mIoU on PASCAL VOC 2012 and 47.0% mIoU on COCO without relying on external saliency models, and shows superior performance compared to state-of-the-art approaches. The key contributions are the spatial structure constraints for weakly supervised semantic segmentation, the CAM-driven reconstruction module with a perceptual loss, and the activation self-modulation module with a reliable activation selection strategy.
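For readers unfamiliar with CAMs, the following is a minimal sketch of the standard class activation map computation (a classifier-weighted sum of the final feature maps), not the paper's SSC modules. The function names, array shapes, and the 0.3 threshold are illustrative assumptions. It shows why thresholding matters: any background pixel whose normalized activation exceeds the threshold is counted as object, which is the over-activation problem described above.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM: weight each feature channel by the classifier weight
    of the chosen class and sum over channels.

    features:   (K, H, W) final convolutional feature maps (assumed layout)
    fc_weights: (C, K) weights of the classification layer (assumed layout)
    """
    # Weighted sum over the K channels gives an (H, W) map.
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    cam = np.maximum(cam, 0.0)   # keep only positive evidence
    peak = cam.max()
    if peak > 0:
        cam = cam / peak         # normalize to [0, 1]
    return cam

def foreground_mask(cam, threshold=0.3):
    """Binarize a normalized CAM; the threshold value is an assumption."""
    return cam >= threshold
```

Lowering the threshold or expanding the CAM grows the mask, so without a spatial constraint the mask can spill past the object boundary.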
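The reported scores are mean intersection-over-union (mIoU) values. As a rough illustration of how that metric is commonly computed, here is a sketch based on a confusion matrix. The function name `mean_iou` and the default `ignore_index=255` (the usual void label in PASCAL VOC annotations) are assumptions, and benchmark toolkits typically accumulate the confusion matrix over the whole dataset rather than per call.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: integer label arrays of the same shape; pred values are
    assumed to lie in [0, num_classes).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    valid = target != ignore_index          # drop void pixels
    pred, target = pred[valid], target[valid]

    # conf[t, p] counts pixels with ground truth t predicted as p.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0                     # skip classes absent from both
    return float((intersection[present] / union[present]).mean())
```

An over-activated prediction raises the union for the object class without raising the intersection, so its IoU drops even when the object itself is fully covered.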