Multi-Label Image Recognition with Graph Convolutional Networks

7 Apr 2019 | Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo
This paper proposes ML-GCN, a multi-label image recognition framework based on Graph Convolutional Networks (GCN). The framework models label dependencies by building a directed graph over object labels, in which each node represents a label and is encoded by that label's word embedding. A GCN maps this label graph to a set of inter-dependent object classifiers, which are then applied to image descriptors extracted by a separate sub-network. The whole framework is end-to-end trainable.

The GCN-based mapping function shares its parameters across all classes, so gradients from every classifier flow back into the classifier generation function; this is how the model implicitly captures label dependencies. A re-weighted scheme is introduced to build an effective label correlation matrix, which guides information propagation among the GCN nodes. The scheme balances the weights between a node and its neighborhood, alleviating overfitting and over-smoothing.

Experiments on two multi-label image recognition datasets, MS-COCO and VOC 2007, show that ML-GCN outperforms existing state-of-the-art methods. Visualization analyses reveal that the learned classifiers maintain meaningful semantic structures. The main contributions are a novel end-to-end trainable framework for multi-label image recognition, an effective re-weighted scheme for correlation matrix construction, and an evaluation on two benchmark datasets showing superior performance.
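The pipeline described above can be illustrated with a small NumPy sketch. It is one plausible reading of the summary, not the authors' implementation: the summary gives no formulas, so the threshold `tau`, the neighbor weight `p`, the LeakyReLU slope, and the layer sizes are assumptions, and random vectors stand in for the word embeddings and the image descriptors. The correlation matrix is built from label co-occurrence as conditional probabilities, binarized, and then re-weighted so that each node keeps weight `1 - p` for itself and spreads `p` over its neighbors. As a simplification, that matrix is used directly for propagation with no further normalization.

```python
import numpy as np


def build_correlation_matrix(labels, tau=0.4, p=0.2):
    """Directed, re-weighted label correlation matrix from an (N, C) binary label matrix.

    tau (binarization threshold) and p (total neighbor weight) are assumed values.
    """
    labels = np.asarray(labels, dtype=np.float64)
    cooc = labels.T @ labels                  # cooc[i, j]: images containing both i and j
    counts = np.diag(cooc).copy()             # counts[i]: images containing label i
    np.fill_diagonal(cooc, 0.0)
    # cond[i, j] approximates P(label j | label i); this is asymmetric, hence a directed graph.
    cond = cooc / np.maximum(counts, 1.0)[:, None]
    binary = (cond >= tau).astype(np.float64)
    np.fill_diagonal(binary, 0.0)
    # Re-weighting: spread weight p over the neighbors, keep 1 - p on the node itself.
    n_neighbors = np.maximum(binary.sum(axis=1, keepdims=True), 1.0)
    adj = p * binary / n_neighbors
    np.fill_diagonal(adj, 1.0 - p)
    return adj


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def generate_classifiers(embeddings, adj, layer_weights):
    """Stacked graph convolutions mapping (C, d) label embeddings to (C, D) classifiers."""
    h = embeddings
    last = len(layer_weights) - 1
    for k, w in enumerate(layer_weights):
        h = adj @ h @ w                       # propagate along the label graph, then project
        if k < last:
            h = leaky_relu(h)                 # no non-linearity after the final layer
    return h


def predict(image_features, classifiers):
    """Apply the generated classifiers to (N, D) image descriptors; returns (N, C) scores."""
    logits = image_features @ classifiers.T
    return 1.0 / (1.0 + np.exp(-logits))      # independent sigmoid score per label


rng = np.random.default_rng(0)
toy_labels = np.array([[1, 1, 0],
                       [1, 1, 0],
                       [1, 0, 1],
                       [0, 0, 1]])
adj = build_correlation_matrix(toy_labels)
embeddings = rng.normal(size=(3, 8))          # stand-in for word embeddings
layer_weights = [rng.normal(scale=0.1, size=(8, 16)),
                 rng.normal(scale=0.1, size=(16, 32))]
classifiers = generate_classifiers(embeddings, adj, layer_weights)
features = rng.normal(size=(4, 32))           # stand-in for CNN image descriptors
scores = predict(features, classifiers)
print(adj)
print(scores.shape)
```

Because the same `layer_weights` produce every row of `classifiers`, a loss on any single label would update parameters that all classifiers share, which matches the parameter-sharing argument in the summary. Training itself is not shown; in an end-to-end setup the GCN weights and the image sub-network would be optimized jointly under a multi-label loss.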