Understanding Multi-Label Image Recognition With Graph Convolutional Networks

The paper presents a novel approach for multi-label image recognition using Graph Convolutional Networks (GCNs). The main challenge in multi-label image recognition is to model the dependencies between object labels, as objects often co-occur in images. The proposed model, called ML-GCN, builds a directed graph over the object labels, where each label is represented by word embeddings. GCNs are then learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to image descriptors extracted by another sub-net, enabling end-to-end training. A key contribution is the introduction of a re-weighted scheme to create an effective label correlation matrix, which guides information propagation among nodes in the GCN. This scheme helps alleviate overfitting and over-smoothing issues. Experiments on two multi-label image recognition datasets, MS-COCO and VOC 2007, show that the proposed method outperforms existing state-of-the-art methods. Visualization analyses further demonstrate that the learned classifiers maintain meaningful semantic topology.The paper presents a novel approach for multi-label image recognition using Graph Convolutional Networks (GCNs). The main challenge in multi-label image recognition is to model the dependencies between object labels, as objects often co-occur in images. The proposed model, called ML-GCN, builds a directed graph over the object labels, where each label is represented by word embeddings. GCNs are then learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to image descriptors extracted by another sub-net, enabling end-to-end training. A key contribution is the introduction of a re-weighted scheme to create an effective label correlation matrix, which guides information propagation among nodes in the GCN. This scheme helps alleviate overfitting and over-smoothing issues. Experiments on two multi-label image recognition datasets, MS-COCO and VOC 2007, show that the proposed method outperforms existing state-of-the-art methods. Visualization analyses further demonstrate that the learned classifiers maintain meaningful semantic topology.

Multi-Label Image Recognition with Graph Convolutional Networks

7 Apr 2019 | Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo