This paper proposes f-CLSWGAN, a generative adversarial network (GAN) for feature generation in zero-shot learning (ZSL) and generalized zero-shot learning (GZSL). The main challenge in both settings is the extreme imbalance between seen and unseen classes: conventional methods struggle because unseen classes lack labeled examples. To address this, f-CLSWGAN synthesizes CNN features conditioned on class-level semantic information, learning a direct mapping from semantic descriptors to class-conditional feature distributions. The model pairs a Wasserstein GAN with a classification loss so that the generated CNN features are discriminative enough to train classifiers such as softmax or multimodal embedding methods.
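As a rough illustration of how a Wasserstein term and a classification term can be combined, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the weighting parameter `beta`, and its default value are assumptions made for this example, and the gradient penalty typically used to stabilize Wasserstein GAN training is left out for brevity.

```python
import math


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss to minimize: mean fake score minus mean real score.

    Assumption: the gradient penalty term is omitted in this sketch.
    """
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


def classification_loss(class_probs, labels):
    """Mean negative log-likelihood of the true class for each synthesized feature.

    class_probs[i][k] is the probability a pretrained classifier assigns to
    class k for the i-th generated feature; labels[i] is its conditioning class.
    """
    nll = [-math.log(probs[y]) for probs, y in zip(class_probs, labels)]
    return sum(nll) / len(nll)


def generator_loss(fake_scores, class_probs, labels, beta=0.01):
    """Generator objective: fool the critic and stay classifiable.

    beta is a hypothetical weighting hyperparameter for this sketch.
    """
    wgan_term = -sum(fake_scores) / len(fake_scores)
    return wgan_term + beta * classification_loss(class_probs, labels)
```

The classification term penalizes generated features that a classifier trained on seen classes cannot assign to their conditioning class, which is what pushes the generator toward discriminative rather than merely realistic features.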
The proposed approach significantly improves accuracy on five challenging datasets (CUB, FLO, SUN, AWA, and ImageNet) in both ZSL and GZSL settings. It generalizes across deep CNN architectures, such as GoogLeNet and ResNet, and accepts various forms of class-level auxiliary information, including attributes, sentences, and word2vec embeddings. The experiments show that generating CNN features for unseen classes makes it possible to train a softmax classifier effectively for GZSL.
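To make the last point concrete, the sketch below shows how synthesized unseen-class features might be merged with real seen-class features to form a training set for a softmax classifier. Everything here is illustrative: `toy_generator` is a hypothetical stand-in for a trained conditional generator (it simply perturbs the class embedding, so the feature dimension equals the embedding dimension), and the function names and parameters are assumptions rather than anything taken from the paper.

```python
import random


def toy_generator(noise, class_embedding):
    """Hypothetical stand-in for a trained generator G(z, c).

    A real generator would be a neural network; this one only adds scaled
    noise to the class embedding so the example stays self-contained.
    """
    return [c + 0.1 * z for c, z in zip(class_embedding, noise)]


def synthesize_features(class_embeddings, n_per_class, rng):
    """Draw n_per_class synthetic features for every class in class_embeddings."""
    features, labels = [], []
    for label, embedding in class_embeddings.items():
        for _ in range(n_per_class):
            noise = [rng.gauss(0.0, 1.0) for _ in embedding]
            features.append(toy_generator(noise, embedding))
            labels.append(label)
    return features, labels


def build_gzsl_training_set(seen_features, seen_labels, unseen_embeddings,
                            n_per_class, seed=0):
    """Combine real seen-class features with synthetic unseen-class features."""
    rng = random.Random(seed)
    fake_features, fake_labels = synthesize_features(
        unseen_embeddings, n_per_class, rng)
    return seen_features + fake_features, seen_labels + fake_labels


if __name__ == "__main__":
    x, y = build_gzsl_training_set(
        seen_features=[[0.0, 0.0], [1.0, 1.0]],
        seen_labels=["cat", "dog"],
        unseen_embeddings={"zebra": [5.0, 5.0], "whale": [-5.0, -5.0]},
        n_per_class=3,
    )
    print(len(x), sorted(set(y)))
```

Because the combined set contains examples for every class, a standard softmax classifier can be trained over seen and unseen labels together, instead of relying only on a compatibility function between images and class embeddings.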
The paper also compares several GAN variants and a competing generative model, the generative moment matching network (GMMN), for visual feature generation, and reports that f-CLSWGAN outperforms existing methods in both ZSL and GZSL. The generated CNN features are of high enough quality to train classifiers, which makes the method a flexible and strong technique for handling data scarcity in zero-shot learning. The paper further highlights the importance of class embeddings, such as attributes and sentences, for both feature generation and classification performance. Overall, the approach is a promising solution for zero-shot learning, particularly when labeled data for unseen classes is limited.