5 April 2024 | Xiaoxu Li, Zhen Li, Jiyang Xie, Xiaochen Yang, Jing-Hao Xue, Zhanyu Ma
The paper introduces a self-reconstruction network (SRN) for fine-grained few-shot classification that targets overfitting and limited feature diversity. The SRN includes a self-reconstruction metric module that reconstructs query features from both support features and self-reconstructed query features, which strengthens the representation capability of the learned feature space. A restrained cross-entropy loss is also introduced to discourage overconfident predictions, further mitigating overfitting. Experiments on five benchmark datasets show that the proposed method achieves state-of-the-art performance on both 5-way 1-shot and 5-way 5-shot classification tasks.
The method's stability and generalization are also examined through several analyses, including visualizations of validation loss, classification probabilities, discriminative feature regions, and feature embeddings. Its computational complexity is compared with that of the feature reconstruction network (FRN), showing only marginal increases in training time and parameter count.
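
To make the idea of a reconstruction-based metric more concrete, the following is a minimal sketch of the general technique that FRN-style methods build on: a query feature is reconstructed from one class's support features by ridge regression, and the reconstruction error serves as the distance to that class. This is an illustration under stated assumptions, not the authors' implementation. The function names, the regularization weight `lam`, and the toy feature shapes are all hypothetical, and the sketch covers neither the self-reconstruction of query features nor the restrained cross-entropy loss, whose exact forms are not given in this summary.

```python
import numpy as np


def reconstruct(query, support, lam=0.1):
    """Reconstruct query rows as a ridge-regularized combination of support rows.

    query:   (n_q, d) array of query feature vectors
    support: (n_s, d) array of one class's support feature vectors
    lam:     ridge penalty (hypothetical default, not taken from the paper)
    """
    n_s = support.shape[0]
    reg = support @ support.T + lam * np.eye(n_s)  # (n_s, n_s), symmetric
    # Closed form: W = Q S^T (S S^T + lam I)^-1, computed via a linear solve.
    weights = np.linalg.solve(reg, support @ query.T).T  # (n_q, n_s)
    return weights @ support  # (n_q, d)


def reconstruction_distance(query, support, lam=0.1):
    """Mean squared reconstruction error of the query features for one class."""
    recon = reconstruct(query, support, lam)
    return float(np.mean(np.sum((query - recon) ** 2, axis=1)))


def predict(query, class_supports, lam=0.1):
    """Return the index of the class whose support set reconstructs the query best."""
    distances = [reconstruction_distance(query, s, lam) for s in class_supports]
    return int(np.argmin(distances))


if __name__ == "__main__":
    # Toy 3-dimensional features: class 0 spans the first two axes, class 1 the third.
    support_0 = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    support_1 = np.array([[0.0, 0.0, 1.0]])
    query = np.array([[1.0, 1.0, 0.0]])
    print(predict(query, [support_0, support_1], lam=0.01))
```

In this toy setup the query lies in the span of class 0's support features, so its reconstruction error for class 0 is close to zero while class 1 cannot reconstruct it at all. In practice the distances would typically be negated and passed through a softmax to obtain class probabilities for training.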