Zero-Shot Learning Through Cross-Modal Transfer

20 Mar 2013 | Richard Socher, Milind Ganjoo, Hamsa Sridhar, Osbert Bastani, Christopher D. Manning, Andrew Y. Ng
This paper introduces a model that can recognize objects in images even when no training images exist for those objects, relying solely on large unsupervised text corpora. The model leverages distributional information in language to learn what objects look like, achieving state-of-the-art performance on known classes and reasonable performance on unseen classes. The key contributions are:

1. **Semantic Space Mapping**: Images are mapped into a semantic space of word vectors learned by a neural network, capturing distributional similarities from text corpora.
2. **Outlier Detection**: An outlier detection probability determines whether a new image lies on the manifold of known categories or belongs to an unseen category.
3. **Probabilistic Model**: A Bayesian framework integrates the probability of an image being an outlier with the probabilities of the known categories, allowing joint zero-shot and standard image classification.

The model does not require manually defined semantic features and can classify both seen and unseen classes, outperforming previous zero-shot learning models. Experiments on the CIFAR-10 dataset show accuracies of up to 80% on known classes and 15–30% on unseen classes, and performance improves with appropriate threshold settings for outlier detection.
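The two-stage decision described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the class names, dimensions, and random linear mapping are hypothetical, and the paper's outlier probability (based on Gaussians fit to mapped training points) is replaced here by a simple distance threshold in the semantic space.

```python
import numpy as np

# Hypothetical word vectors for seen and unseen classes (names and
# dimensionality are illustrative, not the paper's actual embeddings).
rng = np.random.default_rng(0)
dim = 50
seen_classes = ["cat", "dog", "ship"]
unseen_classes = ["truck"]
word_vecs = {c: rng.normal(size=dim) for c in seen_classes + unseen_classes}

# Stand-in for a trained image-to-semantic-space mapping; the paper learns
# this mapping with a neural network, here it is a random linear projection.
W = rng.normal(size=(dim, 128))

def map_to_semantic_space(x):
    return W @ x

def classify(x, threshold=2.0):
    """Seen/unseen decision in the spirit of the paper's outlier model:
    if the mapped image is close enough to some seen class's word vector,
    classify among seen classes; otherwise treat it as an outlier and
    pick the nearest unseen class's word vector."""
    z = map_to_semantic_space(x)
    dists = {c: np.linalg.norm(z - word_vecs[c]) for c in seen_classes}
    nearest_seen, d = min(dists.items(), key=lambda kv: kv[1])
    if d < threshold:  # likely on the manifold of known categories
        return nearest_seen
    # Outlier: zero-shot classification by nearest unseen word vector.
    return min(unseen_classes, key=lambda c: np.linalg.norm(z - word_vecs[c]))
```

The threshold plays the role of the outlier-detection cutoff discussed in the experiments: a high threshold routes everything to the seen classifier (high seen accuracy, zero unseen accuracy), while a low threshold routes everything to the zero-shot path, which is the trade-off the paper's Bayesian framework balances probabilistically.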