12 Jul 2016 | Muhammad Ghifary*, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li
This paper proposes a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition. The proposed model, called the Deep Reconstruction-Classification Network (DRCN), jointly learns a shared encoding representation for two tasks: supervised classification of labeled source data and unsupervised reconstruction of unlabeled target data. The architecture consists of two pipelines, one for label prediction and one for data reconstruction, with shared encoding parameters, and the whole model is optimized with backpropagation like a standard neural network.

DRCN is evaluated on a range of cross-domain object recognition tasks, including SVHN, MNIST, USPS, CIFAR, and STL, as well as the Office dataset. It outperforms existing state-of-the-art domain adaptation algorithms, including ReverseGrad and SCAE, achieving up to an 8% accuracy improvement and a considerable gain over the pretraining-finetuning strategy. The reconstruction pipeline transforms images from the source domain into images resembling the target domain, suggesting that the model learns a composite representation that encodes both the structure of target images and the classification of source images. The paper also discusses the effectiveness of data augmentation and denoising in further improving DRCN's performance.

Finally, the paper provides a formal analysis of the algorithm's objective in the context of domain adaptation. The analysis relates the DRCN learning objective to semi-supervised learning and indicates that using unlabeled source data may not further improve domain adaptation. The paper concludes that DRCN is a promising approach for unsupervised domain adaptation in object recognition.
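The joint objective described above, a shared encoder feeding both a source classifier and a target-data decoder, can be sketched roughly as follows. This is a minimal numpy illustration under stated assumptions, not the paper's implementation: the actual DRCN uses convolutional encoders and decoders on images, and all dimensions, weight shapes, and the mixing weight `lam` here are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; DRCN uses convolutional layers on images).
d_in, d_hid, n_classes = 16, 8, 3

# Shared encoder parameters, used by BOTH pipelines.
W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))
# Pipeline 1: label-prediction head, trained on labeled source data.
W_cls = rng.normal(scale=0.1, size=(d_hid, n_classes))
# Pipeline 2: reconstruction head (decoder), trained on unlabeled target data.
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))

def encode(x):
    # Shared representation consumed by both the classifier and the decoder.
    return np.tanh(x @ W_enc)

def classify(x):
    # Softmax class probabilities from the shared encoding.
    z = encode(x) @ W_cls
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def reconstruct(x):
    # Linear decoder from the shared encoding back to input space.
    return encode(x) @ W_dec

def drcn_loss(x_src, y_src, x_tgt, lam=0.7):
    # Cross-entropy of the classifier on labeled source examples.
    p = classify(x_src)
    l_cls = -np.mean(np.log(p[np.arange(len(y_src)), y_src] + 1e-12))
    # Squared reconstruction error on unlabeled target examples.
    l_rec = np.mean((reconstruct(x_tgt) - x_tgt) ** 2)
    # Convex combination of the two losses; because W_enc appears in both
    # terms, gradients from both tasks update the shared encoder.
    return lam * l_cls + (1.0 - lam) * l_rec

x_src = rng.normal(size=(4, d_in))
y_src = np.array([0, 1, 2, 1])
x_tgt = rng.normal(size=(5, d_in))
loss = drcn_loss(x_src, y_src, x_tgt)
```

In practice both terms would be minimized jointly with backpropagation, alternating or mixing minibatches from the labeled source set and the unlabeled target set; setting `lam=1.0` recovers plain supervised source training, which is the baseline the summary says DRCN improves upon.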