21 May 2016 | Mingsheng Long, Jianmin Wang, Michael I. Jordan
This paper addresses deep transfer learning in scenarios where the joint distributions of features and labels may differ substantially across domains. It proposes a joint distribution discrepancy (JDD) that directly compares these joint distributions, avoiding the usual marginal-conditional factorization. The method is applied to deep convolutional networks, where dataset shift can arise in multiple task-specific feature layers as well as in the classifier layer. A family of joint adaptation networks (JANs) is designed to minimize the JDD, adapting these layers across domains. The networks are trained efficiently with back-propagation and achieve state-of-the-art results on standard domain adaptation benchmarks. The paper also discusses why joint distribution adaptation matters and how JANs learn more transferable features and classifiers.
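To make the idea concrete, here is a minimal PyTorch sketch of one way such a joint discrepancy can be estimated over several adapted layers: per-layer Gaussian kernels are multiplied element-wise so that the comparison is over the joint distribution of all layer activations, rather than each layer's marginal separately. The Gaussian kernel, the bandwidth, and the layer choices below are illustrative assumptions, not the authors' exact construction.

```python
import torch


def gaussian_kernel(x, y, bandwidth=1.0):
    """Pairwise Gaussian kernel matrix between two batches of activations."""
    dist2 = torch.cdist(x, y, p=2) ** 2
    return torch.exp(-dist2 / (2.0 * bandwidth ** 2))


def joint_discrepancy(source_layers, target_layers, bandwidth=1.0):
    """Estimate a joint-distribution discrepancy between source and target batches.

    source_layers / target_layers: lists of activations from the adapted layers
    (e.g., the last feature layer and the classifier's softmax output), each of
    shape (batch, dim). Per-layer kernels are multiplied so the statistic
    compares the joint distribution across all adapted layers.
    """
    n_s, n_t = source_layers[0].shape[0], target_layers[0].shape[0]
    k_ss = source_layers[0].new_ones(n_s, n_s)
    k_tt = target_layers[0].new_ones(n_t, n_t)
    k_st = source_layers[0].new_ones(n_s, n_t)
    for zs, zt in zip(source_layers, target_layers):
        k_ss = k_ss * gaussian_kernel(zs, zs, bandwidth)
        k_tt = k_tt * gaussian_kernel(zt, zt, bandwidth)
        k_st = k_st * gaussian_kernel(zs, zt, bandwidth)
    # Standard MMD-style estimate: within-domain similarity minus cross-domain similarity.
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


# Hypothetical usage: add this term to the source classification loss and
# back-propagate through both, as the paper does for its adapted layers.
feat_s = torch.randn(32, 256)
prob_s = torch.softmax(torch.randn(32, 10), dim=1)
feat_t = torch.randn(32, 256)
prob_t = torch.softmax(torch.randn(32, 10), dim=1)
transfer_loss = joint_discrepancy([feat_s, prob_s], [feat_t, prob_t])
```

Because the whole estimate is differentiable in the network activations, it can be minimized jointly with the classification loss by ordinary back-propagation, which is what allows the feature layers and the classifier layer to be adapted together.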