2014 | Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson
This paper investigates the transferability of features in deep neural networks, focusing on how general or specific the features learned at each layer are. The authors train convolutional neural networks on ImageNet and measure how well features learned on one task transfer to another. They find that first-layer features are general, applying across many tasks, while last-layer features are specific to the original task; the transition from general to specific happens gradually over the intermediate layers rather than at any single layer.
The study shows that transferring features from lower layers tends to work better than transferring from higher layers, and that transferability degrades as the base and target tasks become more dissimilar, with the degradation most pronounced for the higher, more task-specific layers. Even so, features transferred from a distant task still outperform randomly initialized weights. Additionally, initializing a network with transferred features from almost any layer yields a boost to generalization that persists even after fine-tuning on the target task.
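To make the transfer procedure concrete, here is a minimal sketch (not the authors' released code) of copying the first n layers of a network trained on the base task into a new network and either freezing them or fine-tuning them on the target task. It assumes PyTorch, treats both networks as plain nn.Sequential stacks, and the helper name build_transfer_net is hypothetical.

```python
import copy
import torch.nn as nn

def build_transfer_net(base_net: nn.Sequential, fresh_net: nn.Sequential,
                       n_copy: int, fine_tune: bool) -> nn.Sequential:
    """Copy the first n_copy layers of base_net into fresh_net.

    fine_tune=False keeps the transferred layers frozen (the frozen-transfer
    setting); fine_tune=True lets them continue learning on the target task.
    """
    layers = []
    for i in range(len(fresh_net)):
        if i < n_copy:
            layer = copy.deepcopy(base_net[i])   # weights learned on the base task
            if not fine_tune:
                for p in layer.parameters():
                    p.requires_grad = False      # freeze transferred features
        else:
            layer = fresh_net[i]                 # higher layers stay randomly initialized
        layers.append(layer)
    return nn.Sequential(*layers)
```

When training the resulting network on the target task, only the parameters with requires_grad=True need to be handed to the optimizer; the frozen layers then act as a fixed feature extractor.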
The results highlight two issues that hurt transferability: (1) the specialization of higher-layer neurons to their original task, and (2) optimization difficulties caused by splitting the network between co-adapted neurons. Either issue can dominate, depending on whether features are transferred from the bottom, middle, or top of the network. The study provides a framework for quantifying how general or specific the features at each layer are, and shows that transferring features can significantly improve performance even when the target dataset is large. The findings have important implications for transfer learning and the design of deep neural networks.
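As a rough illustration of how such a layer-wise quantification could be run (a hypothetical sketch, not the paper's experimental code), the loop below sweeps over the cut layer n, builds frozen and fine-tuned transfer networks with the build_transfer_net helper sketched above, and records the accuracy drop relative to a baseline trained from scratch. The names make_fresh_net, train_on_target, evaluate, target_val_loader, and baseline_acc are assumed helpers and values, not part of any real library.

```python
# Hypothetical layer-by-layer transfer sweep; all helpers are assumptions.
num_layers = 8                     # e.g. an AlexNet-style stack of 8 layers
results = {}
for n in range(1, num_layers + 1):
    for fine_tune in (False, True):
        net = build_transfer_net(base_net, make_fresh_net(), n, fine_tune)
        train_on_target(net)                           # standard SGD on the target task
        acc = evaluate(net, target_val_loader)
        results[(n, fine_tune)] = baseline_acc - acc   # larger drop => layer-n features
                                                       # are more specific to the base task
```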