11 Jun 2018 | Jaehong Yoon, Eunho Yang, Jeongtae Lee, Sung Ju Hwang
This paper proposes a novel deep network architecture for lifelong learning called the Dynamically Expandable Network (DEN). DEN dynamically adjusts its capacity as it trains on a sequence of tasks, learning a compact, overlapping knowledge-sharing structure among tasks. It is trained efficiently in an online manner through selective retraining, expands its capacity on the arrival of each new task with only the necessary number of units, and prevents semantic drift by splitting/duplicating units and timestamping them. The method is validated on multiple public datasets under lifelong learning scenarios, where it not only significantly outperforms existing lifelong learning methods for deep networks but also matches the performance of batch-trained counterparts with substantially fewer parameters. Further, the network obtained by fine-tuning on all tasks performs significantly better than the batch models, showing that DEN can estimate a near-optimal network structure even when all tasks are available from the beginning.
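To make the selective retraining step concrete, here is a minimal PyTorch sketch. It is not the authors' code: the layer sizes, learning rates, L1 coefficient, and selection threshold are all illustrative, and the paper's breadth-first identification of the affected subnetwork is approximated by gradient masking on a single hidden layer. The idea is that a new task first fits a sparse output head, and only the hidden units it actually connects to are then retrained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x, y = torch.randn(64, 8), torch.randint(0, 2, (64, 1)).float()
bce = nn.BCEWithLogitsLoss()

# Network body trained on tasks 1..t-1, plus a fresh output head for task t.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 16), nn.ReLU())
head = nn.Linear(16, 1)

# Stage 1: freeze the body and fit only the new head with an L1 penalty,
# so task t connects sparsely to the existing hidden units.
net.requires_grad_(False)
opt = torch.optim.Adam(head.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = bce(head(net(x)), y) + 1e-3 * head.weight.abs().sum()
    loss.backward()
    opt.step()

# Units task t actually uses: those with a nonzero incoming head weight.
selected = head.weight.abs().squeeze(0) > 1e-3  # bool mask over 16 units

# Stage 2: retrain only the affected subnetwork. The paper walks the graph
# breadth-first to find it; here we approximate by unfreezing the last
# hidden layer and masking gradients so only selected units are updated.
net[2].requires_grad_(True)
opt = torch.optim.Adam(list(net[2].parameters()) + list(head.parameters()), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = bce(head(net(x)), y)
    loss.backward()
    net[2].weight.grad *= selected.unsqueeze(1).float()  # rows = output units
    net[2].bias.grad *= selected.float()
    opt.step()
```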
The main challenges in incremental deep learning with selective parameter sharing and dynamic layer expansion are threefold: achieving scalability and efficiency in training, deciding when to expand the network and how many neurons to add, and preventing semantic drift. To address these challenges, the paper proposes a novel deep network model together with an efficient and effective incremental learning algorithm, DEN. DEN maximally reuses the network learned on all previous tasks to learn the new task efficiently, while dynamically increasing capacity by adding or splitting/duplicating neurons when necessary. The method applies to any generic deep network, including convolutional networks.
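Since deciding when to expand and by how much is central to the method, a simplified sketch may help. The snippet below, again schematic rather than the authors' implementation, widens a single hidden layer by k candidate units only when the new task's loss stays above a threshold tau (both values illustrative), and applies a group-Lasso penalty to each new unit's incoming weights so that useless candidates shrink to zero and can be pruned:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x, y = torch.randn(64, 8), torch.randint(0, 2, (64, 1)).float()
bce = nn.BCEWithLogitsLoss()

hidden, head = nn.Linear(8, 16), nn.Linear(16, 1)

def forward():
    return head(torch.relu(hidden(x)))

tau, k = 0.3, 4  # loss threshold and unit budget, both illustrative

if bce(forward(), y).item() > tau:
    # Widen the hidden layer by k candidate units, copying the old weights.
    wide_hidden, wide_head = nn.Linear(8, 16 + k), nn.Linear(16 + k, 1)
    with torch.no_grad():
        wide_hidden.weight[:16] = hidden.weight
        wide_hidden.bias[:16] = hidden.bias
        wide_head.weight[:, :16] = head.weight
        wide_head.bias.copy_(head.bias)
    hidden, head = wide_hidden, wide_head

    opt = torch.optim.Adam(list(hidden.parameters()) + list(head.parameters()), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        # Group Lasso on each new unit's incoming weights drives useless
        # candidates to zero so they can be pruned after training.
        penalty = hidden.weight[16:].norm(dim=1).sum()
        loss = bce(forward(), y) + 1e-3 * penalty
        loss.backward()
        hidden.weight.grad[:16] = 0  # old units stay fixed at this stage
        hidden.bias.grad[:16] = 0
        opt.step()

    keep = hidden.weight[16:].norm(dim=1) > 1e-3  # prune all-zero candidates
```

The group penalty is what lets the network grow by "only the necessary number of units": candidates the new task does not need are driven to zero and removed.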
The paper validates the incremental deep neural network for lifelong learning on multiple public datasets, where it achieves similar or better performance than a model that trains a separate network per task while using only 11.9%–60.3% of its parameters. Further, fine-tuning the learned network on all tasks improves performance further, outperforming the batch model by 0.05%–4.8%. The model can therefore also be used for structure estimation, obtaining a strong trade-off between performance and network capacity even in the batch setting where all tasks are available from the start.
The paper also reviews related work in lifelong learning, including methods for preventing catastrophic forgetting and for dynamic network expansion. It presents the incremental learning algorithm for DEN, which consists of selective retraining, dynamic network expansion, and network split/duplication, and validates it on multiple datasets, where DEN outperforms existing methods in both performance and efficiency. The results demonstrate that DEN effectively prevents semantic drift and can dynamically find its optimal capacity. The paper concludes that DEN can serve both as a lifelong learning model and as a tool for network structure estimation.
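As a concrete illustration of the split/duplication step, the sketch below duplicates any hidden unit whose incoming weights drifted too far during task t and timestamps the copy. It is schematic: the drift threshold sigma is illustrative, the duplication of the next layer's outgoing weights is omitted, and the assignment of the two copies to old and new tasks follows one plausible reading of the paper's description.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
t, sigma = 3, 0.02  # current task index and drift threshold, both illustrative

layer_prev = nn.Linear(8, 16)            # hidden layer as it was after task t-1
layer_curr = copy.deepcopy(layer_prev)   # the same layer after training on task t
with torch.no_grad():                    # stand-in for the task-t weight update
    layer_curr.weight += 0.05 * torch.randn_like(layer_curr.weight)

# Semantic drift of each unit: L2 distance between its incoming weights
# before and after training on task t.
drift = (layer_curr.weight - layer_prev.weight).norm(dim=1)
split = drift > sigma  # units whose learned feature changed too much

with torch.no_grad():
    # Split: the original slot is restored to its pre-task weights so earlier
    # tasks see the feature they were trained on, and a copy holding the
    # drifted weights is appended for the new task. (The next layer's
    # corresponding incoming weights must be duplicated too; omitted here.)
    restored_w = layer_curr.weight.clone()
    restored_w[split] = layer_prev.weight[split]
    restored_b = layer_curr.bias.clone()
    restored_b[split] = layer_prev.bias[split]
    new_weight = torch.cat([restored_w, layer_curr.weight[split]])
    new_bias = torch.cat([restored_b, layer_curr.bias[split]])

# Timestamping: appended units are stamped with the current task index, and a
# task's head only reads units stamped no later than its own training time,
# so older tasks' predictions are unaffected by the split.
timestamps = torch.cat([torch.zeros(16, dtype=torch.long),
                        torch.full((int(split.sum()),), t, dtype=torch.long)])
```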