12 Dec 2017 | Hanul Shin, Jung Kwon Lee, Jaehong Kim, Jiwon Kim
This paper proposes a novel framework called Deep Generative Replay (DGR) for continual learning in deep neural networks. The framework is inspired by the generative role of the hippocampus as a short-term memory system in primate brains. DGR consists of a dual-model architecture: a deep generative model (the generator) and a task-solving model (the solver). The generator is trained to mimic past data, while the solver is trained to perform tasks. Together they produce pseudo-data, generated inputs paired with the solver's predicted labels, that can be used to train the solver on new tasks without requiring access to past data.
The key idea of DGR is that the generator can produce synthetic data that closely matches the distribution of past data, allowing the solver to learn new tasks while retaining knowledge from previous tasks. This is achieved by interleaving generated data with new task data during training. The generator is trained within the generative adversarial network (GAN) framework to produce realistic samples, while the solver is trained on a mixture of real and generated data.
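The interleaving described above can be sketched as a single solver update in PyTorch. This is a minimal illustration, not the authors' implementation: the function name `dgr_solver_step`, the frozen `old_generator`/`old_solver` copies, and the mixing ratio `r` are all illustrative assumptions.

```python
# Hypothetical sketch of one Deep Generative Replay solver update.
# Assumes `old_generator` and `old_solver` are frozen copies saved
# after the previous task; `r` weights new-task data vs. replay.
import torch
import torch.nn.functional as F


def dgr_solver_step(solver, optimizer, new_x, new_y,
                    old_generator=None, old_solver=None, r=0.5):
    """Train the solver on new-task data mixed with generative replay."""
    optimizer.zero_grad()

    # Loss on the current task's real data.
    loss = r * F.cross_entropy(solver(new_x), new_y)

    if old_generator is not None:
        with torch.no_grad():
            # Replay: sample pseudo-inputs from the previous generator
            # and label them with the previous solver's predictions.
            z = torch.randn(new_x.size(0), old_generator.latent_dim)
            replay_x = old_generator(z)
            replay_y = old_solver(replay_x).argmax(dim=1)
        # Loss on replayed pseudo-data preserves old-task knowledge.
        loss = loss + (1 - r) * F.cross_entropy(solver(replay_x), replay_y)

    loss.backward()
    optimizer.step()
    return loss.item()
```

For the first task there is no previous generator, so the replay term is skipped and the solver trains on real data alone; after each task, the generator and solver are snapshotted to serve as the replay source for the next one.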
The framework is tested on several sequential learning settings involving image classification tasks. The results show that DGR effectively prevents catastrophic forgetting, the phenomenon in which a model's performance on previously learned tasks degrades when it is trained on new ones. The framework maintains performance on previous tasks while learning new ones, even when the solver's network configuration changes between tasks.
The paper also compares DGR with other approaches to continual learning, such as elastic weight consolidation (EWC) and Learning without Forgetting (LwF), and shows that DGR outperforms these methods on both new and old tasks. The framework is particularly effective when access to past data is restricted, since it does not require storing or replaying actual past examples.
The paper concludes that DGR is a promising approach for continual learning in deep neural networks. It allows models to learn multiple tasks sequentially without forgetting previous knowledge, and it is particularly effective in scenarios where access to past data is limited. The framework is also scalable and can be applied to a wide range of tasks, as long as the trained generator can reliably reproduce the input space.