13 Sep 2022 | David Lopez-Paz and Marc'Aurelio Ranzato
The paper addresses the challenge of continual learning, where a model must learn from a sequence of tasks without forgetting previously acquired knowledge. It introduces a set of metrics for evaluating models over a continuum of data, covering both test accuracy and knowledge transfer between tasks. The proposed model, Gradient Episodic Memory (GEM), alleviates forgetting while still allowing beneficial knowledge transfer. GEM keeps an episodic memory of examples from each task and constrains updates so that learning a new task does not degrade performance on previous tasks. Experiments on MNIST and CIFAR-100 variants show that GEM outperforms strong baselines, with minimal forgetting and positive backward transfer. The paper also discusses the role of task descriptors and suggests future directions for improving memory management and reducing computation time.
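To make the constraint concrete, here is a minimal numpy sketch of the idea: the gradient on the current batch is compared against gradients computed on the stored examples of earlier tasks, and any conflicting component is removed before the update. The function name, the one-constraint-at-a-time projection scheme, and the toy vectors below are illustrative assumptions, not the paper's implementation, which solves a small quadratic program over all task constraints jointly.

```python
import numpy as np

def project_gradient(g, memory_grads, eps=1e-12):
    """Adjust the current gradient g so it does not conflict with gradients
    computed on the episodic memory of earlier tasks.

    Simplified sketch: constraints are handled one at a time via vector
    projection, rather than the joint quadratic program used in GEM.
    """
    g = g.copy()
    for g_k in memory_grads:
        dot = g @ g_k
        if dot < 0:  # proposed update would increase the loss on that task
            # remove the conflicting component (standard vector projection)
            g = g - (dot / (g_k @ g_k + eps)) * g_k
    return g

# Toy usage: a two-parameter model and one stored-task gradient that conflicts.
g_current = np.array([1.0, -1.0])
g_memory = [np.array([0.0, 1.0])]             # gradient on stored task-0 examples
print(project_gradient(g_current, g_memory))  # -> [1. 0.]: conflict removed
```

The key property this sketch preserves is that the adjusted gradient has a non-negative inner product with each stored-task gradient, which is the condition GEM uses to guarantee that updates do not hurt previous tasks (to first order).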