14 Aug 2018 | Arslan Chaudhry*, Puneet K. Dokania*, Thalaiyasingam Ajanthan*, Philip H. S. Torr
This paper addresses the challenges of incremental learning (IL), particularly the issues of catastrophic forgetting and intransigence. The authors introduce two new metrics, "forgetting" and "intransigence," to quantitatively evaluate IL algorithms. They propose RWalk, a generalization of Elastic Weight Consolidation (EWC) and Path Integral (PI), which combines KL-divergence-based regularization with parameter importance scores and strategies to obtain representative samples from previous tasks. RWalk is evaluated on the MNIST and CIFAR-100 datasets, showing superior performance in terms of accuracy and a better trade-off between forgetting and intransigence compared to existing methods.

The paper also discusses the practicality of single-head and multi-head evaluation settings, the probabilistic interpretation of neural network outputs, and the connection between KL-divergence and Riemannian manifold distances. Experimental results demonstrate that RWalk effectively mitigates intransigence and improves accuracy, even with a small subset of representative samples from previous tasks.
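The two metrics can be sketched concretely. Roughly following the paper's formulation, forgetting for an old task is the gap between the best accuracy ever achieved on it and its accuracy after the latest task, and intransigence is the gap between a jointly trained reference model's accuracy on the current task and the incremental learner's accuracy on it. The minimal sketch below assumes an accuracy matrix `acc` where `acc[i, j]` is the accuracy on task `j` after training through task `i + 1` (all names and the matrix layout are illustrative, not from the paper's code):

```python
import numpy as np

def average_forgetting(acc, k):
    """Average forgetting after training task k (1-indexed).

    acc[i, j] = accuracy on task j after training tasks 1..i+1.
    Forgetting for an old task j is the best accuracy ever reached
    on j (over earlier checkpoints) minus its accuracy after task k;
    the metric averages this over all previous tasks.
    """
    gaps = [acc[: k - 1, j].max() - acc[k - 1, j] for j in range(k - 1)]
    return float(np.mean(gaps))

def intransigence(acc, ref_acc, k):
    """Intransigence at task k (1-indexed).

    ref_acc[k-1] is the accuracy on task k of a reference model
    trained jointly on all data seen so far; intransigence measures
    how much the incremental learner falls short of it on task k.
    """
    return float(ref_acc[k - 1] - acc[k - 1, k - 1])
```

A low forgetting score with high intransigence indicates a model that preserves old tasks at the expense of learning new ones; the paper argues that a good IL method must trade these off, which is what RWalk's combination of regularization and stored samples targets.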