30 May 2024 | Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov
This paper addresses the challenging Cold Start scenario in exemplar-free class-incremental learning (EFCIL), in which too little data is available in the first task to learn a high-quality backbone. The authors propose Elastic Feature Consolidation (EFC), a method that regularizes feature drift in directions highly relevant to previous tasks and employs prototypes to reduce task-recency bias. EFC uses an Empirical Feature Matrix (EFM) to induce a pseudo-metric in feature space, which serves both to regularize feature drift and to update the Gaussian prototypes used in an asymmetric cross-entropy loss. Experiments on CIFAR-100, Tiny-ImageNet, ImageNet-Subset, and ImageNet-1K show that EFC significantly outperforms state-of-the-art methods in both Warm Start and Cold Start scenarios, maintaining model plasticity while effectively learning new tasks. The paper also includes an ablation study and a discussion of storage costs, showing that EFC can mitigate them with suitable approximations or proxies for the class covariance matrices.
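To make the two core ingredients concrete, here is a minimal PyTorch sketch of (1) a drift penalty weighted by an EFM-induced pseudo-metric and (2) sampling pseudo-features from per-class Gaussian prototypes. All names (`efm_drift_loss`, `sample_prototypes`) and the `damping` default are illustrative assumptions, not the authors' implementation.

```python
import torch

def efm_drift_loss(feat_new: torch.Tensor,
                   feat_old: torch.Tensor,
                   efm: torch.Tensor,
                   damping: float = 0.1) -> torch.Tensor:
    """Penalize feature drift under the EFM-induced pseudo-metric.

    feat_new, feat_old: (batch, d) features of the same inputs from the
        current and previous backbones.
    efm: (d, d) Empirical Feature Matrix estimated on previous tasks.
    damping: small isotropic term (hypothetical default) so that drift in
        directions the EFM deems unimportant is still weakly penalized.
    """
    drift = feat_new - feat_old  # (batch, d)
    metric = efm + damping * torch.eye(efm.size(0), device=efm.device)
    # Quadratic form drift^T (E + damping * I) drift, averaged over the batch.
    return torch.einsum('bi,ij,bj->b', drift, metric, drift).mean()

def sample_prototypes(means: list[torch.Tensor],
                      covs: list[torch.Tensor],
                      n_per_class: int = 8) -> torch.Tensor:
    """Draw pseudo-features for old classes from per-class Gaussians.

    means[i]: (d,) class mean; covs[i]: (d, d) positive-definite covariance
    (or a cheaper proxy, as the paper's storage discussion suggests).
    """
    samples = []
    for mu, cov in zip(means, covs):
        dist = torch.distributions.MultivariateNormal(mu, covariance_matrix=cov)
        samples.append(dist.sample((n_per_class,)))
    return torch.cat(samples)  # (n_classes * n_per_class, d)
```

In training, the sampled pseudo-features would be mixed with current-task features in the asymmetric cross-entropy loss, so old-class logits stay calibrated without storing exemplars; the drift loss meanwhile keeps the backbone plastic in directions the EFM marks as unimportant.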