2018 | Jonathan Schwarz, Jelena Luketina, Wojciech M. Czarnecki, Agnieszka Grabska-Barwinska, Yee Whye Teh, Razvan Pascanu, Raia Hadsell
The paper introduces the Progress & Compress (P&C) framework, a scalable and conceptually simple approach for continual learning where tasks are learned sequentially. The method aims to preserve performance on previously encountered tasks while accelerating learning progress on new problems. It achieves this by training a network with two components: a *knowledge base* capable of solving previously learned problems and an *active column* used to efficiently learn new tasks. After learning a new task, the active column is distilled into the knowledge base, protecting previously acquired skills. This cycle of active learning (progression) followed by consolidation (compression) requires no architecture growth, no access to or storage of previous data, and no task-specific parameters.
The approach is demonstrated on sequential classification of handwritten alphabets and reinforcement learning domains such as Atari games and 3D maze navigation. The paper also discusses related work and presents experiments to evaluate the method's performance in terms of catastrophic forgetting, positive transfer, and overall performance.
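The compress phase described above distills the active column into the knowledge base by matching their output distributions. The following is a minimal sketch of such a distillation objective, a temperature-softened KL divergence between the two columns' outputs; the function names and the example logits are illustrative, not taken from the paper's implementation (which additionally applies an EWC-style penalty to protect the knowledge base):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(active_logits, kb_logits, temperature=2.0):
    """KL(active || knowledge base): the active column acts as teacher,
    the knowledge base as student during the compress phase."""
    p = softmax(active_logits, temperature)  # teacher distribution
    q = softmax(kb_logits, temperature)      # student distribution
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

Minimizing this loss with respect to the knowledge base's parameters transfers the newly learned behavior without requiring access to earlier tasks' data.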