29 Mar 2024 | Anurag Roy, Riddhiman Moulick, Vinay K. Verma, Saptarshi Ghosh, Abir Das
The paper introduces ConvPrompt, a novel approach for continual learning (CL) that combines convolutional prompt creation with language models to improve both performance and efficiency. ConvPrompt addresses the limitations of existing prompt-tuning methods by maintaining layer-wise shared embeddings, enabling both layer-specific learning and better concept transfer across tasks. The approach applies convolution over task-shared parameters to generate task-specific prompts, and leverages inter-task similarity to dynamically control the number of prompts created per task. This design reduces parameter overhead while maintaining or improving accuracy. Extensive experiments on standard benchmarks show that ConvPrompt outperforms state-of-the-art approaches by about 3% while using significantly fewer parameters. The paper also includes ablation studies that disentangle the contribution of each component and demonstrate the approach's effectiveness in handling similar tasks and preventing overfitting.
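The core idea of generating task-specific prompts by convolving over task-shared embeddings can be illustrated with a minimal sketch. This is not the authors' implementation; all names, shapes, and the choice of 1-D kernels here are illustrative assumptions, intended only to show why the scheme is parameter-efficient: each new prompt costs only a small kernel (k parameters) rather than a full prompt matrix (prompt_len × embed_dim parameters).

```python
import numpy as np

def generate_task_prompts(shared_emb, task_kernels):
    """Sketch of convolutional prompt generation (illustrative, not the paper's code).

    shared_emb   : (prompt_len, embed_dim) task-shared embedding for one layer.
    task_kernels : list of 1-D kernels, one per prompt; the number of kernels
                   per task could be chosen based on similarity to earlier tasks.
    Returns      : (num_prompts, prompt_len, embed_dim) task-specific prompts.
    """
    prompts = []
    for kernel in task_kernels:
        # Convolve each embedding dimension with the task-specific kernel,
        # keeping the prompt length fixed via 'same' padding.
        conv = np.stack(
            [np.convolve(shared_emb[:, d], kernel, mode="same")
             for d in range(shared_emb.shape[1])],
            axis=1,
        )
        prompts.append(conv)
    return np.stack(prompts)

# Usage: 3 task-specific prompts derived from one shared embedding.
shared = np.random.randn(8, 16)                      # prompt_len=8, embed_dim=16
kernels = [np.random.randn(3) for _ in range(3)]     # three small 3-tap kernels
task_prompts = generate_task_prompts(shared, kernels)
print(task_prompts.shape)                            # (3, 8, 16)
```

Here only 3 × 3 = 9 new parameters are learned per task for three prompts, versus 3 × 8 × 16 = 384 if each prompt were learned directly, which mirrors the parameter savings the paper reports.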