29 Mar 2024 | Ahmed Agiza*, Marina Neseem*, Sherief Reda
MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning
**Authors:** Ahmed Agiza, Marina Neseem, Sherief Reda
**Affiliation:** Brown University, Providence, RI
**Emails:** ahmed.agiza@brown.edu, marina.neseem@brown.edu, sherief.reda@brown.edu
**Abstract:**
Adapting pre-trained models to various downstream tasks is a common strategy in deep learning. Parameter-efficient fine-tuning methods have emerged as a promising approach to adapt pre-trained models to different tasks while training only a minimal number of parameters. While most of these methods are designed for single-task adaptation, parameter-efficient training in Multi-Task Learning (MTL) architectures is still underexplored. This paper introduces MTLoRA, a novel framework for parameter-efficient training of MTL models. MTLoRA employs Task-Agnostic and Task-Specific Low-Rank Adaptation modules, which effectively disentangle the parameter space in MTL fine-tuning, enabling the model to handle both task specialization and interaction within MTL contexts. Experiments on the PASCAL dataset show that MTLoRA achieves higher accuracy on downstream tasks compared to fully fine-tuning the MTL model while reducing the number of trainable parameters by 3.6×. Additionally, MTLoRA establishes a Pareto-optimal trade-off between the number of trainable parameters and the accuracy of the downstream tasks, outperforming current state-of-the-art parameter-efficient training methods in both accuracy and efficiency.
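The parameter savings behind low-rank adaptation can be illustrated on a single linear layer: full fine-tuning updates the entire weight matrix, while a rank-r adaptation trains only two small factors. This is a toy per-layer illustration (the function names are ours, and the paper's reported 3.6× reduction is measured over the whole MTL model, not one layer):

```python
# Full fine-tuning updates the whole d_out x d_in weight; a rank-r
# adaptation trains only the factors A (r x d_in) and B (d_out x r).
def full_ft_params(d_in: int, d_out: int) -> int:
    return d_in * d_out

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    return rank * (d_in + d_out)

d_in = d_out = 1024
print(full_ft_params(d_in, d_out))       # 1048576 trainable parameters
print(lora_params(d_in, d_out, rank=8))  # 16384 -> a 64x reduction for this layer
```

For typical transformer dimensions and small ranks, the low-rank factors are orders of magnitude smaller than the frozen weight they adapt.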
**Introduction:**
General-purpose vision and language models trained on large-scale datasets show remarkable adaptability to a wide range of downstream tasks. However, individually fine-tuning all parameters for each task poses significant efficiency challenges. Parameter-efficient training methods aim to optimize training efficiency by limiting the number of trainable parameters while preserving or enhancing task-specific fine-tuning. Most existing methods are tailored for single-task adaptation and may lose effectiveness in MTL scenarios due to the inherent complexity of MTL. This paper focuses on parameter-efficient training specifically for MTL architectures, where a single shared backbone is trained to extract feature representations for multiple tasks. The challenge in MTL is to balance conflicting updates from different tasks, and MTLoRA addresses this by using a combination of Task-Agnostic and Task-Specific low-rank decomposition modules.
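The disentangling idea can be sketched as a linear layer whose frozen pretrained weight is augmented by one shared (task-agnostic) low-rank update plus one low-rank update per task. The class below is a simplified NumPy illustration under standard LoRA conventions (random-A, zero-B initialization), not the paper's exact module; the names `MTLoRALinear`, `A_shared`, `B_task`, etc. are our own:

```python
import numpy as np

class MTLoRALinear:
    """Illustrative MTLoRA-style layer: a frozen pretrained weight W plus a
    shared task-agnostic low-rank update and one task-specific low-rank
    update per downstream task."""

    def __init__(self, d_in, d_out, rank, tasks, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)  # frozen
        # Standard LoRA init: A random, B zero, so each update starts at zero.
        self.A_shared = 0.01 * rng.standard_normal((rank, d_in))
        self.B_shared = np.zeros((d_out, rank))
        self.A_task = {t: 0.01 * rng.standard_normal((rank, d_in)) for t in tasks}
        self.B_task = {t: np.zeros((d_out, rank)) for t in tasks}

    def forward(self, x, task=None):
        # Task-agnostic path: frozen weight plus the shared low-rank update.
        y = x @ (self.W + self.B_shared @ self.A_shared).T
        if task is not None:
            # Task-specific low-rank update specializes the shared features.
            y = y + x @ (self.B_task[task] @ self.A_task[task]).T
        return y

layer = MTLoRALinear(d_in=8, d_out=4, rank=2, tasks=["segmentation", "depth"])
x = np.ones((1, 8))
# Before any training, the zero-initialized B factors make every path
# reduce to the frozen pretrained projection x @ W.T.
```

Only the low-rank factors are trained: gradients from all tasks flow into the shared factors, while each task's conflicting updates are absorbed by its own task-specific factors.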
**Contributions:**
- Introduces MTLoRA, the first approach to address parameter-efficient training of multi-task learning models.
- Designs novel Task-Agnostic and Task-Specific low-rank adaptation modules to adapt a shared vision-transformer backbone to multiple downstream dense prediction tasks.
- Observes that adding low-rank adaptation to the patch-merging layers in vision transformers significantly improves the accuracy-efficiency trade-off when fine-tuning MTL models.
- Applies MTLoRA and MTLoRA+ to a shared vision-transformer-based MTL architecture and evaluates it on multiple downstream dense prediction tasks from the PASCAL dataset.