Multi-Task Learning as Multi-Objective Optimization

11 Jan 2019 | Ozan Sener, Vladlen Koltun
This paper presents a novel approach to multi-task learning (MTL) by reformulating it as a multi-objective optimization problem. The authors argue that traditional MTL methods, which optimize a weighted sum of task losses, are limited because they implicitly assume the tasks do not conflict. Instead, they propose casting MTL as the search for Pareto optimal solutions: parameter settings at which no task's loss can be reduced without increasing the loss of another task. To find such solutions, they develop an efficient optimization algorithm based on the Frank-Wolfe method, which scales to high-dimensional parameter spaces and large numbers of tasks. The algorithm optimizes an upper bound on the multi-objective loss that can be evaluated with a single backward pass, without computing explicit task-specific gradients with respect to the shared parameters. The authors prove that, under realistic assumptions, this upper bound still yields Pareto optimal solutions, and they evaluate the method on a range of MTL problems, including digit classification, multi-label classification, and scene understanding. The results show that the proposed method outperforms existing MTL approaches and achieves state-of-the-art performance on multiple benchmark datasets, with minimal computational overhead.
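As a concrete illustration of the core optimization step: for two tasks, the min-norm subproblem that the Frank-Wolfe procedure solves, minimizing ||a*g1 + (1-a)*g2||^2 over a in [0, 1] for task gradients g1 and g2, has a closed-form solution. The sketch below (plain NumPy; the function and variable names are illustrative, not taken from the authors' code) computes that weighting and the resulting common descent direction for the shared parameters.

```python
import numpy as np

def min_norm_two_tasks(g1, g2):
    """Closed-form minimizer of ||a*g1 + (1-a)*g2||^2 over a in [0, 1].

    This is the two-task special case of the min-norm problem solved
    inside the Frank-Wolfe procedure (illustrative sketch, not the
    authors' implementation).
    """
    diff = g1 - g2
    denom = np.dot(diff, diff)
    if denom == 0.0:
        # The gradients coincide, so any convex weighting is optimal.
        return 0.5
    alpha = np.dot(g2 - g1, g2) / denom
    # Clip to the simplex constraint a in [0, 1].
    return float(np.clip(alpha, 0.0, 1.0))

# Two conflicting task gradients: the common descent direction
# down-weights the larger gradient instead of naively summing them.
g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 2.0])
alpha = min_norm_two_tasks(g1, g2)        # -> 0.8 for this example
d = alpha * g1 + (1 - alpha) * g2         # update direction for shared parameters
```

For more than two tasks, the same subproblem is solved iteratively with Frank-Wolfe line searches over pairs of task gradients, which is what lets the method scale to many tasks.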