24 Apr 2018 | Alex Kendall, Yarin Gal, Roberto Cipolla
This paper introduces a principled approach to multi-task learning using homoscedastic uncertainty to weigh losses for scene geometry and semantics. The authors propose a method that automatically learns optimal task weightings by considering the uncertainty of each task, which is crucial for balancing different regression and classification tasks. The model is trained on a single monocular RGB image to simultaneously learn pixel-wise depth regression, semantic segmentation, and instance segmentation. The method outperforms separate models trained on each task individually, demonstrating the effectiveness of multi-task learning in improving performance.
The key contributions of this work include: (1) a novel multi-task loss function that uses homoscedastic uncertainty to weigh losses, (2) a unified architecture for semantic segmentation, instance segmentation, and depth regression, and (3) a demonstration that loss weighting is crucial in multi-task learning and that the learned weighting yields superior performance compared to equivalent separately trained models.
The authors show that multi-task learning can improve accuracy by leveraging cues from one task to regularize and improve another. For example, depth information can help improve the generalization of segmentation tasks. The model is evaluated on the Cityscapes dataset, where it achieves significant improvements in performance across all three tasks. The results demonstrate that the proposed method can automatically learn optimal task weightings, leading to better performance than traditional approaches that rely on manual tuning of loss weights.
The paper also discusses the importance of task uncertainty in multi-task learning, showing that the relative confidence between tasks can be captured by the uncertainty inherent to the regression or classification task. The authors propose a multi-task likelihood function based on maximizing the Gaussian likelihood with homoscedastic uncertainty, which allows the task weights to be learned automatically. This approach is shown to be effective for both regression and classification tasks, and is robust to the initialization of the task uncertainty values.
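The resulting loss combines each task's loss with a trainable log-variance term: roughly, task i contributes exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2) is learned jointly with the network weights. High uncertainty down-weights a task, but the additive s_i term penalizes the model for declaring everything uncertain. The sketch below, assuming this commonly used numerically stable form (the paper's exact formulation differs by constant factors for regression vs. classification terms), shows the mechanics with NumPy; the function and variable names are illustrative, not from the paper:

```python
import numpy as np

def multi_task_loss(task_losses, log_vars):
    """Combine per-task losses using homoscedastic task uncertainty.

    Each task i contributes exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2)
    would be a trainable scalar in a real model. A large s_i down-weights
    that task's loss but is penalised by the additive log-variance term,
    so the optimum balances the tasks automatically.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += np.exp(-s) * loss + s
    return total

# Toy example: three tasks (e.g. depth, semantic, instance segmentation)
# with raw losses and hypothetical learned log-variances.
raw_losses = [2.0, 0.5, 1.0]
log_vars = [0.0, -1.0, 0.5]   # s_i = log(sigma_i^2)
print(multi_task_loss(raw_losses, log_vars))  # ≈ 3.4657
```

In practice each s_i would be registered as a trainable parameter and optimized by gradient descent alongside the network, rather than fixed as in this toy example.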
The paper concludes that multi-task learning is a powerful approach for scene understanding, and that the proposed method provides a principled way to learn optimal task weightings using homoscedastic uncertainty.