This article provides an overview of multi-task learning (MTL) in deep neural networks. MTL is a technique that allows a model to learn multiple tasks simultaneously, improving generalization by sharing representations between related tasks. The article discusses the two most common methods for MTL in deep learning: hard parameter sharing and soft parameter sharing. Hard parameter sharing shares the hidden layers across all tasks while keeping task-specific output layers; soft parameter sharing gives each task its own model and uses regularization to encourage the corresponding parameters to remain similar (minimal sketches of both appear below). The article also explores the mechanisms that make MTL effective, such as implicit data augmentation, attention focusing, eavesdropping, representation bias, and regularization. It reviews existing literature on MTL in non-neural models, including block-sparse regularization and learning task relationships. Recent advances in deep learning for MTL include deep relationship networks, fully-adaptive feature sharing, cross-stitch networks, and tensor factorisation. The article also discusses auxiliary tasks, which are used to improve performance on the main task by leveraging related tasks. It concludes that while MTL has been widely used, there is still much to learn about the best ways to apply it in practice.
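To make the distinction concrete, here is a minimal sketch of hard parameter sharing in PyTorch. The two-task setup, layer sizes, and class counts are illustrative assumptions, not details from the article; the point is that both tasks flow through the same hidden layers and differ only in their output heads.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HardSharingNet(nn.Module):
    """Hypothetical two-task network with a shared trunk (hard parameter sharing)."""

    def __init__(self, in_dim=64, hidden_dim=128, n_classes_a=10, n_classes_b=5):
        super().__init__()
        # Hidden layers shared by all tasks.
        self.shared = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Task-specific output layers.
        self.head_a = nn.Linear(hidden_dim, n_classes_a)
        self.head_b = nn.Linear(hidden_dim, n_classes_b)

    def forward(self, x):
        h = self.shared(x)  # one representation feeds both task heads
        return self.head_a(h), self.head_b(h)

model = HardSharingNet()
x = torch.randn(32, 64)
y_a = torch.randint(0, 10, (32,))
y_b = torch.randint(0, 5, (32,))

out_a, out_b = model(x)
# The joint loss is typically a (possibly weighted) sum of per-task losses;
# gradients from both tasks update the shared trunk.
loss = F.cross_entropy(out_a, y_a) + F.cross_entropy(out_b, y_b)
loss.backward()
```

Because every task's gradient flows into the shared trunk, the trunk is pushed toward representations that work for all tasks, which is the source of hard sharing's regularization effect.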
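A corresponding sketch of soft parameter sharing, under the assumption that each task keeps its own network and an L2 penalty on the distance between corresponding parameters encourages them to stay close. The architecture and the penalty weight `lam` are illustrative choices, not prescribed by the article.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net():
    # Each task gets its own copy of the same architecture.
    return nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

net_a, net_b = make_net(), make_net()

def soft_sharing_penalty(model_a, model_b):
    """Sum of squared L2 distances between corresponding parameter tensors."""
    return sum((p_a - p_b).pow(2).sum()
               for p_a, p_b in zip(model_a.parameters(), model_b.parameters()))

x = torch.randn(32, 64)
y_a = torch.randint(0, 10, (32,))
y_b = torch.randint(0, 10, (32,))

lam = 1e-3  # regularization strength (assumed value)
# Each task trains its own parameters, but the penalty nudges them together.
loss = (F.cross_entropy(net_a(x), y_a)
        + F.cross_entropy(net_b(x), y_b)
        + lam * soft_sharing_penalty(net_a, net_b))
loss.backward()
```

Here `lam` controls the trade-off: a large value approaches hard sharing, while `lam = 0` recovers fully independent single-task training.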