CutMix is a data augmentation strategy that improves the performance of convolutional neural networks (CNNs) by cutting patches from one training image, pasting them onto another, and mixing the two labels in proportion to the patch area. Unlike traditional regional dropout methods, which simply remove informative pixels, CutMix fills the removed region with a patch from another image. This preserves the regularization effect of regional dropout while making efficient use of every training pixel, and it leads to better generalization and object localization. The method is simple to implement and incurs minimal computational overhead.

Experiments on various datasets and architectures show that CutMix consistently outperforms state-of-the-art augmentation strategies on image classification and on weakly supervised object localization on ImageNet. A CutMix-trained classifier, when used as a pretrained model, also improves results on the Pascal detection and MS-COCO image captioning benchmarks, so the gains transfer to other tasks. In addition, CutMix improves model robustness and out-of-distribution detection performance and reduces over-confidence in deep networks. The source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch.
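
The patch replacement and proportional label mixing described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the reference implementation from the repository: the function names `rand_bbox` and `cutmix`, the `(N, H, W, C)` image layout, one-hot labels, and drawing the mixing ratio from a Beta(alpha, alpha) distribution are all choices made for this example.

```python
import numpy as np

def rand_bbox(height, width, lam, rng):
    """Sample a box covering roughly (1 - lam) of the image area (hypothetical helper)."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # Box centre is uniform over the image; the box is clipped at the borders.
    cy, cx = rng.integers(height), rng.integers(width)
    y1, y2 = np.clip([cy - cut_h // 2, cy + cut_h // 2], 0, height)
    x1, x2 = np.clip([cx - cut_w // 2, cx + cut_w // 2], 0, width)
    return int(y1), int(y2), int(x1), int(x2)

def cutmix(images, labels, alpha=1.0, rng=None):
    """Mix a batch with a shuffled copy of itself.

    images: float array of shape (N, H, W, C) (assumed layout)
    labels: one-hot array of shape (N, K) (assumed encoding)
    """
    rng = np.random.default_rng() if rng is None else rng
    n, height, width = images.shape[:3]
    lam = rng.beta(alpha, alpha)   # assumed sampling scheme for the mixing ratio
    perm = rng.permutation(n)      # partner image for each sample
    y1, y2, x1, x2 = rand_bbox(height, width, lam, rng)

    mixed = images.copy()
    # Replace the boxed region with the same region from the partner image.
    mixed[:, y1:y2, x1:x2, :] = images[perm, y1:y2, x1:x2, :]

    # Recompute lam from the clipped box so labels match the true pixel ratio.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (height * width)
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed, mixed_labels, lam
```

Because the label weights are recomputed from the clipped box, each mixed label reflects the actual fraction of pixels contributed by each source image. The mixed labels can then be used with a standard cross-entropy loss on soft targets.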