This paper introduces the Image Processing Transformer (IPT), a pre-trained model designed to address low-level computer vision tasks such as denoising, super-resolution, and deraining. The model is pre-trained on a large-scale dataset derived from the ImageNet benchmark, from which corrupted–clean image pairs are generated for the various tasks. The architecture consists of multiple heads and tails, each tailored to a different task, together with a shared transformer body: the multi-head and multi-tail design handles the different tasks, while contrastive learning is introduced to help the model adapt to diverse image processing tasks. Experimental results show that the pre-trained IPT model outperforms state-of-the-art methods on multiple benchmarks, demonstrating its effectiveness and generalization capability. The code for the IPT model is available on GitHub and Gitee.
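
To make the multi-head / shared-body / multi-tail layout concrete, the following is a minimal sketch in PyTorch, not the authors' implementation. The task names, layer sizes, patch handling, and the `SimplifiedIPT` class itself are illustrative assumptions rather than values taken from the paper.

```python
# Minimal sketch (assumption, not the authors' code) of the multi-head,
# shared-transformer-body, multi-tail structure described above.
import torch
import torch.nn as nn


class SimplifiedIPT(nn.Module):
    def __init__(self, tasks=("denoise", "sr_x2", "derain"), dim=64):
        super().__init__()
        # One lightweight convolutional head per task maps the corrupted
        # input image into a shared feature space.
        self.heads = nn.ModuleDict({
            t: nn.Conv2d(3, dim, kernel_size=3, padding=1) for t in tasks
        })
        # Shared transformer body operating on flattened spatial tokens.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=8, batch_first=True
        )
        self.body = nn.TransformerEncoder(encoder_layer, num_layers=4)
        # One tail per task reconstructs the restored image.
        self.tails = nn.ModuleDict({
            t: nn.Conv2d(dim, 3, kernel_size=3, padding=1) for t in tasks
        })

    def forward(self, x, task):
        feat = self.heads[task](x)                    # (B, C, H, W)
        b, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)      # (B, H*W, C)
        tokens = self.body(tokens)                    # shared body
        feat = tokens.transpose(1, 2).reshape(b, c, h, w)
        return self.tails[task](feat)


if __name__ == "__main__":
    # Route a noisy image through the denoising head/tail pair.
    model = SimplifiedIPT()
    noisy = torch.randn(1, 3, 48, 48)
    restored = model(noisy, task="denoise")
    print(restored.shape)  # torch.Size([1, 3, 48, 48])
```

The key design point this sketch illustrates is that only the head and tail are task-specific; the transformer body is shared across tasks, which is what allows a single pre-trained backbone to be reused for denoising, super-resolution, and deraining.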