Diffusion-RWKV is a diffusion model that adapts the RWKV architecture to image generation, handling long-range dependencies with linear computational complexity. The model processes image patches with Bi-RWKV blocks, incorporating skip connections and conditioning information to improve performance. It achieves results comparable or superior to existing CNN- and Transformer-based diffusion models on FID and IS metrics while significantly reducing computational cost.
The model is trained on datasets such as CIFAR-10, CelebA, and ImageNet, and experiments show that it performs well in both unconditional and class-conditional image generation. Diffusion-RWKV also scales: performance improves as model size and resolution increase, and it outperforms other models in FID and computational efficiency, particularly in high-resolution image synthesis. Its architecture supports efficient training and inference, making it a promising alternative to traditional Transformer-based models for image generation. The paper also discusses related work, including other diffusion models, efficient sequence-modeling techniques, and the use of RWKV-like architectures in vision tasks. Overall, Diffusion-RWKV offers a scalable and efficient solution for image generation, with potential applications in various domains.
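The linear-complexity claim comes from RWKV's recurrent token mixing: each position is updated from a running decayed sum rather than attending to all other positions. A minimal NumPy sketch of a bidirectional WKV-style aggregation over patch tokens, assuming a simplified scalar decay (this is an illustration of the idea, not the paper's exact Bi-RWKV formulation; `wkv_scan` and `bi_wkv` are hypothetical names):

```python
import numpy as np

def wkv_scan(k, v, decay=0.9):
    """Causal linear-time aggregation: a running, exponentially
    decayed weighted sum of values, weighted by exp(k).
    k, v: arrays of shape (T, d) for T patch tokens of dim d."""
    T, d = v.shape
    num = np.zeros(d)          # running weighted sum of values
    den = np.zeros(d)          # running sum of weights
    out = np.zeros_like(v)
    for t in range(T):
        w = np.exp(k[t])
        num = decay * num + w * v[t]
        den = decay * den + w
        out[t] = num / (den + 1e-8)
    return out

def bi_wkv(k, v, decay=0.9):
    """Bidirectional variant: run the scan forward and backward over
    the token sequence and average, so every patch sees both
    preceding and following context in O(T) total work."""
    fwd = wkv_scan(k, v, decay)
    bwd = wkv_scan(k[::-1], v[::-1], decay)[::-1]
    return 0.5 * (fwd + bwd)
```

Each scan touches every token once, so cost grows linearly in sequence length, in contrast to the quadratic pairwise interactions of self-attention; this is what makes long flattened patch sequences from high-resolution images tractable.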