Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models

6 Apr 2024 | Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Junshi Huang*
The paper introduces Diffusion-RWKV, a series of architectures adapted from the RWKV model for image generation tasks. These models are designed to efficiently handle patchified inputs in a sequence with extra conditions while scaling up effectively, accommodating large-scale parameters and extensive datasets. The key advantage of Diffusion-RWKV is its reduced spatial aggregation complexity, making it adept at processing high-resolution images without the need for windowing or group cached operations. Experimental results on both conditional and unconditional image generation tasks demonstrate that Diffusion-RWKV achieves performance comparable to or surpassing existing CNN- or Transformer-based diffusion models in FID and IS metrics while significantly reducing total computation FLOP usage.

The paper also explores various configuration choices, including conditioning, block design, and model parameter scaling, and provides empirical baselines to enhance the model's capability while ensuring scalability and stability.
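The "reduced spatial aggregation complexity" claim rests on RWKV's WKV recurrence, which mixes tokens with a running decayed sum rather than all-pairs attention, so cost grows linearly in the number of patch tokens. Below is a minimal sketch of a simplified, unidirectional WKV scan over a patch-token sequence; the function name and this exact formulation are illustrative assumptions, not the paper's bidirectional Diffusion-RWKV block.

```python
import numpy as np

def wkv_scan(k, v, w, u):
    """Simplified WKV recurrence (RWKV-style linear token mixing).

    k, v: (T, C) per-token keys and values (e.g. from image patches).
    w:    (C,) per-channel decay rate (> 0).
    u:    (C,) bonus applied to the current token's key.
    Runs in O(T * C), versus O(T^2 * C) for softmax attention,
    which is why long high-resolution patch sequences stay cheap.
    """
    T, C = k.shape
    num = np.zeros(C)        # running decayed sum of exp(k_i) * v_i
    den = np.zeros(C)        # running decayed sum of exp(k_i)
    decay = np.exp(-w)       # multiplicative decay applied each step
    out = np.empty((T, C))
    for t in range(T):
        cur = np.exp(u + k[t])            # current token with bonus u
        out[t] = (num + cur * v[t]) / (den + cur + 1e-8)
        e_k = np.exp(k[t])
        num = decay * num + e_k * v[t]    # fold token t into the state
        den = decay * den + e_k
    return out
```

Note this simplified form can overflow for large keys; real RWKV implementations track a running maximum exponent for numerical stability, and Diffusion-RWKV additionally scans in both directions so every patch sees the whole image.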