Accelerating Image Generation with Sub-Path Linear Approximation Model


21 Jul 2024 | Chen Xu, Tianhui Song, Weixin Feng, Xubin Li, Tiezheng Ge, Bo Zheng, and Limin Wang
This paper proposes the Sub-Path Linear Approximation Model (SPLAM), a novel approach to accelerating diffusion models while maintaining high-quality image generation. SPLAM builds on the approximation strategy of consistency models, treating the Probability Flow (PF)-ODE trajectory as a series of sub-paths divided by sampled points and using Sub-Path Linear (SL) ODEs to form a progressive, continuous error estimation along each sub-path. Optimizing the SL-ODEs lets SPLAM construct denoising mappings with smaller cumulative approximation errors. An efficient distillation method is also developed so that pre-trained diffusion models, such as latent diffusion models, can be incorporated.

Extensive experiments demonstrate remarkable training efficiency: only 6 A100 GPU days are needed to produce a high-quality generative model capable of 2- to 4-step generation. Evaluations on the LAION, MS COCO 2014, and MS COCO 2017 datasets show that SPLAM surpasses existing acceleration methods in few-step generation tasks, achieving state-of-the-art FID and image quality.

The key contributions are: (1) identifying the optimization process of consistency models as minimizing the cumulative approximation error along PF-ODE sub-path endpoints; (2) proposing SPLAM to continuously approximate PF-ODE trajectories and progressively optimize the sub-path learning objectives; and (3) developing an efficient distillation method for SPLAM that enables integration with pre-trained latent diffusion models.
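To make the few-step generation idea concrete, here is a minimal toy sketch (not the paper's actual implementation) of a consistency-style sampler: at each sub-path boundary the model jumps directly to an estimate of the clean sample, and noise is re-injected for the next sub-path. The `toy_denoiser` and the noise schedule are purely illustrative stand-ins, not SPLAM's trained network or SL-ODE solver.

```python
import numpy as np

def few_step_sample(denoiser, sigmas, shape, rng):
    """Consistency-style few-step sampling.

    At each noise level (sub-path boundary), the denoiser maps the
    current noisy sample directly to an estimate of the clean sample;
    noise is then re-injected at the next, lower level.
    """
    # Start from pure noise at the largest noise level.
    x = rng.standard_normal(shape) * sigmas[0]
    for i, sigma in enumerate(sigmas):
        x0_hat = denoiser(x, sigma)  # jump to the sub-path endpoint
        if i + 1 < len(sigmas):
            # Re-noise to the next sub-path's starting level.
            x = x0_hat + sigmas[i + 1] * rng.standard_normal(shape)
        else:
            x = x0_hat
    return x

def toy_denoiser(x, sigma):
    # Stand-in "model": assumes the clean data distribution is centered
    # at zero, so the posterior-mean estimate shrinks x toward zero.
    return x / (1.0 + sigma ** 2)

rng = np.random.default_rng(0)
# Four noise levels -> a 4-step sampler, mirroring 2-4 step generation.
sample = few_step_sample(toy_denoiser, [10.0, 2.0, 0.5, 0.0], (4,), rng)
```

The loop structure, not the toy denoiser, is the point: replacing `toy_denoiser` with a distilled network trained to minimize the cumulative sub-path error is what the paper's distillation procedure targets.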
The method is efficient, with reduced training time and inference latency, and can generate high-quality images with fewer sampling steps.