**LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights**
Latent Diffusion Models (LDMs) are powerful generative models known for their efficiency under limited computational resources. However, deploying LDMs on resource-limited devices remains challenging due to high memory consumption and slow inference. To address this, the authors introduce LD-Pruner, a novel structured pruning method that compresses LDMs while preserving performance. Traditional pruning methods for deep neural networks are not tailored to LDMs, whose distinctive characteristics, such as high training costs and the absence of a straightforward, task-agnostic way to evaluate output quality, make standard approaches a poor fit.
LD-Pruner works in the latent space during pruning, where it quantifies the impact of removing each component on model output independently of the downstream task. Because only components with minimal impact on the output are removed, training converges faster, reducing computational costs. The method is demonstrated to improve inference speed and reduce parameter count with minimal performance degradation across three tasks: text-to-image (T2I) generation, unconditional image generation (UIG), and unconditional audio generation (UAG).
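The summary does not spell out the mechanics, but the core idea of scoring each prunable operator by how much its removal perturbs the latent output can be sketched roughly as follows. This is a minimal illustration, assuming a PyTorch-style LDM and hypothetical `generate_latents`, `ablate`, and `restore` helpers; none of these names come from the paper.

```python
import torch

@torch.no_grad()
def operator_importance(model, prompts, operator_names,
                        generate_latents, ablate, restore):
    """Score each candidate operator by how far the model's output latents
    drift when that operator is removed (smaller drift = safer to prune).

    generate_latents / ablate / restore are hypothetical helpers that run
    the LDM up to its latent output, temporarily disable one operator, and
    put it back; they stand in for whatever the real pipeline provides.
    """
    reference = generate_latents(model, prompts)   # latents of the intact model
    scores = {}
    for name in operator_names:
        ablate(model, name)                        # temporarily remove the operator
        perturbed = generate_latents(model, prompts)
        # Task-agnostic impact: measured purely in latent space, so no
        # decoder, labels, or task-specific quality metric is needed.
        scores[name] = (perturbed - reference).flatten(1).norm(dim=1).mean().item()
        restore(model, name)
    return scores
```

Operators with the lowest scores barely change the latent output and would be the natural candidates for structured pruning under this scheme.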
Key contributions of the paper include:
- A novel, comprehensive metric for comparing LDM latent representations (a rough sketch of such a comparison follows this list).
- A task-agnostic algorithm for compressing LDMs through architectural pruning.
- Successful application of the method in three diverse tasks, showcasing its versatility and effectiveness.
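The paper's exact metric is not reproduced in this summary. One plausible way to compare two sets of LDM latents, combining a shift in their means with a change in their spread, might look like the sketch below; the choice of statistics and the weighting are assumptions, not the authors' formula.

```python
import torch

def latent_divergence(latents_a, latents_b, spread_weight=1.0):
    """Compare two batches of latents via their first- and second-order statistics.

    NOTE: illustrative only; the statistics used and their weighting are
    assumptions, not the metric defined in the paper.
    """
    mean_shift = (latents_a.mean(dim=0) - latents_b.mean(dim=0)).norm()
    spread_shift = (latents_a.std(dim=0) - latents_b.std(dim=0)).norm()
    return (mean_shift + spread_weight * spread_shift).item()
```

A score of this form could stand in for the plain L2 distance in the earlier loop, capturing both where the latents move and how their variability changes.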
The paper also provides a detailed explanation of the proposed method, experimental setup, and results, highlighting the benefits of LD-Pruner in terms of speed, performance, and applicability in resource-constrained environments.