19 Jul 2024 | Paul Friedrich, Julia Wolleb, Florentin Bieder, Alicia Durrer, and Philippe C. Cattin
This paper introduces WDM, a 3D wavelet diffusion model for high-resolution medical image synthesis. Instead of operating on full-resolution volumes, the method applies a diffusion model to wavelet-decomposed images. Working in the wavelet domain reduces the spatial dimension of the input, which allows for shallower network architectures, less computation, and a significantly reduced memory footprint; the framework can be trained on a single 40 GB GPU. Because the wavelet transform requires no additional training, it serves as a simple, dataset-agnostic tool for spatial dimensionality reduction, and it lets 3D diffusion models scale to high resolutions while keeping the same standard network architecture as 3D DDPM.
Experimental results on the BraTS and LIDC-IDRI datasets show that WDM achieves state-of-the-art image fidelity (FID) and sample diversity (MS-SSIM) scores compared to recent GANs, diffusion models, and latent diffusion models. WDM is also the only method among those compared that is capable of generating high-quality images at a resolution of 256×256×256. The project is available at https://pfriedri.github.io/wdm-3d-io.
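To make the wavelet-domain idea above concrete, the following is a minimal NumPy sketch of how a 3D volume can be traded for more channels at half the spatial resolution. It assumes a single-level orthonormal Haar wavelet, which is an illustrative choice not stated in this summary, and the function names (`haar_split`, `haar_merge`, `dwt3d`, `idwt3d`) are hypothetical rather than taken from the WDM code base. Under these assumptions, a 256×256×256 volume would become 8 sub-bands of size 128×128×128, and the inverse transform recovers the original volume without any learned parameters.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_split(x, axis):
    """Split x along `axis` into Haar low-pass and high-pass halves."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / SQRT2, (even - odd) / SQRT2


def haar_merge(low, high, axis):
    """Invert haar_split: interleave the reconstructed even/odd samples."""
    shape = list(low.shape)
    shape[axis] *= 2
    out = np.empty(shape, dtype=low.dtype)
    even_idx = [slice(None)] * out.ndim
    odd_idx = [slice(None)] * out.ndim
    even_idx[axis] = slice(0, None, 2)
    odd_idx[axis] = slice(1, None, 2)
    out[tuple(even_idx)] = (low + high) / SQRT2
    out[tuple(odd_idx)] = (low - high) / SQRT2
    return out


def dwt3d(volume):
    """Single-level 3D Haar transform: (D, H, W) -> (8, D/2, H/2, W/2)."""
    bands = [volume]
    for axis in range(3):
        # Each pass doubles the number of sub-bands and halves one axis.
        bands = [part for band in bands for part in haar_split(band, axis)]
    return np.stack(bands, axis=0)


def idwt3d(coeffs):
    """Inverse of dwt3d: (8, D/2, H/2, W/2) -> (D, H, W)."""
    bands = list(coeffs)
    for axis in (2, 1, 0):
        # Undo the splits in reverse order, merging adjacent low/high pairs.
        bands = [
            haar_merge(bands[i], bands[i + 1], axis)
            for i in range(0, len(bands), 2)
        ]
    return bands[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.standard_normal((32, 32, 32))  # small stand-in for a scan
    coeffs = dwt3d(volume)
    print(coeffs.shape)  # eight sub-bands at half resolution per axis
    print(np.allclose(idwt3d(coeffs), volume))  # transform is invertible
```

In a pipeline of this kind, a diffusion model would be trained on the 8-channel coefficient tensor and the inverse transform applied to its samples; the sketch covers only the transform, not the diffusion model itself.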