This paper proposes an efficient diffusion model for multi-contrast MRI super-resolution (SR), named DiffMSR. The method addresses two limitations of existing diffusion-based SR methods: they typically require a large number of iterations and can produce distorted results. DiffMSR runs the diffusion process in a highly compact latent space to generate prior knowledge rich in high-frequency detail, which reduces both computational complexity and the number of iteration steps. A Prior-Guide Large Window Transformer (PLWformer) serves as the decoder: its large-window self-attention expands the receptive field at modest computational cost, and it uses the diffusion-generated prior to keep the reconstruction accurate and undistorted.
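The PLWformer itself is not spelled out in this summary, so the minimal PyTorch sketch below only illustrates the underlying idea of window-based self-attention with a large window: tokens attend within non-overlapping windows, so attention cost grows with the window area rather than with the whole image. The class name, feature width, head count, and 16x16 window are illustrative assumptions, and the specific tricks DiffMSR uses to keep the large-window cost low are omitted.

```python
import torch
import torch.nn as nn


class LargeWindowAttention(nn.Module):
    """Illustrative large-window self-attention (not the paper's PLWformer code):
    the feature map is split into non-overlapping windows and multi-head
    self-attention is computed inside each window."""

    def __init__(self, dim=64, window=16, heads=4):
        super().__init__()
        self.window, self.heads = window, heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                          # x: (B, H, W, C), H and W divisible by window
        B, H, W, C = x.shape
        w = self.window
        # Partition into (H//w)*(W//w) non-overlapping w x w windows.
        x = x.view(B, H // w, w, W // w, w, C).permute(0, 1, 3, 2, 4, 5)
        x = x.reshape(-1, w * w, C)                # (B*num_windows, w*w, C)
        qkv = self.qkv(x).reshape(x.shape[0], w * w, 3, self.heads, C // self.heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)       # each: (B*num_windows, heads, w*w, C/heads)
        attn = (q @ k.transpose(-2, -1)) * (C // self.heads) ** -0.5
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(x.shape[0], w * w, C)
        out = self.proj(out)
        # Undo the window partition back to (B, H, W, C).
        out = out.view(B, H // w, W // w, w, w, C).permute(0, 1, 3, 2, 4, 5)
        return out.reshape(B, H, W, C)


# Toy usage: a 32x32 feature map with 64 channels and a 16x16 attention window.
y = LargeWindowAttention()(torch.randn(1, 32, 32, 64))
```

With a fixed window, the attention matrix stays (w^2) x (w^2) per window regardless of image size, which is why enlarging the window trades a wider receptive field against quadratic growth in per-window cost.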
The method is trained in two stages. In the first, a Prior Extraction (PE) module compresses HR images into compact latent features, which serve as the prior guiding the PLWformer. In the second, those PE features supervise the diffusion model, which learns to generate the same prior knowledge so that, at inference, the PLWformer can rely on the diffusion-generated prior to reconstruct high-frequency details accurately. A simplified sketch of this schedule follows.
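The sketch below is a heavily simplified, runnable picture of the two-stage schedule using toy stand-in modules on flat feature vectors: `pe`, `prior_gen`, and `decoder` are hypothetical placeholders for the PE module, the diffusion model, and the PLWformer, and the losses are plain L1 terms. It is meant only to show what is optimized in each stage, not the paper's actual training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins so the two-stage schedule runs end to end; the real PE module,
# diffusion model, and PLWformer are full image networks, not these layers.
pe        = nn.Linear(256, 16)        # HR features -> compact latent prior
prior_gen = nn.Linear(64, 16)         # LR features -> generated prior (diffusion model in the paper)
decoder   = nn.Bilinear(64, 16, 256)  # (LR features, prior) -> SR output (PLWformer in the paper)

opt1 = torch.optim.Adam(list(pe.parameters()) + list(decoder.parameters()), lr=1e-4)
opt2 = torch.optim.Adam(list(prior_gen.parameters()) + list(decoder.parameters()), lr=1e-4)


def stage1_step(lr_feat, hr_feat):
    """Stage 1: the PE module compresses the HR target into a compact prior
    that conditions the decoder; both are optimized jointly."""
    prior = pe(hr_feat)
    loss = F.l1_loss(decoder(lr_feat, prior), hr_feat)
    opt1.zero_grad(); loss.backward(); opt1.step()
    return loss.item()


def stage2_step(lr_feat, hr_feat):
    """Stage 2: the prior generator learns to reproduce the (frozen) PE prior
    from the degraded input alone, and the decoder is refined on that
    generated prior instead of the HR-derived one."""
    with torch.no_grad():
        target_prior = pe(hr_feat)
    gen_prior = prior_gen(lr_feat)
    loss = F.l1_loss(gen_prior, target_prior) + F.l1_loss(decoder(lr_feat, gen_prior), hr_feat)
    opt2.zero_grad(); loss.backward(); opt2.step()
    return loss.item()


# Toy usage with random vectors standing in for LR/HR MRI features.
lr_feat, hr_feat = torch.randn(4, 64), torch.randn(4, 256)
stage1_step(lr_feat, hr_feat)
stage2_step(lr_feat, hr_feat)
```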
Extensive experiments on public and clinical datasets demonstrate that DiffMSR outperforms state-of-the-art methods in PSNR and SSIM. It combines the strengths of diffusion models and Transformers, preserving the original image structure while recovering complex anatomical detail. Among the compared methods, the model has a mid-range parameter count, the lowest FLOPs, and the fastest inference, thanks to the two strategies above that reduce computational overhead, and it needs only four diffusion iterations to reconstruct high-quality images.
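The four-iteration figure refers to the reverse diffusion run over the compact prior; the sketch below shows how such a short, deterministic DDIM-style reverse process can look. The `eps_net` stub, the linear noise schedule, and the scalar timestep conditioning are assumptions for illustration, not the paper's exact sampler or conditioning.

```python
import torch
import torch.nn as nn

# Toy noise predictor over a 16-dim prior vector; a stand-in for the paper's
# conditional denoising network (which is also conditioned on the LR input).
eps_net = nn.Sequential(nn.Linear(17, 64), nn.ReLU(), nn.Linear(64, 16))


@torch.no_grad()
def sample_prior(batch=1, dim=16, steps=4):
    """Deterministic DDIM-style reverse process with only a few iterations;
    because the prior lives in a very low-dimensional latent space, a handful
    of steps suffices. The schedule here is illustrative, not the paper's."""
    abar = torch.linspace(0.99, 0.01, steps)            # alpha_bar: nearly clean -> nearly pure noise
    z = torch.randn(batch, dim)                         # start from Gaussian noise
    for i in reversed(range(steps)):
        t = torch.full((batch, 1), float(i) / steps)    # normalized timestep as conditioning
        eps = eps_net(torch.cat([z, t], dim=1))         # predicted noise
        x0 = (z - (1 - abar[i]).sqrt() * eps) / abar[i].sqrt()
        abar_prev = abar[i - 1] if i > 0 else torch.tensor(1.0)
        z = abar_prev.sqrt() * x0 + (1 - abar_prev).sqrt() * eps   # DDIM update (eta = 0)
    return z                                            # generated compact prior


prior = sample_prior()   # four iterations, after which the prior conditions the PLWformer
```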