21 Feb 2025 | Yunpeng Luo, Junlong Du, Ke Yan*, Shouhong Ding
LaRE²: A Latent Reconstruction Error-Based Method for Diffusion-Generated Image Detection
This paper proposes LaRE², a novel method for detecting diffusion-generated images. At its core is the latent reconstruction error (LaRE), a feature that is cheaper to extract than those used by existing methods for two reasons: it requires only a single step of the reverse diffusion process, and it is computed in the latent space. LaRE is also positively correlated with the local information frequency of the original image, making it a valuable cue for generated image detection.
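The single-step, latent-space idea can be illustrated with a minimal sketch. The summary above does not give the exact formula, so the sketch assumes the common formulation in which a latent is noised to a timestep t and the error is the absolute difference between the injected noise and the denoiser's prediction. The names `latent_reconstruction_error`, `predict_noise`, and `toy_predictor` are hypothetical, and the toy predictor stands in for a pretrained latent-diffusion denoiser.

```python
import numpy as np

def latent_reconstruction_error(z0, predict_noise, alpha_bar_t, t, rng):
    """One-step reconstruction error in latent space (illustrative only).

    z0: latent of shape (C, H, W).
    predict_noise: hypothetical callable (z_t, t) -> predicted noise,
                   same shape as z_t.
    alpha_bar_t: cumulative noise-schedule coefficient at timestep t.
    """
    eps = rng.standard_normal(z0.shape)
    # Closed-form forward diffusion: noise the clean latent to timestep t.
    z_t = np.sqrt(alpha_bar_t) * z0 + np.sqrt(1.0 - alpha_bar_t) * eps
    # A single denoiser call; no multi-step sampling loop is needed.
    eps_hat = predict_noise(z_t, t)
    # Element-wise error map with the same shape as the latent.
    return np.abs(eps - eps_hat)

def toy_predictor(z_t, t):
    # Placeholder for a pretrained denoiser (assumption, not a real model).
    return np.zeros_like(z_t)

rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 32, 32))
lare = latent_reconstruction_error(z0, toy_predictor, alpha_bar_t=0.5, t=500, rng=rng)
print(lare.shape)  # (4, 32, 32)
```

Because the error map lives at latent resolution and needs one denoiser call, its cost is roughly one forward pass of the denoiser, which is the intuition behind the efficiency claim.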
To make the image features more discriminative, LaRE² incorporates an Error-Guided Feature Refinement module (EGRE). EGRE aligns LaRE with the image feature map and refines the image features from both spatial and channel perspectives, which improves the model's ability to distinguish between real and generated images.
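A rough sketch of error-guided refinement follows. The summary does not specify the mechanism inside EGRE, so this example assumes simple sigmoid gating: the error map is resized to the feature map's resolution, then used to produce one gate per spatial location and one gate per channel. The functions `resize_nearest` and `error_guided_refine` and the projection `w_channel` are hypothetical, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def resize_nearest(err, out_h, out_w):
    """Nearest-neighbour resize of a (C, H, W) map to (C, out_h, out_w)."""
    _, h, w = err.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return err[:, rows][:, :, cols]

def error_guided_refine(feat, err, w_channel):
    """feat: (C, H, W) image features; err: (Ce, He, We) error map;
    w_channel: (C, Ce) hypothetical learned projection."""
    _, h, w = feat.shape
    # Alignment: bring the error map to the feature map's spatial size.
    err_aligned = resize_nearest(err, h, w)                  # (Ce, H, W)
    # Spatial refinement: one gate per location from the channel-averaged error.
    spatial_gate = sigmoid(err_aligned.mean(axis=0, keepdims=True))  # (1, H, W)
    refined = feat * spatial_gate
    # Channel refinement: one gate per feature channel from the pooled error.
    pooled = err_aligned.mean(axis=(1, 2))                   # (Ce,)
    channel_gate = sigmoid(w_channel @ pooled)               # (C,)
    return refined * channel_gate[:, None, None]

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8, 8))
err = np.abs(rng.standard_normal((4, 32, 32)))
w_channel = rng.standard_normal((16, 4))
refined = error_guided_refine(feat, err, w_channel)
print(refined.shape)  # (16, 8, 8)
```

In a trained model the gates would be produced by learned layers; the fixed random `w_channel` here only demonstrates the shapes and the data flow.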
LaRE² is evaluated on the GenImage benchmark, which contains 2,681,167 images: 1,331,167 real images and 1,350,000 images generated by 8 different generators. Compared with the best state-of-the-art method, LaRE² improves accuracy (ACC) by up to 11.9% and average precision (AP) by up to 12.1%. Its feature extraction is also 8 times faster than that of existing methods.
The contributions of LaRE² include: (1) a novel feature, LaRE, which is the first reconstruction error in the latent space for generated image detection; (2) a novel module, EGRE, which conducts error-guided feature refinement to enhance the discriminativeness of image features; and (3) superior performance, with LaRE² achieving significant improvements on the GenImage benchmark.
LaRE² outperforms existing methods in terms of feature extraction efficiency and accuracy. It is also more generalizable across different diffusion models and generators. The method is effective in detecting generated images and has the potential to be applied in various real-world scenarios.