2024 | Jie Gao, Zhaoqiang Xia, Gian Luca Marcialis, Chen Dang, Jing Dai, Xiaoyi Feng
This paper addresses the challenge of DeepFake detection in highly compressed content, a significant issue due to the degradation of high-frequency information during compression. The authors propose a novel High-Frequency Enhancement (HiFE) framework that leverages a learnable adaptive high-frequency enhancement network to enrich weak high-frequency information in compressed content without uncompressed data supervision.
The framework consists of three branches: a Basic branch operating in the RGB domain, a Local High-Frequency Enhancement branch based on the Block-wise Discrete Cosine Transform (DCT), and a Global High-Frequency Enhancement branch based on the Multi-level Discrete Wavelet Transform (DWT). The local branch uses DCT coefficients and channel attention mechanisms to achieve adaptive frequency-aware multi-spatial attention, while the global branch supplements high-frequency information by extracting coarse-to-fine multi-scale cues and performing cascade-residual-based multi-level fusion. Additionally, a Two-Stage Cross-Fusion module is designed to integrate the information from all three branches. Experimental results on the FaceForensics++, Celeb-DF, and OpenForensics datasets show that the proposed method outperforms existing state-of-the-art methods, particularly on low-quality data. The code is available on [GitHub](https://github.com/Gina-YUE/HiFFE).
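
To make the two frequency transforms behind the local and global branches more concrete, the following is a minimal sketch in plain Python. It is not the authors' network: the learnable enhancement, the channel attention, and the fusion modules are not shown. It only illustrates how a block-wise DCT exposes high-frequency content per image block and how a multi-level wavelet decomposition yields detail bands at several scales. The function names, the 8x8 block size, the `u + v >= cutoff` definition of "high frequency", and the choice of the Haar wavelet are all assumptions made for illustration; the summary above does not specify them.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product for matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def block_dct(block, d):
    """2D DCT of one square block, computed as D @ block @ D^T."""
    d_t = [list(col) for col in zip(*d)]
    return matmul(matmul(d, block), d_t)


def blockwise_high_freq_energy(image, n=8, cutoff=8):
    """Per-block energy of DCT coefficients with frequency index u + v >= cutoff.

    image is a grayscale image given as a list of rows of floats.
    n=8 and cutoff=8 are illustrative assumptions, not values from the paper.
    """
    d = dct_matrix(n)
    energy_map = []
    for top in range(0, len(image) - n + 1, n):
        row = []
        for left in range(0, len(image[0]) - n + 1, n):
            block = [image[top + i][left:left + n] for i in range(n)]
            coeffs = block_dct(block, d)
            row.append(sum(c * c
                           for u, coeff_row in enumerate(coeffs)
                           for v, c in enumerate(coeff_row)
                           if u + v >= cutoff))
        energy_map.append(row)
    return energy_map


def haar_dwt2(image):
    """One level of the orthonormal 2D Haar transform.

    Returns (approx, row_diff, col_diff, diag). A trailing odd row or column
    is ignored. Sub-band naming conventions vary between libraries.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, len(image) - 1, 2):
        r_a, r_r, r_c, r_d = [], [], [], []
        for j in range(0, len(image[0]) - 1, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r_a.append((a + b + c + d) / 2)  # local average (low-pass)
            r_r.append((a + b - c - d) / 2)  # top row minus bottom row
            r_c.append((a - b + c - d) / 2)  # left column minus right column
            r_d.append((a - b - c + d) / 2)  # diagonal detail
        approx.append(r_a)
        row_diff.append(r_r)
        col_diff.append(r_c)
        diag.append(r_d)
    return approx, row_diff, col_diff, diag


def multilevel_haar(image, levels):
    """Apply haar_dwt2 repeatedly to the approximation band.

    Returns the final approximation and a list of detail-band triples,
    ordered from the finest level to the coarsest.
    """
    details = []
    approx = image
    for _ in range(levels):
        approx, row_diff, col_diff, diag = haar_dwt2(approx)
        details.append((row_diff, col_diff, diag))
    return approx, details
```

On a perfectly flat image, `blockwise_high_freq_energy` returns values that are numerically zero for every block, while on a pixel-level checkerboard almost all of each block's energy falls in the high-frequency band. Heavy compression pushes real images toward the first case, which is the weak high-frequency signal the framework aims to enhance. Reversing the list returned by `multilevel_haar` gives the detail bands in coarse-to-fine order.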