Deepfake Generation and Detection: A Benchmark and Survey

2024 | Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, Dacheng Tao
This survey reviews deepfake generation and detection and summarizes recent advances in both. Deepfake generation, which synthesizes highly realistic facial images and videos, has progressed rapidly with deep learning techniques such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models. These models can produce content that is nearly indistinguishable from real media, with applications in entertainment, movie production, and digital human creation. However, the potential for misuse, such as privacy invasion and phishing attacks, makes effective forgery detection necessary.
The survey discusses four main deepfake generation tasks: face swapping, face reenactment, talking face generation, and facial attribute editing. Face swapping exchanges identities between two images while preserving non-identity attributes. Face reenactment transfers source movements and poses to a target image. Talking face generation creates videos in which a character speaks, driven by a source such as text or audio. Facial attribute editing modifies specific facial attributes, such as age or expression, in a controlled manner.
In addition to generation, the survey covers forgery detection techniques, which aim to identify anomalies or forgeries in images and videos. These techniques have evolved from handcrafted feature-based methods to deep learning-based approaches and, more recently, hybrid detection techniques. The data modality has shifted from the spatial and frequency domains to the more challenging temporal domain, reflecting the increasing complexity of deepfake generation. The survey also discusses related research domains, including head swapping, face super-resolution, face reconstruction, face inpainting, body animation, portrait style transfer, makeup transfer, and adversarial sample detection.
These areas are closely related to deepfake generation and detection and have seen significant advancements in recent years. The survey provides a comprehensive comparison of datasets, metrics, and loss functions used in deepfake generation and detection. It also evaluates the latest and most influential works published in top-tier conferences and journals, particularly recent diffusion-based approaches. The survey highlights the challenges and future research directions in the field, emphasizing the need for continuous evolution of detection technologies to keep pace with the advancements in generation techniques.
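As a rough illustration of what a frequency-domain cue for forgery detection can look like, the sketch below computes an azimuthally averaged log power spectrum of a grayscale image with NumPy. It is not a method taken from the survey: the function name `radial_power_spectrum` and the `n_bins` parameter are hypothetical, and the choice of feature is an assumption made only for illustration.

```python
import numpy as np


def radial_power_spectrum(image: np.ndarray, n_bins: int = 32) -> np.ndarray:
    """Azimuthally averaged log power spectrum of a 2-D grayscale image.

    Hypothetical helper for illustration; not an API from the survey.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")

    # 2-D FFT with the zero-frequency component shifted to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = np.log1p(np.abs(spectrum) ** 2)

    # Distance of every frequency bin from the centre of the spectrum.
    h, w = image.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)

    # Average the log power inside n_bins concentric rings.
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bins + 1)
    ring = np.clip(np.digitize(radius.ravel(), edges) - 1, 0, n_bins - 1)
    sums = np.bincount(ring, weights=power.ravel(), minlength=n_bins)
    counts = np.bincount(ring, minlength=n_bins)
    return sums / np.maximum(counts, 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face_crop = rng.standard_normal((64, 64))  # stand-in for a real face crop
    features = radial_power_spectrum(face_crop, n_bins=16)
    print(features.shape)
```

The resulting vector describes how spectral energy is distributed from low to high frequencies. A detector built on this idea would pass such features, or learned ones, to a classifier trained on real and forged images; that training step is omitted here.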