AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

26 Mar 2024 | Huawei Wei*, Zejun Yang*, and Zhisheng Wang
AniPortrait is a novel framework designed to generate high-quality, audio-driven portrait animations. The method is divided into two stages: the first stage extracts a 3D facial mesh and head pose from audio input, projecting them into a sequence of 2D facial landmarks. The second stage uses a robust diffusion model, coupled with a motion module, to convert these landmarks into temporally consistent and photorealistic portrait animations. Experimental results demonstrate the framework's superiority in terms of facial naturalness, pose diversity, and visual quality. The method also exhibits flexibility and controllability, making it suitable for applications like facial motion editing and face reenactment. The code and model weights are available at https://github.com/Zejun-Yang/AniPortrait.
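The two-stage decomposition described above maps naturally onto a simple data flow: audio features in, per-frame 3D mesh and head pose out, landmarks projected to 2D, frames rendered from landmarks. The sketch below is a minimal, hypothetical outline of that flow in NumPy, not the paper's implementation: every function body is a placeholder for a learned network, and the specific sizes (a 468-vertex face mesh, 768-dimensional audio features) are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

# --- Stage 1: audio -> 3D facial mesh + head pose -> 2D landmark sequence ---

def audio_to_mesh_and_pose(audio_feats):
    """Placeholder for the learned audio-to-(mesh, pose) network.
    Emits a neutral mesh and identity head poses so the pipeline runs
    end to end; the real model predicts these from the audio."""
    T = audio_feats.shape[0]
    mesh = np.zeros((T, 468, 3))            # per-frame 3D face vertices (size assumed)
    poses = np.tile(np.eye(4), (T, 1, 1))   # per-frame rigid head pose (4x4 transform)
    return mesh, poses

def project_landmarks(mesh, poses):
    """Apply each head pose to the mesh and orthographically project
    to 2D, yielding one landmark frame per audio frame."""
    T, N, _ = mesh.shape
    homo = np.concatenate([mesh, np.ones((T, N, 1))], axis=-1)  # homogeneous coords
    world = np.einsum('tij,tnj->tni', poses, homo)              # pose @ vertex, per frame
    return world[..., :2]                                       # drop depth

# --- Stage 2: landmark sequence + reference image -> portrait video ---

def landmarks_to_video(landmarks_2d, reference_image):
    """Placeholder for the stage-2 diffusion model with motion module,
    which renders each landmark frame into a photorealistic, temporally
    consistent portrait frame. Here: dummy static frames."""
    T = landmarks_2d.shape[0]
    return np.repeat(reference_image[None], T, axis=0)

if __name__ == "__main__":
    audio_feats = np.random.randn(30, 768)              # 30 frames of audio features (assumed dim)
    ref = np.zeros((512, 512, 3), dtype=np.uint8)       # reference portrait image
    mesh, poses = audio_to_mesh_and_pose(audio_feats)
    lmk = project_landmarks(mesh, poses)
    video = landmarks_to_video(lmk, ref)
    print(video.shape)                                  # (30, 512, 512, 3)
```

Keeping the intermediate representation as 2D landmarks is what gives the method its flexibility: the same stage-2 renderer can be driven by landmarks extracted from another video (face reenactment) or by edited landmark sequences (facial motion editing).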