CapHuman: Capture Your Moments in Parallel Universes

17 May 2024 | Chao Liang, Fan Ma, Linchao Zhu, Yingying Deng, Yi Yang
CapHuman is a framework for human-centric image synthesis that generates photo-realistic, identity-preserved portraits with diverse head positions, poses, facial expressions, and illuminations. Given a single reference facial photograph, it can generate images of that specific individual with rich content and varied head renditions. The framework is built on the pre-trained text-to-image diffusion model Stable Diffusion and introduces an "encode then learn to align" paradigm, which enables generalizable identity preservation without cumbersome tuning. It also incorporates a 3D facial prior to provide flexible, 3D-consistent head control.
CapHuman is evaluated on a new benchmark, HumanIPHC, where it outperforms established baselines in identity preservation, text-to-image alignment, and head control precision. It can be adapted to other pre-trained models, which allows flexible and diverse image generation, and it supports fine-grained head control while maintaining a high level of prompt control. The results show that CapHuman can generate realistic and diverse human portraits with various head positions, poses, facial expressions, and illuminations in different contexts.
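The summary does not say how identity preservation is scored. As a rough illustration only, one common approach is to compare face-recognition embeddings of the reference photo and the generated portraits using cosine similarity. The sketch below assumes that approach; the function names and the toy three-dimensional embeddings are hypothetical and are not taken from CapHuman or HumanIPHC.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_identity_score(reference, generated):
    """Average similarity of each generated embedding to the reference."""
    if not generated:
        raise ValueError("need at least one generated embedding")
    total = sum(cosine_similarity(reference, g) for g in generated)
    return total / len(generated)


# Hypothetical toy embeddings; real ones would come from a face-recognition
# model applied to the reference photo and to each generated portrait.
reference = [1.0, 0.0, 0.0]
generated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
score = mean_identity_score(reference, generated)
print(f"mean identity score: {score:.2f}")
```

A higher mean score would indicate that the generated portraits stay closer to the reference identity under this assumed metric. The actual benchmark may use a different measure.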