Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos

6 Feb 2024 | Alfredo Rivero*, ShahRukh Athar*, Zhixin Shu, Dimitris Samaras
Rig3DGS is a method for creating controllable 3D human portraits from casual monocular smartphone videos, enabling reanimation with full control over facial expressions, head pose, and novel-view synthesis. The entire scene, subject included, is represented by 3D Gaussians in a canonical space, which are transformed into a deformed space by learned deformations. The key innovation is a carefully designed deformation method guided by a learnable prior derived from a 3D morphable face model (3DMM) mesh; this prior keeps training efficient and ensures photorealistic reanimation that generalizes to novel expressions, head poses, and viewpoints across a variety of captures.

Rig3DGS outperforms prior methods in rendering quality and, owing to the 3D Gaussian representation, is roughly 50 times faster. Evaluated across multiple settings, it shows superior quality and fidelity, producing high-quality photorealistic renders with full control over facial expression and head pose, and it captures fine details of the subject such as hair and glasses. Its main limitations are difficulty modeling strong non-uniform illumination and the requirement that the subject remain relatively still during capture. The work was supported by grants from the CDC/NIOSH and Adobe.
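To make the deformation pipeline concrete, below is a minimal PyTorch sketch of one way a mesh-guided deformation prior could transfer motion from a 3DMM to canonical 3D Gaussians: each Gaussian center borrows the displacements of its nearest mesh vertices, blended with distance-based weights governed by a learnable parameter. This is an illustrative assumption, not the authors' implementation; the class and parameter names (`PriorGuidedDeformation`, `k`, `log_sigma`) are hypothetical, and the actual method learns its deformation prior end to end.

```python
# A minimal sketch of a mesh-guided deformation prior, assuming a simple
# k-nearest-neighbor blending scheme. NOT the Rig3DGS implementation.
import torch
import torch.nn as nn


class PriorGuidedDeformation(nn.Module):
    """Deform canonical 3D Gaussian centers using displacements of a
    driving 3DMM face mesh, blended by learnable distance weights."""

    def __init__(self, k: int = 8):
        super().__init__()
        self.k = k
        # Learnable temperature for the blend weights (an assumption;
        # the paper learns its prior jointly with the scene).
        self.log_sigma = nn.Parameter(torch.zeros(1))

    def forward(
        self,
        gaussians_canon: torch.Tensor,  # (N, 3) canonical Gaussian centers
        mesh_canon: torch.Tensor,       # (V, 3) canonical 3DMM vertices
        mesh_deformed: torch.Tensor,    # (V, 3) vertices under the target
                                        #        expression and head pose
    ) -> torch.Tensor:
        # Per-vertex displacement induced by the target expression/pose.
        vertex_disp = mesh_deformed - mesh_canon                    # (V, 3)

        # k nearest mesh vertices for every Gaussian in canonical space.
        d2 = torch.cdist(gaussians_canon, mesh_canon).pow(2)        # (N, V)
        knn_d2, knn_idx = d2.topk(self.k, dim=1, largest=False)     # (N, k)

        # Distance-weighted blending; sigma is learned.
        sigma2 = self.log_sigma.exp().pow(2)
        w = torch.softmax(-knn_d2 / (2 * sigma2 + 1e-8), dim=1)     # (N, k)

        # Blend neighbor displacements and move each Gaussian.
        disp = (w.unsqueeze(-1) * vertex_disp[knn_idx]).sum(dim=1)  # (N, 3)
        return gaussians_canon + disp


if __name__ == "__main__":
    deform = PriorGuidedDeformation(k=8)
    g = torch.randn(1000, 3)                # toy Gaussian centers
    v0 = torch.randn(500, 3)                # toy canonical mesh
    v1 = v0 + 0.05 * torch.randn_like(v0)   # toy deformed mesh
    print(deform(g, v0, v1).shape)          # torch.Size([1000, 3])
```

The sketch moves only the Gaussian centers; a full pipeline would also update each Gaussian's rotation and scale consistently with the deformation before rasterizing into the deformed space.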