Three-dimensional reconstruction and editing from single images with generative models
Abstract
Advancements in generative networks have significantly improved visual synthesis, particularly in three-dimensional (3D) applications. However, key challenges remain in achieving high-fidelity 3D reconstruction, preserving identity in 3D stylization, and enabling reference-based edits with 3D consistency. This thesis addresses these gaps through three interconnected studies. First, a framework for high-fidelity 3D head reconstruction from single images is introduced, leveraging dual-encoder GAN inversion to reconstruct full 360-degree heads. By integrating an occlusion-aware triplane discriminator, this approach ensures seamless blending of visible and occluded regions, surpassing existing methods in realism and structural accuracy. Next, an identity-preserving 3D head stylization method is developed to balance artistic transformation with facial identity retention. Through multi-view score distillation and likelihood distillation, this technique enhances stylization diversity while maintaining subject-specific features, outperforming prior diffusion-to-GAN adaptation strategies. Finally, a single-image, reference-based, 3D-aware image editing method extends these advancements by enabling precise, high-quality edits using triplane representations. By incorporating automatic feature localization, spatial disentanglement, and fusion learning, this work achieves state-of-the-art performance in 3D-consistent, 2D reference-guided edits across various domains. Together, these contributions advance the field of 3D-aware generative modeling, providing robust solutions for reconstruction, stylization, and editing with greater fidelity, consistency, and control.
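All three contributions operate on triplane representations, in which a 3D scene is encoded as three axis-aligned 2D feature planes that are queried per 3D point. The sketch below is a minimal, generic illustration of that querying step, not code from the thesis; the function name, tensor shapes, and the choice of mean aggregation are assumptions made for clarity.

```python
# Minimal sketch (assumed shapes and names, not the thesis implementation):
# sampling per-point features from an EG3D-style triplane representation.
import torch
import torch.nn.functional as F

def sample_triplane(planes: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
    """planes: (B, 3, C, H, W) axis-aligned feature planes (XY, XZ, YZ).
    points: (B, N, 3) query coordinates, assumed normalized to [-1, 1].
    Returns (B, N, C) aggregated features for each query point."""
    B, _, C, H, W = planes.shape
    # Project each 3D point onto the three planes.
    coords = torch.stack([
        points[..., [0, 1]],  # XY plane
        points[..., [0, 2]],  # XZ plane
        points[..., [1, 2]],  # YZ plane
    ], dim=1)                 # (B, 3, N, 2)
    # grid_sample expects a 4D grid, so flatten batch and plane dimensions.
    grid = coords.reshape(B * 3, -1, 1, 2)
    feats = F.grid_sample(
        planes.reshape(B * 3, C, H, W), grid,
        mode="bilinear", align_corners=False,
    )                                              # (B*3, C, N, 1)
    feats = feats.squeeze(-1).reshape(B, 3, C, -1)
    # Aggregate the three plane features (mean here; summation is also common).
    return feats.mean(dim=1).permute(0, 2, 1)      # (B, N, C)
```

In a 3D-aware generator these per-point features would be decoded to color and density and volume-rendered into an image; the reconstruction, stylization, and editing methods summarized above all manipulate or supervise the triplanes feeding such a sampler.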