12 Jun 2024 | Hanwen Jiang, Qixing Huang, Georgios Pavlakos
Real3D is a novel framework for training Large Reconstruction Models (LRMs) using single-view real-world images, addressing the limitations of existing methods that rely on synthetic data or multi-view captures. The paper introduces a self-training framework that leverages both synthetic and real-world data, incorporating two unsupervised losses at the pixel and semantic levels. An automatic data curation method is developed to select high-quality instances from in-the-wild images, enhancing the model's performance and generalization. Experiments on diverse datasets, including real and synthetic data, show that Real3D consistently outperforms prior work, demonstrating its effectiveness in improving LRM performance and scalability. The key contributions include a novel self-training approach, automatic data curation, and superior performance across various evaluation settings.
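The summary names three ingredients: a supervised signal on synthetic data, two unsupervised losses on real images (pixel level and semantic level), and a curation step that keeps only high-quality instances. The sketch below illustrates how such pieces could fit together; it is not the paper's implementation. The mean-squared-error pixel term, the cosine-distance semantic term, the loss weights, the per-instance `score` field, and the `min_score` threshold are all assumptions made for illustration, since the summary does not specify them.

```python
import math


def pixel_loss(rendered, target):
    """Pixel-level term (assumed here to be mean squared error over flat pixel lists)."""
    if len(rendered) != len(target):
        raise ValueError("images must have the same number of pixels")
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)


def semantic_loss(feat_a, feat_b):
    """Semantic-level term (assumed here to be 1 - cosine similarity of feature vectors)."""
    dot = sum(a * b for a, b in zip(feat_a, feat_b))
    norm = math.sqrt(sum(a * a for a in feat_a)) * math.sqrt(sum(b * b for b in feat_b))
    if norm == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return 1.0 - dot / norm


def total_loss(supervised, pixel, semantic, w_pixel=1.0, w_semantic=1.0):
    """Weighted sum of the synthetic-data loss and the two unsupervised terms.

    The weights are hypothetical hyperparameters, not values from the paper.
    """
    return supervised + w_pixel * pixel + w_semantic * semantic


def curate(instances, min_score=0.5):
    """Keep instances whose hypothetical quality score reaches the threshold."""
    return [inst for inst in instances if inst["score"] >= min_score]
```

In a real pipeline the pixel lists and feature vectors would be tensors produced by the reconstruction model and an image encoder, and the quality score would come from whatever criteria the curation method applies; the plain-Python version only shows the shape of the computation.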