1 Aug 2020 | Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger
Convolutional Occupancy Networks (CONs) are introduced as a flexible implicit representation for detailed 3D reconstruction of objects and scenes. The method combines convolutional encoders with implicit occupancy decoders to incorporate inductive biases like translation equivariance, enabling structured reasoning in 3D space. The model is trained to reconstruct complex geometry from noisy point clouds and low-resolution voxel data, achieving fine-grained implicit 3D reconstruction of single objects, scalability to large indoor scenes, and generalization from synthetic to real data.
The model uses a convolutional encoder to generate features, which are then decoded into occupancy probabilities by a fully-connected network. The encoder first maps the input (e.g., a point cloud or voxel grid) into planar or volumetric feature representations, which convolutional (U-Net-style) networks then process to aggregate local and global information. To predict occupancy at a 3D query location, features are read from the resulting feature planes or volume via bilinear or trilinear interpolation and fed, together with the query point, to the fully-connected occupancy decoder.
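As a concrete illustration of this query path, the sketch below implements a minimal single-plane variant in PyTorch: a small convolutional block (standing in for the 2D U-Net used in the paper) refines a ground-plane feature map, query points are projected onto the plane and sampled with bilinear interpolation via `grid_sample`, and a fully-connected decoder maps the sampled feature together with the query coordinates to an occupancy probability. Layer sizes, the single-plane choice, and the projection convention are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of the convolutional-occupancy query path, assuming a single
# ground-plane feature map (the paper also uses multi-plane and volumetric variants).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PlaneFeatureQuery(nn.Module):
    def __init__(self, feat_dim=32, hidden_dim=64):
        super().__init__()
        # Convolutional stage: a two-layer stand-in for the paper's 2D U-Net.
        self.conv = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        # Fully-connected occupancy decoder: (sampled feature, query point) -> logit.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, plane_feat, query_pts):
        # plane_feat: (B, C, H, W) encoder features scattered onto the xz ground plane
        # query_pts:  (B, N, 3) query locations in normalized coordinates [-0.5, 0.5]^3
        plane_feat = self.conv(plane_feat)
        # Project queries onto the plane and rescale to grid_sample's [-1, 1] range.
        uv = query_pts[..., [0, 2]] * 2.0                       # (B, N, 2)
        grid = uv.unsqueeze(2)                                  # (B, N, 1, 2)
        sampled = F.grid_sample(plane_feat, grid,               # bilinear interpolation
                                align_corners=True)             # (B, C, N, 1)
        sampled = sampled.squeeze(-1).transpose(1, 2)           # (B, N, C)
        logits = self.mlp(torch.cat([sampled, query_pts], dim=-1))
        return torch.sigmoid(logits.squeeze(-1))                # occupancy probabilities

model = PlaneFeatureQuery()
feat = torch.randn(1, 32, 64, 64)      # e.g., point features averaged into plane cells
pts = torch.rand(1, 2048, 3) - 0.5     # random query locations
occ = model(feat, pts)                 # (1, 2048) occupancy probabilities
```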
The method is evaluated on several datasets, including ShapeNet, ScanNet, and Matterport3D, and outperforms baselines such as ONet and PointConv in both object-level and scene-level reconstruction. It reconstructs complex indoor scenes effectively; in these experiments, volumetric features outperform planar features on real-world data, while planar features are more memory-efficient.
The model is also tested on real-world datasets, showing its ability to generalize to unseen classes and novel room layouts. The method is not rotation-equivariant, and it is translation-equivariant only with respect to translations that are integer multiples of the voxel size used for discretization (see the toy sketch below). While a performance gap between synthetic and real data remains, the model demonstrates strong results across 3D reconstruction tasks. The method is implemented in PyTorch and the code is publicly available for further research and development.
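To make the equivariance remark concrete, the toy sketch below uses a naive point-to-voxel scatter as a stand-in for the discretization step (the scatter function, voxel size, and grid resolution are illustrative assumptions, not the paper's encoder): shifting the input by exactly one voxel shifts the resulting grid by one cell, so a convolutional network on top sees an equivalently translated input, whereas a sub-voxel shift redistributes points across voxel boundaries and breaks this correspondence.

```python
# Toy illustration of translation equivariance under voxel-multiple shifts,
# assuming a naive point-to-voxel scatter; sizes are arbitrary.
import torch

def voxel_counts(points, voxel_size=0.1, grid_dim=16):
    """Scatter points into a (grid_dim)^3 grid by counting points per voxel."""
    idx = torch.floor(points / voxel_size).long().clamp(0, grid_dim - 1)
    grid = torch.zeros(grid_dim, grid_dim, grid_dim)
    flat = (idx[:, 0] * grid_dim + idx[:, 1]) * grid_dim + idx[:, 2]
    grid.view(-1).index_add_(0, flat, torch.ones(points.shape[0]))
    return grid

torch.manual_seed(0)
pts = torch.rand(1000, 3) * 0.8 + 0.1   # keep points clear of the grid boundary
v = 0.1                                  # voxel size

# Shifting the input by one full voxel along x shifts the grid by one cell,
# so a convolutional network on top sees an equivalent (translated) input.
shifted = voxel_counts(pts + torch.tensor([v, 0.0, 0.0]))
rolled = torch.roll(voxel_counts(pts), shifts=1, dims=0)
print(torch.equal(shifted, rolled))      # True (barring float boundary cases)

# A half-voxel shift redistributes points across voxel boundaries, so the grid
# is not a shifted copy of the original: equivariance does not hold here.
half = voxel_counts(pts + torch.tensor([v / 2, 0.0, 0.0]))
print(torch.equal(half, voxel_counts(pts)), torch.equal(half, rolled))  # False False
```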