22 Feb 2024 | Chenxi Huang, Yuenan HOU, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang
NeRF-Det++ is a novel approach for indoor multi-view 3D detection that addresses three critical issues in the NeRF-Det framework: semantic ambiguity, inappropriate sampling, and insufficient depth supervision. The method introduces three key components: semantic enhancement, perspective-aware sampling, and ordinal residual depth supervision. Semantic enhancement improves the detector's ability to recognize object categories by incorporating semantic supervision. Perspective-aware sampling focuses more on nearby objects, allowing the detector to allocate more attention to visually rich regions. Ordinal residual depth supervision enhances depth learning by classifying depth bins and regressing residual depth values, leading to more stable depth estimation.
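To make the "classify a depth bin, then regress a residual" idea concrete, here is a minimal sketch of how a depth value could be split into an ordinal bin target and a within-bin residual, and then recovered. It is an illustration under stated assumptions, not the authors' implementation. The uniform bin spacing, the depth range (`d_min`, `d_max`), the number of bins and all function names are hypothetical choices made for this example.

```python
import numpy as np


def encode_depth(depth, d_min=0.0, d_max=6.0, num_bins=12):
    """Split metric depth into (bin index, normalised residual).

    Assumption: bins are uniform over [d_min, d_max]; the summary does not
    state the bin layout, so this is only one plausible choice.
    """
    depth = np.clip(np.asarray(depth, dtype=np.float64), d_min, d_max)
    bin_width = (d_max - d_min) / num_bins
    # Bin index in [0, num_bins - 1]; depth == d_max falls into the last bin.
    bin_idx = np.minimum(((depth - d_min) / bin_width).astype(np.int64),
                         num_bins - 1)
    # Residual in [0, 1]: offset from the bin's lower edge, in bin widths.
    residual = (depth - d_min - bin_idx * bin_width) / bin_width
    return bin_idx, residual


def ordinal_targets(bin_idx, num_bins=12):
    """Ordinal encoding: target[k] = 1 for every bin k at or below the true bin."""
    ks = np.arange(num_bins)
    return (ks[None, :] <= np.asarray(bin_idx)[..., None]).astype(np.float32)


def decode_depth(bin_idx, residual, d_min=0.0, d_max=6.0, num_bins=12):
    """Recover metric depth from a bin index and a normalised residual."""
    bin_width = (d_max - d_min) / num_bins
    return d_min + (np.asarray(bin_idx) + np.asarray(residual)) * bin_width


# Usage: encode a few depths (in metres), then decode them again.
depths = np.array([0.0, 1.3, 6.0, 7.5])
bins, res = encode_depth(depths)
recovered = decode_depth(bins, res)
```

In this sketch the classification target carries the coarse depth and the residual only has to cover a single bin width, which is one way to read the summary's claim that the decomposition gives more stable depth estimation than direct regression.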
The method is evaluated on the ScanNetV2 and ARKitScenes datasets. On ScanNetV2, NeRF-Det++ outperforms NeRF-Det by +1.9% in mAP@0.25 and +3.5% in mAP@0.50. It also performs strongly on ARKitScenes, which supports its effectiveness for indoor 3D detection. The approach adds no computational cost at test time. Together, these results indicate that NeRF-Det++ is a robust and versatile solution for indoor multi-view 3D detection.