OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

21 Mar 2024 | Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia
This paper introduces OA-CNNs, a family of sparse CNNs designed to improve the adaptability of sparse convolutional networks for 3D semantic segmentation. The key innovation lies in two components: spatially adaptive receptive fields and adaptive relation convolution. Together, these let OA-CNNs reach high accuracy at minimal computational cost, surpassing point transformers in both accuracy and efficiency: the networks achieve mIoU scores of 76.1%, 78.9%, and 70.6% on the ScanNet v2, nuScenes, and SemanticKITTI benchmarks, respectively, while running up to 5× faster than their transformer counterparts.

Extensive experiments on both indoor and outdoor scenes show OA-CNNs outperforming state-of-the-art point-based methods built on transformer architectures, demonstrating that well-designed sparse CNNs can surpass transformer-related networks. The method is implemented on top of Pointcept, a codebase for point cloud perception research. The paper also discusses a limitation: the pyramid grid sizes are currently chosen empirically, and future research is needed to develop a more scientifically grounded search algorithm for them. The work is supported by the Shenzhen Science and Technology Program.
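To make the adaptive-receptive-field idea concrete, below is a minimal, illustrative sketch, not the authors' implementation. It uses dense 3D tensors as a stand-in for sparse voxel features: each voxel blends features aggregated at several grid sizes, with per-voxel gating weights deciding how much each receptive field contributes. The class name `AdaptiveReceptiveField3d` and the `grid_sizes` parameter are hypothetical; the paper's actual sparse-convolution operators differ.

```python
# Illustrative sketch (assumed, simplified): per-voxel adaptive blending of
# multiple receptive-field sizes, using dense tensors instead of sparse voxels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveReceptiveField3d(nn.Module):
    def __init__(self, channels: int, grid_sizes=(3, 5, 7)):
        super().__init__()
        # One depthwise conv per pyramid level; larger kernels emulate
        # larger receptive fields on the voxel grid.
        self.branches = nn.ModuleList(
            nn.Conv3d(channels, channels, k, padding=k // 2, groups=channels)
            for k in grid_sizes
        )
        # 1x1x1 conv producing a per-voxel score for each branch.
        self.gate = nn.Conv3d(channels, len(grid_sizes), 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, D, H, W) dense voxel features.
        feats = torch.stack([b(x) for b in self.branches], dim=1)  # (B, K, C, D, H, W)
        w = F.softmax(self.gate(x), dim=1).unsqueeze(2)            # (B, K, 1, D, H, W)
        return (w * feats).sum(dim=1)                              # adaptive blend

# Usage: blend three receptive fields for 32-channel voxel features.
x = torch.randn(2, 32, 16, 16, 16)
y = AdaptiveReceptiveField3d(32)(x)
print(y.shape)  # torch.Size([2, 32, 16, 16, 16])
```

The design choice mirrors the paper's motivation: instead of a single fixed kernel size, each spatial location selects its effective receptive field based on its own features, which is what lets a plain sparse CNN recover the long-range adaptability usually attributed to transformers.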