Point Cloud Mamba: Point Cloud Learning via State Space Model

30 May 2024 | Tao Zhang, Xiangtai Li, Haobo Yuan, Shunping Ji, Shuicheng Yan
The paper introduces Point Cloud Mamba (PCM), a novel architecture that combines local and global modeling to process 3D point cloud data with state space models (SSMs). Specifically, it adopts the Mamba architecture, which offers linear computational complexity and strong global modeling capability. To apply Mamba to 3D point clouds, the authors propose a Consistent Traverse Serialization (CTS) method that converts a point cloud into a 1-D sequence while preserving spatial adjacency. Permuting the order of the x, y, and z coordinates yields six CTS variants, which can be combined to strengthen Mamba's modeling of point cloud features. In addition, order prompts help Mamba cope with the different serialization orders, and a positional encoding based on spatial coordinate mapping injects positional information into the point sequences.

PCM outperforms state-of-the-art (SOTA) point-based methods such as PointNeXt on several datasets, including ScanObjectNN, ModelNet40, ShapeNetPart, and S3DIS. Notably, when combined with the stronger local feature extractor DeLA, PCM reaches 82.6 mIoU on S3DIS, surpassing the two previous best results by 8.5 mIoU and 7.9 mIoU, respectively. The paper also discusses limitations and future directions, including handling large-scale point clouds and extending PCM to outdoor point cloud scenes.
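To make the CTS idea concrete, below is a minimal NumPy sketch of one plausible serialization rule: quantize the coordinates onto a grid and sort points lexicographically by a chosen axis order, so that spatially adjacent points tend to stay adjacent in the resulting 1-D sequence. The function name, the `grid_size` value, and the exact quantization and tie-breaking scheme are illustrative assumptions, not details taken from the paper.

```python
import itertools
import numpy as np

def cts_serialize(points, order=("x", "y", "z"), grid_size=0.02):
    """Sketch of a CTS-style serialization (assumed details).

    points:    (N, 3) array of xyz coordinates.
    order:     axis priority for the traversal, e.g. ("x", "y", "z").
    grid_size: voxel size used to quantize coordinates before sorting (assumption).
    Returns the permutation of point indices defining the 1-D sequence.
    """
    axis = {"x": 0, "y": 1, "z": 2}
    # Quantize coordinates so the sort follows a coarse grid traversal.
    keys = np.floor(points[:, [axis[a] for a in order]] / grid_size).astype(np.int64)
    # np.lexsort treats the last key as the primary key, so reverse the columns
    # to make order[0] the most significant axis.
    return np.lexsort(tuple(keys[:, i] for i in reversed(range(3))))

# The six CTS variants correspond to the six permutations of (x, y, z).
variants = list(itertools.permutations(("x", "y", "z")))

if __name__ == "__main__":
    pts = np.random.rand(1024, 3).astype(np.float32)
    for order in variants:
        seq = pts[cts_serialize(pts, order)]  # 1-D point sequence for a Mamba layer
        print(order, seq.shape)
```

Each variant produces a differently ordered sequence of the same points; combining several variants is what, per the paper, helps Mamba see the cloud from multiple traversal orders.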