PointRWKV is a model for hierarchical point cloud learning, derived from the RWKV model used in natural language processing. It targets the quadratic complexity of transformers, which makes them hard to extend to long sequences and costly to run on limited computational resources. PointRWKV achieves linear complexity by adapting the RWKV architecture to point cloud data. It uses a hierarchical, multi-scale point cloud encoding that aggregates both global and local features. Key components include:
1. **Global Processing**: Modified multi-headed matrix-valued states and a dynamic attention recurrence mechanism model global context across the point cloud.
2. **Local Geometric Features**: A parallel branch encodes the point cloud efficiently in a fixed-radius near-neighbors graph with a graph stabilizer, extracting local geometric features.
3. **Multi-Scale Framework**: The multi-scale design supports hierarchical feature learning of 3D point clouds, which benefits various downstream tasks.
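
Two of the components above can be illustrated with a minimal Python sketch. This is not PointRWKV's implementation. The function names, the scalar `decay` parameter, and the brute-force neighbor search are all assumptions made for illustration. `linear_recurrence` shows how a fixed-size matrix-valued state lets cost grow linearly with sequence length, and `fixed_radius_graph` shows what a fixed-radius near-neighbors graph contains. The graph stabilizer, the multi-head structure, and the dynamic (input-dependent) parts of the recurrence are omitted.

```python
import math


def linear_recurrence(r, k, v, decay=0.9):
    """Simplified matrix-valued-state recurrence (illustrative only).

    r, k, v are lists of T vectors of length d. Each step updates one
    d x d state, so total cost is O(T * d^2): linear in sequence length T.
    The scalar `decay` is an assumption; it stands in for learned decays.
    """
    d = len(k[0])
    state = [[0.0] * d for _ in range(d)]  # d x d matrix-valued state
    outputs = []
    for r_t, k_t, v_t in zip(r, k, v):
        # Decay the previous state and add the outer product of k_t and v_t.
        for i in range(d):
            for j in range(d):
                state[i][j] = decay * state[i][j] + k_t[i] * v_t[j]
        # Read out with the receptance-like vector: out[j] = sum_i r_t[i] * state[i][j].
        outputs.append(
            [sum(r_t[i] * state[i][j] for i in range(d)) for j in range(d)]
        )
    return outputs


def fixed_radius_graph(points, radius):
    """Adjacency lists linking every pair of points at distance <= radius.

    Brute force, O(N^2) pair checks, chosen for clarity. A practical
    version would use a spatial grid or k-d tree instead.
    """
    n = len(points)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(fixed_radius_graph(pts, radius=1.0))
    print(linear_recurrence([[1, 0], [1, 0]], [[1, 0], [0, 1]], [[2, 3], [4, 5]], decay=0.5))
```

Unlike self-attention, the recurrence never forms a T x T score matrix. Each new token only touches the fixed-size state, which is where the linear scaling comes from.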
Experiments on different point cloud learning tasks, such as classification, part segmentation, and few-shot learning, demonstrate that PointRWKV outperforms transformer- and mamba-based models while significantly reducing parameters and FLOPs. The model's efficiency and performance make it a promising alternative for 3D vision tasks.