24 May 2024 | Qingdong He, Jiangning Zhang, Jinlong Peng, Haoyang He, Yabiao Wang, Chengjie Wang
PointRWKV is an efficient RWKV-like model for hierarchical point cloud learning. It avoids the quadratic complexity of traditional transformers and offers linear computational complexity instead. The model is derived from the RWKV architecture, with modifications to suit point cloud tasks: it takes embedded point patches as input and performs global processing with modified multi-headed matrix-valued states and a dynamic attention recurrence. A parallel branch encodes the point cloud efficiently using a fixed-radius near-neighbors graph with a graph stabilizer. PointRWKV is organized as a multi-scale framework for hierarchical feature learning, which supports a range of downstream tasks.

In experiments, PointRWKV outperforms transformer- and Mamba-based models while reducing FLOPs by about 46%, with strong results on 3D object classification, part segmentation, and few-shot learning. Its linear complexity and efficient processing make it suitable for deep point cloud learning and a promising alternative for 3D vision tasks.

The key contributions are the application of RWKV to point cloud learning, a multi-stage hierarchical architecture for feature learning, and a parallel strategy for aggregating local and global features. PointRWKV achieves state-of-the-art performance across various tasks, demonstrating both effectiveness and efficiency in 3D point cloud processing.
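
To make the fixed-radius near-neighbors graph mentioned in the summary concrete, here is a minimal sketch of how such a graph can be built for a small point set. It is not the authors' implementation: the function name `fixed_radius_graph`, the `radius` parameter, and the sample points are assumptions chosen for illustration, and the graph stabilizer is not modeled.

```python
import numpy as np


def fixed_radius_graph(points: np.ndarray, radius: float) -> np.ndarray:
    """Connect every pair of distinct points whose distance is <= radius.

    Hypothetical helper for illustration only. `points` has shape (N, 3);
    the result has shape (2, E), one column per directed edge (source, target).
    """
    # Pairwise difference vectors, shape (N, N, 3).
    diff = points[:, None, :] - points[None, :, :]
    # Squared Euclidean distances, shape (N, N).
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Keep pairs inside the radius, excluding self-loops on the diagonal.
    mask = dist2 <= radius ** 2
    np.fill_diagonal(mask, False)
    src, dst = np.nonzero(mask)
    return np.stack([src, dst], axis=0)


# Four sample points (assumed data): two well-separated pairs.
points = np.array(
    [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 0.4, 0.0]]
)
edges = fixed_radius_graph(points, radius=1.0)
print(edges)
```

This brute-force version compares all pairs of points, so it costs quadratic time and memory in the number of points. It is meant only to show what the graph contains; a practical version would normally use a spatial index such as a grid or k-d tree.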