SO-Net is a permutation-invariant architecture for deep learning on point clouds. It models the spatial distribution of a point cloud with a Self-Organizing Map (SOM) and performs hierarchical feature extraction on individual points and SOM nodes. The network's receptive field can be adjusted systematically through a point-to-node k-nearest-neighbor (kNN) search: each point is associated with its k nearest SOM nodes, so a larger k gives every node a larger, more overlapping neighborhood. SO-Net achieves performance comparable to or better than state-of-the-art approaches in point cloud reconstruction, classification, object part segmentation, and shape retrieval, and it trains faster thanks to its parallelizability and simplicity. The encoder combines the SOM with the kNN search to aggregate features efficiently while preserving permutation invariance. The decoder, designed for point cloud reconstruction, has two branches for flexibility. Pre-training the network as a point cloud autoencoder improves its performance. Experiments show that SO-Net outperforms existing methods in classification and segmentation tasks and is robust to corruption of both the input points and the SOM. Its design allows efficient training and inference, making it suitable for a range of 3D data applications.
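The point-to-node kNN search and the permutation-invariant aggregation described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed (already trained) SOM node positions, and the single random linear layer standing in for a learned per-point network are all assumptions made for illustration.

```python
import numpy as np


def point_to_node_knn(points, nodes, k):
    """For each point, return the indices of its k nearest SOM nodes."""
    # Pairwise squared distances between points and nodes, shape (N, M).
    d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k smallest distances per point, shape (N, k).
    return np.argsort(d2, axis=1)[:, :k]


def node_features(points, nodes, k, weight):
    """Max-pool per-point features onto the SOM nodes each point is tied to."""
    idx = point_to_node_knn(points, nodes, k)            # (N, k)
    # Express every point relative to each of its k associated nodes.
    offsets = points[:, None, :] - nodes[idx]            # (N, k, 3)
    # Hypothetical stand-in for a shared per-point network: linear + ReLU.
    feats = np.maximum(offsets @ weight, 0.0)            # (N, k, C)
    # Channel-wise max over all points associated with each node.
    # Features are non-negative after ReLU, so zeros are a safe initial value
    # (a node with no associated points keeps an all-zero feature).
    channels = weight.shape[1]
    pooled = np.zeros((nodes.shape[0], channels))
    np.maximum.at(pooled, idx.ravel(), feats.reshape(-1, channels))
    return pooled                                        # (M, C)


# Usage: shuffling the input points leaves the node features unchanged.
rng = np.random.default_rng(0)
points = rng.normal(size=(256, 3))      # toy point cloud
nodes = rng.normal(size=(16, 3))        # assumed SOM node positions
weight = rng.normal(size=(3, 8))        # assumed feature weights

original = node_features(points, nodes, k=3, weight=weight)
shuffled = node_features(points[rng.permutation(256)], nodes, k=3, weight=weight)
print(np.allclose(original, shuffled))
```

Because the max over each node's associated points does not depend on the order in which the points arrive, the pooled node features are the same for any permutation of the input. Raising `k` associates each point with more nodes, which is the sense in which the kNN search controls the receptive field.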