PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution


16 Mar 2024 | Honghao Chen, Xiangxiang Chu, Yongjian Ren, Xin Zhao, Kaiqi Huang
This paper proposes PeLK, a convolutional neural network architecture built on a peripheral convolution that sharply reduces parameter complexity while preserving the performance of large-kernel convolutions. The design is inspired by human peripheral vision, in which fine-grained processing is concentrated at the center of the visual field and resolution falls off toward the periphery. By mimicking this mechanism, the peripheral convolution reduces the parameter complexity of dense grid convolution from O(K²) to O(log K), enabling extremely large kernel sizes without a significant increase in model size.

Built on the peripheral convolution, PeLK outperforms modern vision Transformers and ConvNet architectures such as Swin, ConvNeXt, RepLKNet, and SLaK on a range of vision tasks, including ImageNet classification, semantic segmentation on ADE20K, and object detection on MS COCO. PeLK scales the kernel size of CNNs to an unprecedented 101×101 and shows consistent improvements as the kernel grows.

The paper also compares dense grid convolution with stripe convolution and finds that dense grid convolution consistently outperforms the stripe form across multiple kernel sizes, indicating an essential advantage of dense convolution. However, the quadratic complexity of large dense convolution leads to a proliferation of parameters, causing rapidly increasing model size and greater optimization difficulty. To address this, the peripheral convolution keeps the dense computational form while reducing parameter count through three key designs: i) a focus-and-blur mechanism, ii) exponentially increasing sharing granularity, and iii) kernel-wise positional embedding. Together, these designs make it practical to build large dense-kernel ConvNets with strong performance.

Extensive experiments show that PeLK achieves state-of-the-art results across a variety of vision tasks, demonstrating the potential of a pure CNN architecture equipped with extremely large kernels. An analysis of the effective receptive field (ERF) shows that PeLK has a much larger ERF than previous large-kernel paradigms, which is believed to contribute to its strong performance. Overall, the paper presents a human-vision-inspired route to large-kernel ConvNets: the peripheral convolution keeps dense computation while cutting parameters, enabling extremely large kernels with strong performance.