Wavelet Convolutions for Large Receptive Fields

15 Jul 2024 | Shahaf E. Finder, Roy Amoyal, Eran Treister, and Oren Freifeld
This paper introduces WTConv, a wavelet-based convolutional layer that achieves large receptive fields without a significant increase in trainable parameters. Rather than simply enlarging kernel sizes, which leads to over-parameterization and diminishing returns, WTConv uses the Wavelet Transform (WT) to decompose the input into different frequency bands and performs small-kernel convolutions on each band. This allows the layer to capture low-frequency information more effectively, improving shape bias and robustness to image corruption.

WTConv serves as a drop-in replacement for depth-wise convolutions in existing architectures such as ConvNeXt and MobileNetV2, improving performance in image classification, semantic segmentation, and object detection. Because each additional decomposition level adds a fixed number of small-kernel weights while doubling the effective receptive field, the layer's parameter count grows only logarithmically with receptive-field size, in contrast to methods that scale quadratically with kernel size. The results show state-of-the-art performance on the ImageNet-1K, ADE20K, and COCO benchmarks, with gains in classification accuracy, mIoU, and AP. The paper further demonstrates improved scalability, robustness to various types of image corruption, and a stronger response to shape over texture. The method is implemented in Python and is available at https://github.com/BGU-CS-VIL/WTConv.
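To make the mechanism concrete, below is a minimal single-level sketch in PyTorch, assuming an orthonormal Haar wavelet. The names here (WTConvSketch, haar_filters) are illustrative, not the authors' API; see the linked repository for the reference implementation. Analysis is a fixed stride-2 grouped convolution, each of the four sub-bands then gets its own small depth-wise convolution, and a transposed convolution with the same orthonormal filters inverts the transform.

```python
# Minimal single-level sketch of a wavelet convolution layer (assumptions:
# PyTorch, orthonormal Haar wavelet, even input height/width). Class and
# helper names are hypothetical, not the authors' API.
import torch
import torch.nn as nn
import torch.nn.functional as F

def haar_filters() -> torch.Tensor:
    """Return the four 2x2 orthonormal Haar analysis filters (LL, LH, HL, HH)."""
    lo = torch.tensor([1.0, 1.0]) / 2.0 ** 0.5   # low-pass
    hi = torch.tensor([1.0, -1.0]) / 2.0 ** 0.5  # high-pass
    return torch.stack([
        torch.outer(lo, lo),  # LL: low-pass in both directions
        torch.outer(lo, hi),  # LH: horizontal detail
        torch.outer(hi, lo),  # HL: vertical detail
        torch.outer(hi, hi),  # HH: diagonal detail
    ])  # shape (4, 2, 2)

class WTConvSketch(nn.Module):
    """WT -> small depth-wise convs per sub-band -> inverse WT.

    Operates per channel, so it can stand in for a depth-wise convolution.
    """
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.channels = channels
        # One fixed analysis bank per channel: (4*C, 1, 2, 2) for grouped conv.
        filt = haar_filters().unsqueeze(1).repeat(channels, 1, 1, 1)
        self.register_buffer("filt", filt)
        # Learned small depth-wise conv on each of the 4 sub-bands per channel.
        self.band_conv = nn.Conv2d(4 * channels, 4 * channels, kernel_size,
                                   padding=kernel_size // 2,
                                   groups=4 * channels, bias=False)
        # Learned depth-wise conv at the original resolution (base path).
        self.base_conv = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2,
                                   groups=channels, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Analysis: stride-2 grouped conv computes the per-channel Haar WT.
        bands = F.conv2d(x, self.filt, stride=2, groups=self.channels)
        bands = self.band_conv(bands)
        # Synthesis: a transposed conv with the same orthonormal bank is the
        # exact inverse of the analysis step.
        y = F.conv_transpose2d(bands, self.filt, stride=2, groups=self.channels)
        return self.base_conv(x) + y

# Drop-in usage in place of a depth-wise convolution:
x = torch.randn(1, 64, 32, 32)
print(WTConvSketch(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```

Cascading this construction by recursively decomposing the LL band clarifies the parameter scaling: each extra level adds a fixed 4·k²·C depth-wise weights, while the k×k kernels at level i act on a feature map downsampled by 2^i, so the effective receptive field roughly doubles per level. Parameters therefore grow linearly in the number of levels, i.e., logarithmically in the receptive field, versus quadratic growth when kernels are enlarged directly.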