21 May 2024 | Linwei Chen, Lin Gu, Dezhi Zheng, Ying Fu
Frequency-Adaptive Dilated Convolution for Semantic Segmentation
This paper proposes Frequency-Adaptive Dilated Convolution (FADC), a method that improves dilated convolution by analyzing the frequency spectrum. FADC introduces three strategies: Adaptive Dilation Rate (AdaDR), Adaptive Kernel (AdaKern), and Frequency Selection (FreqSelect). AdaDR dynamically adjusts dilation rates based on local frequency components, enhancing spatial adaptability. AdaKern decomposes convolution weights into low- and high-frequency components, dynamically adjusting their ratio to improve effective bandwidth. FreqSelect balances high- and low-frequency components in feature representations through spatially variant reweighting, encouraging FADC to learn larger dilations and expand the receptive field.
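The AdaKern decomposition described above can be illustrated with a minimal sketch. It assumes a single 2D kernel, takes the kernel mean as the low-frequency part and the residual as the high-frequency part, and reweights the two with hand-set scalars. The names `split_kernel`, `ada_kern`, `lam_low`, and `lam_high` are hypothetical, and the scalar weights stand in for whatever the method predicts dynamically from the input; this is not the authors' implementation.

```python
from typing import List, Tuple

Kernel = List[List[float]]


def split_kernel(weight: Kernel) -> Tuple[Kernel, Kernel]:
    """Split a 2D kernel into a flat mean part and a zero-mean residual."""
    count = sum(len(row) for row in weight)
    mean = sum(sum(row) for row in weight) / count
    low = [[mean for _ in row] for row in weight]        # low-frequency part
    high = [[w - mean for w in row] for row in weight]   # high-frequency part
    return low, high


def ada_kern(weight: Kernel, lam_low: float, lam_high: float) -> Kernel:
    """Recombine the two parts with illustrative (hypothetical) scalar ratios."""
    low, high = split_kernel(weight)
    return [
        [lam_low * l + lam_high * h for l, h in zip(low_row, high_row)]
        for low_row, high_row in zip(low, high)
    ]


kernel = [[1.0, 2.0, 1.0],
          [2.0, 4.0, 2.0],
          [1.0, 2.0, 1.0]]

smoother = ada_kern(kernel, lam_low=1.0, lam_high=0.5)  # damp high frequencies
sharper = ada_kern(kernel, lam_low=1.0, lam_high=2.0)   # boost high frequencies
```

Setting both ratios to 1 recovers the original kernel, while setting `lam_high` to 0 leaves a pure averaging filter, which is one way to see how the ratio shifts the kernel's effective bandwidth.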
FADC is evaluated on semantic segmentation and object detection, with reported gains on both. On the Cityscapes dataset, applying FADC to PIDNet achieves 81.0 mIoU at 37.7 FPS. The proposed strategies also carry over to related operators, including deformable convolution and dilated attention. By adjusting dilation rates according to local frequency, the method reduces aliasing artifacts, leading to more accurate and consistent predictions.
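The link between dilation and aliasing can be made concrete with a small 1D sketch. A dilation rate D samples the input every D positions, so content above 1/(2D) cycles per sample cannot be represented and aliases. The code below measures how much of a local signal's energy lies above that limit and picks the largest candidate dilation that keeps it under a threshold. The functions `power_spectrum`, `high_freq_ratio`, and `pick_dilation`, the candidate set, and the `max_ratio` threshold are all assumptions for illustration; the paper's AdaDR adjusts dilation dynamically inside the network, whereas this is a hand-written heuristic.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def power_spectrum(signal: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (frequency in cycles/sample, power) pairs from a naive DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append((k / n, abs(coeff) ** 2))
    return spectrum


def high_freq_ratio(signal: Sequence[float], dilation: int) -> float:
    """Fraction of non-DC energy above the Nyquist limit 1/(2*dilation)."""
    nyquist = 1.0 / (2 * dilation)
    spectrum = power_spectrum(signal)[1:]  # drop the DC term
    total = sum(p for _, p in spectrum)
    if total == 0:
        return 0.0
    return sum(p for f, p in spectrum if f > nyquist) / total


def pick_dilation(signal: Sequence[float],
                  candidates: Sequence[int] = (1, 2, 4),
                  max_ratio: float = 0.1) -> int:
    """Largest candidate dilation whose aliased energy stays under max_ratio."""
    best = min(candidates)
    for d in sorted(candidates):
        if high_freq_ratio(signal, d) <= max_ratio:
            best = d
    return best


n = 64
smooth = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]     # low frequency
textured = [math.sin(2 * math.pi * 24 * t / n) for t in range(n)]  # high frequency

d_smooth = pick_dilation(smooth)      # smooth region tolerates a large dilation
d_textured = pick_dilation(textured)  # detailed region needs a small dilation
```

In this toy setting the smooth signal is assigned a larger dilation than the textured one, which mirrors the stated idea of using large receptive fields where the local frequency is low and small dilations where fine detail would otherwise alias.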
The proposed approach is validated through extensive experiments, demonstrating its effectiveness in enhancing the performance of semantic segmentation and object detection. The code is publicly available at https://github.com/ying-fu/FADC.