21 May 2024 | Linwei Chen, Lin Gu, Dezhi Zheng, Ying Fu
The paper introduces Frequency-Adaptive Dilated Convolution (FADC) to enhance the performance of dilated convolution in semantic segmentation and object detection tasks. FADC consists of three key strategies: Adaptive Dilation Rate (AdaDR), Adaptive Kernel (AdaKern), and Frequency Selection (FreqSelect). AdaDR dynamically adjusts dilation rates based on local frequency components, balancing effective bandwidth and receptive field size. AdaKern decomposes convolution weights into low-frequency and high-frequency components, dynamically adjusting their ratio to capture more high-frequency information. FreqSelect balances high- and low-frequency components in feature representations through spatially variant reweighting, encouraging larger receptive fields. Extensive experiments on datasets like Cityscapes and ADE20K validate the effectiveness of FADC, demonstrating improvements in mIoU and real-time performance. The method also integrates well with deformable convolutions and dilated attention mechanisms, further enhancing performance in various tasks.
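
The AdaKern idea described above, splitting a kernel into low- and high-frequency parts and changing their ratio, can be illustrated with a minimal sketch. It assumes that the low-frequency part is the kernel's mean value and the high-frequency part is the residual; the summary does not spell out the decomposition, so this is one plausible reading rather than the authors' implementation. The function names and the fixed `low_scale` and `high_scale` values are hypothetical; in the method itself the ratio is adjusted dynamically.

```python
def split_kernel(kernel):
    """Split a 2-D kernel into a constant (mean) part and a zero-mean residual.

    Assumption: the mean stands in for the low-frequency component and the
    residual for the high-frequency component.
    """
    values = [v for row in kernel for v in row]
    mean = sum(values) / len(values)
    low = [[mean for _ in row] for row in kernel]
    high = [[v - mean for v in row] for row in kernel]
    return low, high


def reweight_kernel(kernel, low_scale, high_scale):
    """Recombine the two parts with separate scales (hypothetical fixed values)."""
    low, high = split_kernel(kernel)
    return [
        [low_scale * l + high_scale * h for l, h in zip(low_row, high_row)]
        for low_row, high_row in zip(low, high)
    ]


kernel = [
    [0.0, 1.0, 0.0],
    [1.0, 4.0, 1.0],
    [0.0, 1.0, 0.0],
]

# Doubling the residual emphasizes high-frequency content while the
# mean (and therefore the kernel's sum) stays the same.
sharpened = reweight_kernel(kernel, low_scale=1.0, high_scale=2.0)
```

With both scales set to 1.0 the original kernel is recovered, which makes the reweighting a strict generalization of the static kernel in this sketch.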