29 Mar 2024 | Xu Ma, Xiyang Dai, Jianwei Yang, Bin Xiao, Yinpeng Chen, Yun Fu, Lu Yuan
This paper introduces Efficient Modulation (EfficientMod), a novel design for efficient vision networks. The authors revisit the modulation mechanism, which processes the input through convolutional context modeling and feature projection layers, then fuses the resulting features via element-wise multiplication and an MLP block. Building on this, they propose the EfficientMod block as the essential building block of their networks, designed to be more efficient than existing methods while maintaining or improving performance. The authors report that the network achieves better trade-offs between accuracy and efficiency and sets new state-of-the-art performance among efficient networks. When integrated with self-attention blocks, the hybrid architecture further improves performance without losing efficiency. Comprehensive experiments validate the effectiveness and efficiency of EfficientMod on image classification, object detection, and semantic segmentation. The code and checkpoints are available at <https://github.com/ma-xu/EfficientMod>.
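
To make the modulation mechanism concrete, here is a minimal NumPy sketch of the data flow the summary describes: a convolutional context branch, a feature projection branch, element-wise multiplication, and an MLP. It is not the authors' implementation. The 3x3 depthwise kernel, the GELU activations, the residual connection, the channels-last layout, and all names (`modulation_block`, `init_params`, `w_ctx`, `w_value`, `w_mlp1`, `w_mlp2`) are assumptions made for illustration; the linked repository holds the actual design.

```python
import numpy as np


def depthwise_conv3x3(x, kernel):
    """Depthwise 3x3 convolution with zero padding.

    x: feature map of shape (H, W, C); kernel: shape (3, 3, C).
    Each channel is filtered independently with its own 3x3 kernel.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :] * kernel[dy, dx, :]
    return out


def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def init_params(channels, hidden, seed=0):
    """Small random weights for the sketch (hypothetical parameter names)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "dw_kernel": scale * rng.standard_normal((3, 3, channels)),
        "w_ctx": scale * rng.standard_normal((channels, channels)),
        "w_value": scale * rng.standard_normal((channels, channels)),
        "w_mlp1": scale * rng.standard_normal((channels, hidden)),
        "w_mlp2": scale * rng.standard_normal((hidden, channels)),
    }


def modulation_block(x, params):
    """Illustrative modulation block on a (H, W, C) feature map."""
    # Context modeling: depthwise conv gathers local spatial context,
    # then a per-pixel linear layer mixes channels.
    ctx = depthwise_conv3x3(x, params["dw_kernel"])
    ctx = gelu(ctx @ params["w_ctx"])
    # Feature projection: per-pixel linear projection of the input.
    value = x @ params["w_value"]
    # Modulation: fuse the two branches by element-wise multiplication.
    fused = ctx * value
    # MLP on the fused features, plus an (assumed) residual connection.
    hidden = gelu(fused @ params["w_mlp1"])
    return x + hidden @ params["w_mlp2"]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((8, 8, 16))
    y = modulation_block(x, init_params(channels=16, hidden=32))
    print(y.shape)
```

In this sketch the context branch acts as a spatially aware gate on the projected features, so no pairwise attention matrix is ever computed; the per-pixel cost is driven by the depthwise convolution and the linear layers.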