17 Jul 2024 | Xiaojiao Guo, Xuhang Chen, Shenghong Luo, Shuqiang Wang, and Chi-Man Pun
The Dual-Hybrid Attention Network for Specular Highlight Removal (DHAN-SHR) is an end-to-end network that removes specular highlights from images and videos, improving their quality and interpretability for downstream tasks. It introduces hybrid attention mechanisms that capture and process information across different scales and domains without relying on additional priors or supervision. DHAN-SHR has two key components: the Adaptive Local Hybrid-Domain Dual Attention Transformer (L-HD-DAT) and the Adaptive Global Dual Attention Transformer (G-DAT). L-HD-DAT captures local inter-channel and inter-pixel dependencies while incorporating spectral-domain features, which lets the network model the complex interactions between specular highlights and the underlying surface properties. G-DAT models global inter-channel relationships and long-distance pixel dependencies, so contextual information propagates across the entire image and the highlight-free results are more coherent and consistent. Together, the two components capture and process features at different scales and semantic levels.

To evaluate DHAN-SHR, a large-scale benchmark dataset was compiled, comprising images from three different datasets. Extensive experiments, using metrics such as PSNR, SSIM, and LPIPS, show that DHAN-SHR outperforms 18 state-of-the-art methods both quantitatively and qualitatively and achieves superior performance compared to traditional and learning-based approaches. Visual comparisons show that the network preserves the original image's color tone, structure, and crucial details while removing specular highlights. The code and dataset are available for future research.
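Of the metrics named above, PSNR is the simplest to state: it is 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the predicted image and MAX is the largest possible pixel value. The following is a minimal sketch of that formula in plain Python, not the evaluation code used in the study. The function name `psnr`, the flat-list image representation, and the default `max_value=255.0` (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumption: images are given as flat sequences of numbers and
    max_value is the largest representable pixel value (255 for 8-bit).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [100, 150, 200, 250]
    prediction = [102, 148, 200, 250]
    print(f"PSNR: {psnr(ground_truth, prediction):.2f} dB")
```

Higher PSNR means the prediction is closer to the highlight-free reference. SSIM and LPIPS measure structural and perceptual similarity respectively and are normally computed with dedicated libraries rather than by hand.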
Ablation studies confirm the importance of the network's key components in achieving optimal performance. The study concludes that DHAN-SHR is a significant advancement in specular highlight removal, offering a robust solution for real-world applications.
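The summary distinguishes inter-pixel attention (each pixel attends to other pixels) from inter-channel attention (each feature channel attends to other channels). The sketch below illustrates only that distinction on a tiny feature map stored as C rows (channels) by N columns (pixels). It is not the DHAN-SHR implementation: it uses no learned projections (queries, keys, and values are all the raw features), it attends over every token rather than over local windows, it omits the spectral-domain branch, and the additive fusion in `dual_attention` is a hypothetical choice. All function names are assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Simplification: no learned weight matrices are applied.
    """
    dim = len(tokens[0])
    output = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output token is a weighted average of all tokens.
        output.append(
            [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        )
    return output


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def channel_attention(features):
    """Inter-channel branch: the tokens are the C channel rows."""
    return self_attention(features)


def pixel_attention(features):
    """Inter-pixel branch: the tokens are the N pixel columns."""
    return transpose(self_attention(transpose(features)))


def dual_attention(features):
    """Hypothetical fusion: element-wise sum of the two branches."""
    by_channel = channel_attention(features)
    by_pixel = pixel_attention(features)
    return [
        [c + p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(by_channel, by_pixel)
    ]


if __name__ == "__main__":
    feature_map = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0]]  # C = 2 channels, N = 3 pixels
    for row in dual_attention(feature_map):
        print(row)
```

A local variant in the spirit of L-HD-DAT would restrict the tokens to a window of neighbouring pixels, while a global variant in the spirit of G-DAT would let every token see the whole image, as this sketch does.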