The paper introduces the Dual-Hybrid Attention Network for Specular Highlight Removal (DHAN-SHR), an end-to-end network designed to effectively remove specular highlights from images and videos. The network incorporates two key components: the Adaptive Local Hybrid-Domain Dual Attention Transformer (L-HD-DAT) and the Adaptive Global Dual Attention Transformer (G-DAT). L-HD-DAT captures local inter-channel and inter-pixel dependencies while incorporating spectral domain features, enabling the network to model complex interactions between specular highlights and underlying surface properties. G-DAT models global inter-channel relationships and long-distance pixel dependencies, allowing the network to propagate contextual information across the entire image and generate more coherent and consistent highlight-free results.
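The distinction between inter-channel and inter-pixel attention, and the idea of adding spectral-domain features, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the unlearned dot-product attention (no query/key/value projections), and the use of FFT amplitude as the spectral feature are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat):
    # Hypothetical inter-channel attention: each channel is one token,
    # so the attention map has shape (C, C).
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    attn = softmax(x @ x.T / np.sqrt(h * w), axis=-1)
    return (attn @ x).reshape(c, h, w)

def pixel_attention(feat):
    # Hypothetical inter-pixel attention: each pixel is one token,
    # so the attention map has shape (H*W, H*W).
    c, h, w = feat.shape
    x = feat.reshape(c, h * w).T
    attn = softmax(x @ x.T / np.sqrt(c), axis=-1)
    return (attn @ x).T.reshape(c, h, w)

def spectral_features(feat):
    # Assumed stand-in for spectral-domain features:
    # per-channel amplitude of the 2D FFT.
    return np.abs(np.fft.fft2(feat, axes=(-2, -1)))

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))   # (channels, height, width)

# "Dual" attention in this sketch: sum of the channel and pixel branches.
dual = channel_attention(feat) + pixel_attention(feat)
spec = spectral_features(feat)
```

In this sketch, applying the two attention functions to a small window of the feature map would correspond loosely to the local case, and applying them to the whole map to the global case; how DHAN-SHR actually partitions, weights, and fuses these branches is not specified in the text above.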
To evaluate the performance of DHAN-SHR, the authors compiled a large-scale benchmark dataset combining images from three different highlight removal datasets (PSD, SHIQ, and SSHR). Extensive experiments demonstrate that DHAN-SHR outperforms 18 state-of-the-art methods in both quantitative and qualitative evaluations, setting a new standard for specular highlight removal in multimedia applications. The paper also includes ablation studies to validate the effectiveness of each component of the network and a user study to assess the perceptual quality of the highlight removal results.