DOI: 10.1145/3664647.3680745
research-article
Open access

Dual-Hybrid Attention Network for Specular Highlight Removal

Published: 28 October 2024

Abstract

Specular highlight removal plays a pivotal role in multimedia applications, as it enhances the quality and interpretability of images and videos, ultimately improving the performance of downstream tasks such as content-based retrieval, object recognition, and scene understanding. Despite significant advances in deep learning-based methods, current state-of-the-art approaches often rely on additional priors or supervision, limiting their practicality and generalization capability. In this paper, we propose the Dual-Hybrid Attention Network for Specular Highlight Removal (DHAN-SHR), an end-to-end network that introduces novel hybrid attention mechanisms to effectively capture and process information across different scales and domains without relying on additional priors or supervision. DHAN-SHR consists of two key components: the Adaptive Local Hybrid-Domain Dual Attention Transformer (L-HD-DAT) and the Adaptive Global Dual Attention Transformer (G-DAT). The L-HD-DAT captures local inter-channel and inter-pixel dependencies while incorporating spectral domain features, enabling the network to effectively model the complex interactions between specular highlights and the underlying surface properties. The G-DAT models global inter-channel relationships and long-distance pixel dependencies, allowing the network to propagate contextual information across the entire image and generate more coherent and consistent highlight-free results. To evaluate the performance of DHAN-SHR and facilitate future research in this area, we compile a large-scale benchmark dataset comprising a diverse range of images with varying levels of specular highlights. Through extensive experiments, we demonstrate that DHAN-SHR outperforms 18 state-of-the-art methods both quantitatively and qualitatively, setting a new standard for specular highlight removal in multimedia applications. The code and dataset are available at https://github.com/CXH-Research/DHAN-SHR.
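
The abstract describes two kinds of attention (inter-channel and inter-pixel) applied at a local and a global scale, with a spectral-domain branch on the local side. The NumPy sketch below illustrates that general idea only; it is not the authors' implementation, which is available in the linked repository. The function names, the window size `win`, the per-channel spectral `gain`, the residual sums, and the absence of learned projections are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat):
    # Inter-channel attention: a (C, C) map computed over flattened pixels.
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)
    attn = softmax(x @ x.T / np.sqrt(h * w), axis=-1)
    return (attn @ x).reshape(c, h, w)

def pixel_attention(feat):
    # Inter-pixel attention: an (N, N) map over N = H * W positions.
    c, h, w = feat.shape
    x = feat.reshape(c, h * w).T                      # (N, C)
    attn = softmax(x @ x.T / np.sqrt(c), axis=-1)
    return (attn @ x).T.reshape(c, h, w)

def spectral_branch(feat, gain):
    # Hypothetical spectral-domain step: scale each channel's 2-D Fourier
    # coefficients by a per-channel gain, then transform back.
    spec = np.fft.rfft2(feat, axes=(-2, -1))
    spec = spec * gain[:, None, None]
    return np.fft.irfft2(spec, s=feat.shape[-2:], axes=(-2, -1))

def local_hybrid_block(feat, win, gain):
    # Local scale: both attentions run inside non-overlapping windows,
    # and the spectral branch is added as a residual (assumed design).
    c, h, w = feat.shape
    out = np.empty_like(feat)
    for i in range(0, h, win):
        for j in range(0, w, win):
            patch = feat[:, i:i + win, j:j + win]
            out[:, i:i + win, j:j + win] = (
                channel_attention(patch) + pixel_attention(patch)
            )
    return feat + out + spectral_branch(feat, gain)

def global_block(feat):
    # Global scale: both attentions see the whole feature map, so the
    # pixel attention can relate distant positions.
    return feat + channel_attention(feat) + pixel_attention(feat)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))           # (C, H, W) toy features
    local = local_hybrid_block(feat, win=4, gain=np.ones(8))
    print(local.shape, global_block(local).shape)
```

The windowed loop keeps the local pixel-attention map at `win * win` positions per window, while the global block builds a single map over all `H * W` positions. That cost difference is one common reason to run full-image attention on reduced-resolution features, though the abstract does not say how DHAN-SHR handles it.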

Cited By

  • (2024) UIE-UnFold: Deep Unfolding Network with Color Priors and Vision Transformer for Underwater Image Enhancement. 2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA), 1-10. DOI: 10.1109/DSAA61799.2024.10722842. Online publication date: 6-Oct-2024

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024
11719 pages
ISBN:9798400706868
DOI:10.1145/3664647
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dual-hybrid attention
  2. spatial and spectral
  3. specular highlight removal

Qualifiers

  • Research-article

Conference

MM '24
MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (Last 12 months): 90
  • Downloads (Last 6 weeks): 79
Reflects downloads up to 14 Dec 2024
