Abstract
In this paper, we propose an efficient frequency-guided image deraining transformer, called former, that exploits more informative self-attention values in the frequency domain for better image deraining. Inspired by the classical convolution theorem, we design a frequency-domain guidance attention to learn rich global and local dependencies. First, we employ affine coupling to implicitly enlarge the receptive field, enabling the capture of multi-scale spatial feature representations, which are then transferred to the frequency domain via the Fourier transform. Instead of vanilla attention, we adopt an element-wise product to model global frequency information, yielding better feature aggregation at reduced spatial complexity. Because conventional feed-forward networks struggle with frequency information, we introduce an adaptive frequency collaborative block that adaptively learns frequency information and integrates local spatial information for improved image restoration. Moreover, a scale feature enhancement block is designed to exchange and aggregate information across scales, learning mixed features of various scales. Extensive experiments on commonly used benchmark datasets demonstrate that our method outperforms competitive methods.
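The key efficiency argument above rests on the convolution theorem: an element-wise product of spectra corresponds to a circular convolution in the spatial domain, so global interactions can be modeled in O(HW log HW) rather than the O((HW)^2) cost of vanilla attention. The following NumPy sketch illustrates this idea only; it is not the authors' implementation, and the function name, normalization, and shapes are illustrative assumptions.

```python
import numpy as np

def frequency_domain_attention(q, k, v):
    """Illustrative sketch of element-wise product attention in the
    frequency domain (convolution-theorem style), not the paper's code.

    q, k, v: feature maps of shape (C, H, W).
    Multiplying the spectra of q and k element-wise and inverting the
    FFT yields a global (circular) correlation map at O(HW log HW) cost,
    which is then used to modulate the value features v.
    """
    Fq = np.fft.fft2(q, axes=(-2, -1))
    Fk = np.fft.fft2(k, axes=(-2, -1))
    # Element-wise product in frequency = circular convolution in space:
    corr = np.fft.ifft2(Fq * Fk, axes=(-2, -1)).real
    # Simple magnitude normalization (an assumption for this sketch):
    weights = corr / (np.abs(corr).max() + 1e-8)
    return weights * v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8, 8)) for _ in range(3))
out = frequency_domain_attention(q, k, v)
print(out.shape)  # (4, 8, 8)
```

Note that the quadratic pairwise-similarity matrix of standard attention never materializes here, which is the source of the reduced spatial complexity claimed in the abstract.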
Data Availability
The online experimental datasets of this paper are available at https://github.com/MingTian99/Deraining_Studies.
Acknowledgements
This work was partly supported by the Scientific Research Project of the Education Department of Liaoning Province (LJKZ0518, LJKZ0519).
Author information
Contributions
TS and SF wrote the main manuscript text and performed the related experiments. LF prepared all figures. JJ and GJ provided guidance. All authors reviewed and revised the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest related to this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Song, T., Fan, S., Jin, J. et al. Exploring an efficient frequency-guidance transformer for single image deraining. SIViP 18, 2429–2438 (2024). https://doi.org/10.1007/s11760-023-02918-z