
Research Article

PSC diffusion: patch-based simplified conditional diffusion model for low-light image enhancement

Published: 21 June 2024

Abstract

Low-light image enhancement is pivotal for improving the utility and recognizability of images captured under inadequate lighting conditions. Previous methods based on Generative Adversarial Networks (GANs) suffer from mode collapse and pay little attention to the inherent characteristics of low-light images. Motivated by the outstanding performance of diffusion models in image generation, this paper proposes the Patch-based Simplified Conditional Diffusion Model (PSC Diffusion) for low-light image enhancement. Specifically, recognizing the potential issue of gradient vanishing in extremely low-light images due to their small pixel values, we design a simplified U-Net architecture with a SimpleGate and Parameter-free attention (SimPF) block to predict noise. This architecture uses a parameter-free attention mechanism and fewer convolutional layers to reduce multiplication operations across feature maps, resulting in a 12–51% reduction in parameters compared to the U-Nets used in several prominent diffusion models, which also accelerates sampling. In addition, intricate image details are preserved during the diffusion process through a patch-based diffusion strategy integrated with global structure-aware regularization, which effectively improves the overall quality of the enhanced images. Experiments show that the proposed method achieves richer image details and better perceptual quality, while sampling over 35% faster than comparable diffusion-model-based methods.
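The abstract names two parameter-free operations inside the SimPF block: SimpleGate (introduced in "Simple Baselines for Image Restoration") and SimAM parameter-free attention. Below is a minimal NumPy sketch of these two operations as published in their original papers; how exactly they are composed inside the paper's SimPF block is an assumption, not taken from the source.

```python
import numpy as np

def simple_gate(x):
    """SimpleGate: split the channel dimension in half and multiply the
    halves elementwise -- a nonlinearity with no learnable parameters."""
    c = x.shape[1] // 2
    return x[:, :c] * x[:, c:]

def simam(x, eps=1e-4):
    """SimAM parameter-free attention: weight each activation by a sigmoid
    of its inverse 'energy', i.e. how far it deviates from the spatial mean
    of its channel. Again, no learnable parameters."""
    n = x.shape[2] * x.shape[3] - 1
    mu = x.mean(axis=(2, 3), keepdims=True)
    d = (x - mu) ** 2                           # squared deviation per pixel
    v = d.sum(axis=(2, 3), keepdims=True) / n   # channel-wise variance estimate
    e_inv = d / (4.0 * (v + eps)) + 0.5         # inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))   # sigmoid gating

# Toy forward pass on a (batch, channels, H, W) feature map.
x = np.random.randn(2, 8, 16, 16)
y = simam(simple_gate(x))
print(y.shape)  # SimpleGate halves the channels: (2, 4, 16, 16)
```

Because neither operation carries weights, replacing learned attention and activation layers with them is what drives the parameter reduction the abstract reports.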



Published In

Multimedia Systems, Volume 30, Issue 4, Aug 2024, 1069 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Received: 22 February 2024
Accepted: 12 June 2024
Published: 21 June 2024

            Author Tags

            1. Low-light image enhancement
            2. Generative model
            3. Diffusion model
            4. U-Net
            5. Image patching
            6. Parameter-free attention
