
Research Article

PSC diffusion: patch-based simplified conditional diffusion model for low-light image enhancement

Published: 21 June 2024

Abstract

Low-light image enhancement is pivotal for improving the utility and recognizability of images captured under inadequate lighting conditions. Previous methods based on Generative Adversarial Networks (GANs) suffer from mode collapse and pay little attention to the inherent characteristics of low-light images. Motivated by the outstanding performance of diffusion models in image generation, this paper proposes the Patch-based Simplified Conditional Diffusion Model (PSC Diffusion) for low-light image enhancement. Specifically, recognizing the potential issue of gradient vanishing in extremely low-light images due to their small pixel values, we design a simplified U-Net architecture with a SimpleGate and Parameter-free attention (SimPF) block to predict noise. This architecture uses a parameter-free attention mechanism and fewer convolutional layers to reduce multiplication operations across feature maps, resulting in a 12–51% reduction in parameters compared to the U-Nets used in several prominent diffusion models, which also accelerates sampling. In addition, intricate image details are preserved during the diffusion process through a patch-based diffusion strategy integrated with global structure-aware regularization, which effectively improves the overall quality of the enhanced images. Experiments show that the proposed method achieves richer image details and better perceptual quality, while sampling over 35% faster than comparable diffusion-model-based methods.
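The abstract names two parameter-free operations inside the SimPF block: SimpleGate (introduced in "Simple Baselines for Image Restoration") and SimAM parameter-free attention. Below is a minimal NumPy sketch of these two operations as published in their original papers; how exactly they are composed inside the paper's SimPF block is an assumption, not taken from the source.

```python
import numpy as np

def simple_gate(x):
    """SimpleGate: split the channel dimension in half and multiply the
    halves elementwise -- a nonlinearity with no learnable parameters."""
    c = x.shape[1] // 2
    return x[:, :c] * x[:, c:]

def simam(x, eps=1e-4):
    """SimAM parameter-free attention: weight each activation by a sigmoid
    of its inverse 'energy', i.e. how far it deviates from the spatial mean
    of its channel. Again, no learnable parameters."""
    n = x.shape[2] * x.shape[3] - 1
    mu = x.mean(axis=(2, 3), keepdims=True)
    d = (x - mu) ** 2                           # squared deviation per pixel
    v = d.sum(axis=(2, 3), keepdims=True) / n   # channel-wise variance estimate
    e_inv = d / (4.0 * (v + eps)) + 0.5         # inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))   # sigmoid gating

# Toy forward pass on a (batch, channels, H, W) feature map.
x = np.random.randn(2, 8, 16, 16)
y = simam(simple_gate(x))
print(y.shape)  # SimpleGate halves the channels: (2, 4, 16, 16)
```

Because neither operation carries weights, replacing learned attention and activation layers with them is what drives the parameter reduction the abstract reports.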



Published In

Multimedia Systems, Volume 30, Issue 4, Aug 2024, 1069 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Received: 22 February 2024
Accepted: 12 June 2024
Published: 21 June 2024

            Author Tags

            1. Low-light image enhancement
            2. Generative model
            3. Diffusion model
            4. U-Net
            5. Image patching
            6. Parameter-free attention
