On the frequency bias of generative models

Published: 10 June 2024

Abstract

The key objective of Generative Adversarial Networks (GANs) is to generate new data with the same statistics as the provided training data. However, multiple recent works show that state-of-the-art architectures still struggle to achieve this goal. In particular, they report an elevated amount of high-frequency content in the spectral statistics, which makes it straightforward to distinguish real from generated images. Explanations for this phenomenon are controversial: while most works attribute the artifacts to the generator, other works point to the discriminator. We take a sober look at those explanations and provide insights on what makes proposed measures against high-frequency artifacts effective. To achieve this, we first independently assess the architectures of both the generator and discriminator and investigate whether they exhibit a frequency bias that makes learning the distribution of high-frequency content particularly problematic. Based on these experiments, we make the following four observations: 1) Different upsampling operations bias the generator towards different spectral properties. 2) Checkerboard artifacts introduced by upsampling cannot explain the spectral discrepancies alone, as the generator is able to compensate for these artifacts. 3) The discriminator does not struggle with detecting high frequencies per se but rather struggles with frequencies of low magnitude. 4) The downsampling operations in the discriminator can impair the quality of the training signal it provides. In light of these findings, we analyze proposed measures against high-frequency artifacts in state-of-the-art GAN training but find that none of the existing approaches can fully resolve spectral artifacts yet. Our results suggest that there is great potential in improving the discriminator and that this could be key to matching the distribution of the training data more closely.
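The spectral statistics mentioned above are commonly summarized as a 1-D "reduced" power spectrum: the 2-D Fourier power spectrum of an image is averaged over rings of constant frequency radius, so elevated high-frequency content in generated images shows up as a raised tail at large radii. The following is a minimal illustrative sketch of such a statistic using NumPy; it is a generic implementation of azimuthal averaging, not the paper's exact measurement pipeline.

```python
import numpy as np

def radial_power_spectrum(img):
    """Azimuthally averaged power spectrum of a 2-D grayscale image.

    Low frequencies map to small radii (near the spectrum center after
    fftshift), high frequencies to large radii. Comparing these 1-D
    curves for real vs. generated images is a common way to visualize
    spectral discrepancies.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2

    # Integer distance of every spectral bin from the DC component.
    h, w = img.shape
    cy, cx = h // 2, w // 2
    y, x = np.indices((h, w))
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2).astype(int)

    # Mean power within each integer-radius ring.
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / counts

# Usage: on white noise the reduced spectrum is approximately flat;
# a GAN with high-frequency artifacts would show a deviating tail.
rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))
spec = radial_power_spectrum(img)
```

In practice one would average `spec` over many real and many generated images before comparing the two curves, since single-image spectra are noisy.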



Published In

NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems
December 2021
30517 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States
