Real Sample Consistency Regularization for GANs
Figure 2. Results of SGAN, SGAN-0GP, and SGAN-RSC on 50 Gaussians after 100 k iterations. Orange dots are real samples; green dots are randomly generated samples (1000 green points per subfigure).
Figure 3. Number of fake samples judged real by the discriminator for SGAN, SGAN-0GP, and SGAN-RSC with ResNet on CIFAR-10, counted over 1000 randomly sampled fake samples. SGAN-0GP-R applies 0GP regularization to the real samples; SGAN-0GP-I applies it to linear interpolations between real and generated samples.
Figure 4. Discriminator output distributions of SGAN-0GP and SGAN-RSC with ResNet on CIFAR-10, computed over 1000 randomly sampled real samples and 1000 fake samples.
Figure 5. Results of FARGAN and FARGAN-RSC on CIFAR-10 with the conventional network architecture.
Figure 6. Results of SGAN-RSC with different λ on CIFAR-10 with the conventional network and ResNet.
Figure 7. FID of SGAN-0GP and SGAN-RSC on CIFAR-10 and CIFAR-100 with the conventional network and ResNet.
Figure 8. Results of 0GP regularization and RSC regularization on different GAN variants.
Figure 9. Generator and discriminator losses of SGAN-0GP and SGAN-RSC with ResNet on CIFAR-10.
Figure 10. Randomly generated images from SGAN-0GP and SGAN-RSC with ResNet on CIFAR-10.
Figure 11. Randomly generated images from SGAN-0GP and SGAN-RSC with ResNet on CIFAR-100.
Figure A1. Randomly generated images from SGAN-RSC and the closest images to them in CIFAR-10, using cosine distance as the basis for comparison.
Figure A2. Randomly generated images from WGAN-0GP and WGAN-RSC with ResNet on CIFAR-10.
Figure A3. Randomly generated images from HingeGAN-0GP and HingeGAN-RSC with ResNet on CIFAR-10.
Figure A4. Randomly generated images from LSGAN-0GP and LSGAN-RSC with ResNet on CIFAR-10.
Figure A5. Randomly generated images from SGAN-0GP and SGAN-RSC with ResNet on ImageNet.
Abstract
1. Introduction
We summarize our main contributions as follows:

1. We analyze the discriminator's misjudgment: under 0GP regularization, the discriminator's gradient at real samples is, during training, more often smaller than its gradient at generated samples.
2. We propose Real Sample Consistency (RSC) regularization, which forces the discriminator to output the same value for all real samples. For real samples, this reduces the proportion of discriminator outputs falling below the decision threshold. Experiments on synthetic and real-world datasets verify that our method achieves better performance than 0GP regularization.
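The 0GP regularization discussed above penalizes the squared norm of the discriminator's gradient at real samples. As a minimal sketch (not the authors' implementation, which would use automatic differentiation on a deep discriminator), the penalty can be illustrated with a toy logistic discriminator whose input gradient is available in closed form; `w`, `b`, and `lam` are hypothetical names for the toy discriminator's parameters and the penalty weight:

```python
import numpy as np

def zero_centered_gp(x_real, w, b, lam=10.0):
    """Zero-centered gradient penalty (0GP) on real samples, sketched with a
    toy logistic discriminator D(x) = sigmoid(w . x + b), whose input gradient
    is dD/dx = D(x) * (1 - D(x)) * w.  Penalty: (lam / 2) * E[ ||dD/dx||^2 ]."""
    d = 1.0 / (1.0 + np.exp(-(x_real @ w + b)))      # D(x), shape (n,)
    grads = (d * (1.0 - d))[:, None] * w[None, :]    # per-sample input gradients
    return 0.5 * lam * np.mean(np.sum(grads ** 2, axis=1))
```

In practice the gradient is obtained by backpropagation through the actual discriminator; the closed-form gradient here only serves to make the penalty concrete.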
2. Related Work
3. Approach
3.1. Background
3.2. Misjudgment by the Discriminator
3.3. Real Sample Consistency Regularization
Algorithm 1: Minibatch stochastic gradient descent training of SGAN-RSC.
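The exact form of the RSC term is not reproduced in this outline; a minimal sketch of one natural formulation, assuming the penalty is the variance of the discriminator's outputs over a real minibatch (so it vanishes exactly when all real samples receive the same output), with `lam` as the regularization weight:

```python
import numpy as np

def rsc_penalty(d_real, lam=1.0):
    """Real Sample Consistency penalty: lam * variance of the discriminator's
    outputs on a minibatch of real samples.  Zero iff the discriminator
    assigns every real sample in the batch the same value."""
    d_real = np.asarray(d_real, dtype=np.float64)
    return lam * np.mean((d_real - d_real.mean()) ** 2)

def sgan_d_loss_with_rsc(d_real, d_fake, lam=1.0):
    """Standard (saturating) GAN discriminator loss plus the RSC term.
    d_real / d_fake are the discriminator's sigmoid outputs."""
    eps = 1e-12  # numerical safety for the logs
    bce = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    return bce + rsc_penalty(d_real, lam)
```

Within Algorithm 1, this penalized loss would replace the plain discriminator loss in each minibatch SGD step, while the generator update is unchanged.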
4. Experimental Results
4.1. Synthetic Data
4.2. CIFAR-10 and CIFAR-100
4.3. ImageNet
4.4. Summary of the Experimental Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Layer | Output Size | Filter |
---|---|---|
Fully connected | 64 | 2 → 64 |
RELU | 64 | – |
Fully connected | 64 | 64 → 64 |
RELU | 64 | – |
Fully connected | 64 | 64 → 64 |
RELU | 64 | – |
Fully connected | 2 | 64 → 2 |
Layer | Output Size | Filter |
---|---|---|
Fully connected | 64 | 2 → 64 |
RELU | 64 | – |
Fully connected | 64 | 64 → 64 |
RELU | 64 | – |
Fully connected | 64 | 64 → 64 |
RELU | 64 | – |
Fully connected | 1 | 64 → 1 |
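The two tables above specify the networks for the synthetic-data experiments: a ReLU MLP generator (2 → 64 → 64 → 64 → 2) and a ReLU MLP discriminator (2 → 64 → 64 → 64 → 1). A shape-level sketch of these forward passes; the weight initialization is an assumption, since the tables only fix layer sizes:

```python
import numpy as np

def mlp_forward(x, layer_sizes, rng):
    """Forward pass through a ReLU MLP matching the tables above.
    ReLU follows every hidden fully connected layer; the last layer is linear."""
    for i, (n_in, n_out) in enumerate(zip(layer_sizes[:-1], layer_sizes[1:])):
        w = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)  # He init (assumed)
        x = x @ w
        if i < len(layer_sizes) - 2:   # ReLU on all but the final layer
            x = np.maximum(x, 0.0)
    return x

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 2))                      # 8 latent codes for the 2-D toy data
fake = mlp_forward(z, [2, 64, 64, 64, 2], rng)       # generator: 2 -> ... -> 2
score = mlp_forward(fake, [2, 64, 64, 64, 1], rng)   # discriminator: 2 -> ... -> 1
```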
Layer | Output Size | Filter |
---|---|---|
Fully connected | 512 × 4 × 4 | 128 → 512 × 4 × 4 |
Reshape | 512 × 4 × 4 | – |
ResNet-Block | 256 × 4 × 4 | 512 → 256 → 256 |
NN-Upsampling | 256 × 8 × 8 | – |
ResNet-Block | 128 × 8 × 8 | 256 → 128 → 128 |
NN-Upsampling | 128 × 16 × 16 | – |
ResNet-Block | 64 × 16 × 16 | 128 → 64 → 64 |
NN-Upsampling | 64 × 32 × 32 | – |
ResNet-Block | 64 × 32 × 32 | 64 → 64 → 64 |
Conv2D | 3 × 32 × 32 | 64 → 3 |
Layer | Output Size | Filter |
---|---|---|
Conv2D | 64 × 32 × 32 | 3→ 64 |
ResNet-Block | 128 × 32 × 32 | 64 → 64 → 128
Avg-Pool2D | 128 × 16 × 16 | – |
ResNet-Block | 256 × 16 × 16 | 128 → 128 → 256 |
Avg-Pool2D | 256 × 8 × 8 | – |
ResNet-Block | 512 × 8 × 8 | 256 → 256 → 512 |
Avg-Pool2D | 512 × 4 × 4 | – |
Reshape | 512 × 4 × 4 | – |
Conv2D | 1 | 512 × 4 × 4 → 1 |
Layer | Output Size | Filter |
---|---|---|
Fully connected | 512 × 4 × 4 | 128 → 512 × 4 × 4 |
Reshape | 512 × 4 × 4 | – |
Conv2D | 512 × 4 × 4 | 512 → 512 |
Conv2D | 512 × 4 × 4 | 512 → 512 |
NN-Upsampling | 512 × 8 × 8 | – |
Conv2D | 256 × 8 × 8 | 512 →256 |
Conv2D | 256 × 8 × 8 | 256 → 256 |
NN-Upsampling | 256 × 16 × 16 | –
Conv2D | 128 × 16 × 16 | 256 →128 |
Conv2D | 128 × 16 × 16 | 128 → 128 |
NN-Upsampling | 128 × 32 × 32 | – |
Conv2D | 64 × 32 × 32 | 128 →64 |
Conv2D | 64 × 32 × 32 | 64 → 64 |
Conv2D | 3 × 32 × 32 | 64 → 3 |
Layer | Output Size | Filter |
---|---|---|
Conv2D | 64 × 32 × 32 | 3 → 64 |
Conv2D | 64 × 32 × 32 | 64 → 64 |
Conv2D | 128 × 32 × 32 | 64 → 128 |
Avg-Pool2D | 128 × 16 × 16 | – |
Conv2D | 128 × 16 × 16 | 128 → 128 |
Conv2D | 256 × 16 × 16 | 128 → 256 |
Avg-Pool2D | 256 × 8 × 8 | – |
Conv2D | 256 × 8 × 8 | 256 → 256 |
Conv2D | 512 × 8 × 8 | 256 → 512 |
Avg-Pool2D | 512 × 4 × 4 | – |
Reshape | 512 × 4 × 4 | – |
Fully connected | 1 | 512 × 4 × 4→ 1 |
Layer | Output Size | Filter |
---|---|---|
Fully connected | 1024 × 4 × 4 | 256 → 1024 × 4 × 4 |
Reshape | 1024 × 4 × 4 | – |
ResNet-Block | 1024 × 4 × 4 | 1024 → 1024 → 1024 |
ResNet-Block | 1024 × 4 × 4 | 1024 → 1024 →1024 |
NN-Upsampling | 1024 × 8 × 8 | – |
ResNet-Block | 512 × 8 × 8 | 1024 → 512 → 512 |
ResNet-Block | 512 × 8 × 8 | 512 → 512 →512 |
NN-Upsampling | 512 × 16 × 16 | – |
ResNet-Block | 256 × 16 × 16 | 512 → 256 → 256 |
ResNet-Block | 256 × 16 × 16 | 256 → 256 →256 |
NN-Upsampling | 256 × 32 × 32 | – |
ResNet-Block | 128 × 32 × 32 | 256 → 128 → 128 |
ResNet-Block | 128 × 32 × 32 | 128 → 128 →128 |
NN-Upsampling | 128 × 64 × 64 | – |
ResNet-Block | 64 × 64 × 64 | 128 → 64 → 64 |
ResNet-Block | 64 × 64 × 64 | 64 → 64 →64 |
Conv2D | 3 × 64 × 64 | 64 → 3 |
Layer | Output Size | Filter |
---|---|---|
Conv2D | 64 × 64 × 64 | 3 → 64 |
ResNet-Block | 64 × 64 × 64 | 64 → 64 → 64 |
ResNet-Block | 128 × 64 × 64 | 64 → 64 →128 |
Avg-Pool2D | 128 × 32 × 32 | – |
ResNet-Block | 128 × 32 × 32 | 128 → 128 → 128 |
ResNet-Block | 256 × 32 × 32 | 128 → 128 →256 |
Avg-Pool2D | 256 × 16 × 16 | – |
ResNet-Block | 256 × 16 × 16 | 256 → 256 → 256 |
ResNet-Block | 512 × 16 × 16 | 256 → 256 →512 |
Avg-Pool2D | 512 × 8 × 8 | –
ResNet-Block | 512 × 8 × 8 | 512 → 512 → 512 |
ResNet-Block | 1024 × 8 × 8 | 512 → 512 →1024 |
Avg-Pool2D | 1024 × 4 × 4 | –
ResNet-Block | 1024 × 4 × 4 | 1024 → 1024 →1024 |
ResNet-Block | 1024 × 4 × 4 | 1024 → 1024 →1024 |
Fully connected | 1 | 1024 × 4 × 4→ 1 |
References
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.C.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680.
- Jolicoeur-Martineau, A. The relativistic discriminator: A key element missing from standard GAN. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019.
- Zhang, H.; Goodfellow, I.; Metaxas, D.; Odena, A. Self-Attention Generative Adversarial Networks. In Proceedings of the 36th International Conference on Machine Learning, Los Angeles, CA, USA, 10–15 June 2019; Volume 97, pp. 7354–7363.
- Gu, J.; Shen, Y.; Zhou, B. Image processing using multi-code GAN prior. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3009–3018.
- Shen, Y.; Gu, J.; Tang, X.; Zhou, B. Interpreting the latent space of GANs for semantic face editing. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 9240–9249.
- Brock, A.; Donahue, J.; Simonyan, K. Large Scale GAN Training for High Fidelity Natural Image Synthesis. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019.
- Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
- Shaham, T.R.; Dekel, T.; Michaeli, T. SinGAN: Learning a generative model from a single natural image. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 4569–4579.
- Choi, Y.; Choi, M.; Kim, M.; Ha, J.W.; Kim, S.; Choo, J. StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
- Miyato, T.; Kataoka, T.; Koyama, M.; Yoshida, Y. Spectral normalization for generative adversarial networks. In Proceedings of the 6th International Conference on Learning Representations, ICLR 2018—Conference Track Proceedings, Vancouver, BC, Canada, 30 April–3 May 2018.
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
- Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019.
- Thanh-Tung, H.; Venkatesh, S.; Tran, T. Improving generalization and stability of generative adversarial networks. In Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, 6–9 May 2019; pp. 1–18.
- Mescheder, L.; Geiger, A.; Nowozin, S. Which training methods for GANs do actually converge? In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden, 10–15 July 2018; Volume 8, pp. 5589–5626.
- Tao, S.; Wang, J. Alleviation of gradient exploding in GANs: Fake can be real. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1188–1197.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, Australia, 6–11 August 2017; Volume 1, pp. 298–321.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems NIPS’17, Long Beach, CA, USA, 4–9 December 2017; pp. 5769–5779.
- Zhang, H.; Zhang, Z.; Odena, A.; Lee, H. Consistency Regularization for Generative Adversarial Networks. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020.
- Mao, X.; Li, Q.; Xie, H.; Lau, R.Y.; Wang, Z.; Smolley, S.P. Least Squares Generative Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2813–2821.
- Wu, Y.; Liu, Y. Robust Truncated Hinge Loss Support Vector Machines. J. Am. Stat. Assoc. 2007, 102, 974–983.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Sønderby, C.K.; Caballero, J.; Theis, L.; Shi, W.; Huszár, F. Amortised map inference for image super-resolution. arXiv 2017, arXiv:1610.04490.
- Klinker, F. Exponential moving average versus moving exponential average. Math. Semesterber. 2011, 58, 97–107.
- Metz, L.; Poole, B.; Pfau, D.; Sohl-Dickstein, J. Unrolled generative adversarial networks. arXiv 2016, arXiv:1611.02163.
- Ghosh, A.; Kulharia, V.; Namboodiri, V.P.; Torr, P.H.S.; Dokania, P.K. Multi-Agent Diverse Generative Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018.
- Hoang, Q.; Nguyen, T.D.; Le, T.; Phung, D. MGAN: Training generative adversarial nets with multiple generators. In Proceedings of the 6th International Conference on Learning Representations, ICLR 2018—Conference Track Proceedings, Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–24.
- Arora, S.; Ge, R.; Liang, Y.; Ma, T.; Zhang, Y. Generalization and Equilibrium in Generative Adversarial Nets (GANs). In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 224–232.
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved techniques for training GANs. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 2234–2242.
- Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. In Proceedings of the 4th International Conference on Learning Representations, ICLR 2016—Conference Track Proceedings, San Juan, PR, USA, 2–4 May 2016; pp. 1–16.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6629–6640.
- Petzka, H.; Fischer, A.; Lukovnikov, D. On the regularization of Wasserstein GANs. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
Dataset | SGAN-0GP | SGAN-RSC
---|---|---
25 Gaussians | 0.028 ± 0.0053 | 0.025 ± 0.0048 |
50 Gaussians | 0.058 ± 0.0084 | 0.048 ± 0.0102 |
100 Gaussians | 0.072 ± 0.0094 | 0.057 ± 0.014 |
Method | 0GP | RSC
---|---|---
CIFAR-10 | ||
ResNet SGAN | 24.15 ± 0.27 | 12.05 ± 0.50 |
ResNet WGAN | 24.33 ± 0.16 | 12.90 ± 0.07 |
ResNet LSGAN | 22.32 ± 0.05 | 14.40 ± 0.40 |
ResNet HingeGAN | 23.39 ± 0.12 | 12.37 ± 0.31 |
ResNet FARGAN | 14.28 ± 0.16 | 11.66 ± 0.09 |
Conventional SGAN | 13.12 ± 0.41 | 10.92 ± 0.04 |
CIFAR-100 | ||
ResNet SGAN | 34.48 ± 0.02 | 19.80 ± 0.19 |
Conventional SGAN | 23.42 ± 0.29 | 17.14 ± 0.07 |
Method | SNGAN | BigGAN | CR-BigGAN | FARGAN | FARGAN-RSC (Ours) |
---|---|---|---|---|---|
FID | 17.5 | 14.73 | 11.48 | 14.28 | 9.8 |
Method | SGAN-0GP | SGAN-RSC |
---|---|---|
FID | 53.79 | 46.92 |
Dataset (metric) | 0GP | RSC
---|---|---
Synthetic data (distance) | 0.028 | 0.025 |
CIFAR-10 (FID) | 13.12 | 9.8
CIFAR-100 (FID) | 23.42 | 17.14 |
ImageNet2012 (FID) | 53.79 | 46.92 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, X.; Zhang, J. Real Sample Consistency Regularization for GANs. Entropy 2021, 23, 1231. https://doi.org/10.3390/e23091231