Abstract
We study a new approach to learning energy-based models (EBMs) based on adversarial training (AT). We show that (binary) AT learns a special kind of energy function that models the support of the data distribution, and that the learning process is closely related to MCMC-based maximum likelihood learning of EBMs. We further propose improved techniques for generative modeling with AT, and demonstrate that this new approach is capable of generating diverse and realistic images. Aside from achieving image generation performance competitive with explicit EBMs, the studied approach is stable to train, is well suited for image translation tasks, and exhibits strong out-of-distribution adversarial robustness. Our results demonstrate the viability of the AT approach to generative modeling, suggesting that AT is a competitive alternative for learning EBMs.
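The connection sketched in the abstract can be illustrated with a toy example. The following is a minimal sketch (not the paper's implementation) of binary adversarial training viewed as EBM learning: a hypothetical linear energy E(x) = -w·x is trained so that real data gets low energy, with PGD-perturbed noise samples playing the role of the MCMC negatives in maximum-likelihood EBM training. All names, step sizes, and the linear energy form are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)  # parameters of the toy energy E(x) = -w @ x (assumed linear form)

def energy(x):
    # Low energy = model believes x is on the data manifold
    return -x @ w

def pgd_attack(x, steps=10, lr=0.1, eps=0.5):
    """PGD descent on the energy within an L-inf ball around x:
    the 'attacker' pushes contrast samples toward low-energy regions,
    analogous to the MCMC negative-sampling step."""
    x0, x = x.copy(), x.copy()
    for _ in range(steps):
        grad = -w                      # dE/dx for the linear energy
        x = x - lr * np.sign(grad)     # move toward lower energy
        x = np.clip(x, x0 - eps, x0 + eps)  # stay in the L-inf ball
    return x

data = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(256, 2))  # "real" data

for _ in range(200):
    pos = data[rng.integers(0, len(data), 64)]
    neg = pgd_attack(rng.normal(size=(64, 2)))  # adversarial negatives from noise
    # Maximum-likelihood-style update: lower energy on data, raise it on negatives
    w += 0.01 * (pos.mean(0) - neg.mean(0))

# After training, real data should sit at lower energy than fresh noise
print(energy(data).mean() < energy(rng.normal(size=(256, 2))).mean())
```

The update rule is exactly the contrastive gradient of the EBM log-likelihood for this energy form; the only substitution relative to standard MCMC-based training is that the negatives come from a bounded PGD attack rather than Langevin dynamics, which is the structural analogy the paper develops.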
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Yin, X., Li, S., Rohde, G.K. (2022). Learning Energy-Based Models with Adversarial Training. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13665. Springer, Cham. https://doi.org/10.1007/978-3-031-20065-6_13
Print ISBN: 978-3-031-20064-9
Online ISBN: 978-3-031-20065-6