1. Introduction
Deep neural networks (DNNs) are widely employed in diverse research areas, including physics and engineering. In mechanical engineering, DNNs are often used as surrogate models [1,2,3]. In general, a DNN learns from data, and its output is not guaranteed to be consistent with physics, even if the data were generated by certain physical models.
Generative adversarial networks (GANs) [4] constitute a type of DNN. GANs were first applied to image generation tasks [5,6,7]. Subsequently, GANs have also been used to solve inverse design problems [8,9,10,11]. For example, in [12,13], GAN models were trained on a dataset consisting of airfoil shapes and their aerodynamic coefficients. Then, by inputting aerodynamic coefficients, the trained model output airfoil shapes associated with those coefficients. The aerodynamic coefficients of the generated shapes were close to the specified labels, but some errors were also identified. These errors arose because the aerodynamic coefficients were calculated from the airfoil shapes, but the physical equations were not incorporated into the DNN model.
To enforce physical consistency, physics-informed neural networks (PINNs) have been proposed [14], which add the residual of the physical equations to the loss function of the DNN [2,15,16,17]. PINNs have been used to predict various targets, such as lake water levels [18], surface water levels [19], and seismic responses [20]. However, a PINN model needs the gradients of the physical equations, and hence the physical equations must be implemented in the DNN architecture. This causes difficulties from an application point of view. For example, general-purpose software and commercial software cannot be used in PINN models. Especially in inverse design problems, commercial software is often mandatory to compute the label; e.g., flow computations are required to calculate the aerodynamic coefficients of airfoil shapes. Hence, it is desirable to be able to use arbitrary physical equations, including general-purpose and commercial software, in DNN surrogate models and generative models.
The proposed method aims to handle arbitrary physical equations using a GAN architecture. To this end, the gradients of the physical equations have to be eliminated from the algorithm. A GAN consists of a generator network and a discriminator network. The generator network outputs data that mimic the training data, whereas the discriminator network distinguishes the generated data from the training data. Training data are referred to as true data, whereas generated data constitute fake data. In the literature, physics-informed adversarial learning [16,21] and PID-GAN [22], which couples a PINN with a GAN, have been proposed, but those methods also use the residual of the physical equations in the same manner as PINNs, and hence the physical models need to be implemented inside the computation graph. In the proposed PG-GAN model, true or fake is defined by the physical equations: if the residual is smaller than a threshold value, the generated data are true; otherwise, they are fake. The physical model guides the DNN to learn physical consistency and is only used to categorize data as true or fake. The physical equations remain outside the DNN model and are not implemented inside it. Therefore, arbitrary physical models can be used. It is also noted that, by decreasing the threshold value, the residual of the generated data is decreased. In the PINN model, by contrast, the residual is added to the loss function and cannot be controlled in this way.
The concept of the PG-GAN model is validated using a simple example. The merit of the PG-GAN is that an arbitrary physics model can be used. However, to enable a comparison with a PINN, Newton's equation of motion is employed, which can also be solved by a PINN.
The present paper is organized as follows. GAN and PINN models are explained in Section 2. The concept of PG-GAN is explained in Section 3, together with its formulation. A numerical study is presented in Section 4. Conclusions are provided in Section 5.
2. GAN and Physics-Informed GAN
A conditional GAN model consists of a generator network $G$ and a discriminator network $D$, as illustrated in Figure 1. The input of the generator network is a noise vector $z$ and a label $y$, and the output is fake data $\tilde{x}$, expressed as $\tilde{x} = G(z, y)$. The input of the discriminator network is given by real data $x$ and fake data $\tilde{x}$, and the network distinguishes the real data from the fake data. The loss function is defined as
$$V(G, D) = \mathbb{E}_{x}\left[\log D(x, y)\right] + \mathbb{E}_{z}\left[\log\left(1 - D(G(z, y), y)\right)\right],$$
and the generator minimizes $V$, whereas the discriminator maximizes $V$, i.e., $\min_G \max_D V(G, D)$. The discriminator only considers data $x$ and $\tilde{x}$; physical reasonableness is not considered.
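As a minimal sketch (assuming discriminator outputs lie in the open interval (0, 1); the function name and list-based batch representation are illustrative, not the authors' code), the value function $V$ can be evaluated from discriminator outputs on a real batch and a fake batch:

```python
import math

def cgan_value(d_real, d_fake):
    """Sketch of the GAN value function V(G, D): mean log D over real
    samples plus mean log(1 - D) over generated (fake) samples."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term
```

For instance, a discriminator that outputs 0.5 everywhere yields $V = 2\log 0.5 \approx -1.386$, the well-known equilibrium value of the minimax game.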
A PINN can be coupled with a GAN model. The resulting architecture is called PI-GAN in the present article. Suppose that a physical model is expressed as $P(x) = 0$. For a variable $x$, the residual is given by $r = |P(x)|$. The PINN adds the residual to the loss function, which is minimized. In the GAN model, the loss function of the generator is modified as $V + \lambda r$, where $\lambda$ is a constant; in the numerical example described later on, a fixed value of $\lambda$ was used. The loss function of the discriminator remains the same as in the original GAN.
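Under the same notation, the PI-GAN generator loss can be sketched as the adversarial term plus $\lambda$ times the mean residual (function and argument names are assumptions; the default value of `lam` is purely illustrative and is not the value used in the paper):

```python
import math

def pi_gan_generator_loss(d_fake, residuals, lam=0.1):
    """Sketch of the PI-GAN generator loss: mean log(1 - D) over the
    generated batch plus lam times the mean physical residual r.
    lam=0.1 is an illustrative choice, not the paper's value."""
    adv = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    phys = sum(residuals) / len(residuals)
    return adv + lam * phys
```

With zero residuals this reduces to the plain adversarial generator loss, which makes the role of the $\lambda r$ term easy to check.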
3. PG-GAN Model
3.1. Concept of PG-GAN
When a PINN is used, the physics model is located inside the computation graph. Likewise, if a GAN model is coupled with a PINN, which we call a physics-informed GAN (PI-GAN), the physics model is located inside the computation graph.
The PG-GAN is designed so that the physics model is eliminated from the computation graph, as shown in Figure 2. The generator generates data in the same way as a normal GAN. The generated data are then passed to the physical model to determine their physical validity. If the data are determined to be physically valid, they are labeled as true data; otherwise, they are labeled as fake data. The input of the discriminator is only the generated data, and the discriminator distinguishes whether the data are true or not. In the PG-GAN, the computation graph consists of only the generator and the discriminator. The physical model is used only to determine the truth of the data and is not included in the computation graph.
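The labeling step above can be sketched as follows; the physical model enters only as a black-box callable (all names are illustrative), so no gradients ever flow through it:

```python
def label_with_physics(samples, residual_fn, eps):
    """Label each generated sample: 1 (true) if its physical residual
    is at most eps, else 0 (fake). residual_fn may be any external
    solver, even commercial software; it is only called, never
    differentiated."""
    return [1 if residual_fn(s) <= eps else 0 for s in samples]
```

Here `residual_fn` could wrap any simulation code; the toy example below simply uses the absolute value as a stand-in residual.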
3.2. Formulation of PG-GAN
Generated data should be considered true if they are physically reasonable. To judge physical reasonableness, a physical model is used. Suppose that the physical model is described as $P(x) = 0$, and the value of $r = |P(x)|$ is treated as a residual. We treat a datapoint $\tilde{x}$ as true if $r \le \epsilon$. Hence, the input of our discriminator is the generated data $\tilde{x}$, and the output is whether the data are physically reasonable or not. For a given $\epsilon$, let $S_\epsilon$ represent the set of data whose residual is equal to, or less than, $\epsilon$.
In this case, the loss function becomes
$$V = \mathbb{E}_{\tilde{x} \in S_\epsilon}\left[\log D(\tilde{x})\right] + \mathbb{E}_{\tilde{x} \notin S_\epsilon}\left[\log\left(1 - D(\tilde{x})\right)\right]. \qquad (2)$$
The optimization problem for the discriminator is $\max_D V$. The discriminator tries to mimic the physical model to judge physical reasonableness.
If we simply minimize $V$ with respect to $G$, the generator is not trained as desired. In the ordinary GAN loss function, the first term of $V$ is not a function of $G$, and the generator optimization problem becomes
$$\min_G \mathbb{E}_{z}\left[\log\left(1 - D(G(z))\right)\right].$$
However, in Equation (2), both the first and second terms are functions of $G$. Therefore, the generator optimization problem uses only the second term of $V$, and is
$$\min_G \mathbb{E}_{\tilde{x} \notin S_\epsilon}\left[\log\left(1 - D(\tilde{x})\right)\right]$$
instead of $\min_G V$.
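Both objectives can be sketched on one generated batch by splitting the samples at the residual threshold (batch handling is simplified and the names are illustrative; empty sides of the split are guarded rather than handled as in the paper):

```python
import math

def pg_gan_losses(d_out, residuals, eps):
    """Sketch of the PG-GAN objectives for one generated batch:
    d_out[i] is the discriminator output for sample i, and the
    physical residual decides which term of V the sample enters."""
    true_terms = [math.log(d) for d, r in zip(d_out, residuals) if r <= eps]
    fake_terms = [math.log(1.0 - d) for d, r in zip(d_out, residuals) if r > eps]
    v = (sum(true_terms) / max(len(true_terms), 1)
         + sum(fake_terms) / max(len(fake_terms), 1))
    g_loss = sum(fake_terms) / max(len(fake_terms), 1)  # second term only
    return v, g_loss
```

The generator loss deliberately ignores the first term, mirroring the discussion above: pushing $D(\tilde{x})$ toward 1 on the fake set is what drives the generator toward the physically valid region.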
The architecture of the proposed model is illustrated in Figure 3. The physics-guided GAN model uses a physics model as a referee to judge whether the generated data are physically reasonable or not. The discriminator is a surrogate model of the physical model. If we could use the physical model itself as the discriminator, the generator would be trained much more efficiently. However, if we used the physical model in the architecture, back-propagation would stop at the physical model, because we assume that the physical model's software is a black box. Another feature of the PG-GAN is that the training data no longer appear in the model. Real data are not necessary, because whether the generated data are true or fake is judged by the physical model.
The PINN can be coupled with the PG-GAN by adding the residual to the loss function of the generator; the resulting architecture is called PG-PI-GAN. The generator loss function is modified as
$$\min_G \mathbb{E}_{\tilde{x} \notin S_\epsilon}\left[\log\left(1 - D(\tilde{x})\right)\right] + \lambda r.$$
Training the PG-GAN without pre-training is not efficient, because in the early epochs the generator cannot generate physically reasonable data, and $S_\epsilon$ is always an empty set. In such a case, neither the generator nor the discriminator is well trained, because the discriminator always outputs 0 (fake), whereas the generator has no clue how to generate reasonable data. Hence, it is necessary to start from a pre-trained generator that generates a non-empty $S_\epsilon$. To obtain such a pre-trained generator, an ordinary GAN model is used.
Data generated by the pre-trained generator exhibit a large residual of the physical equation $P$. The threshold $\epsilon$ must be large enough so that both $S_\epsilon$ and its complement are non-empty. However, it is not desirable to terminate training with a large $\epsilon$. Hence, $\epsilon$ is reduced as training proceeds, until it reaches the target value. In the following numerical example, $\epsilon$ was held constant for 10,000 epochs and then changed. Alternatively, $\epsilon$ can be reduced gradually.
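The threshold schedule can be sketched as a simple step function (a gradual decay would also fit the description above; the hold length default and all names are illustrative):

```python
def epsilon_schedule(epoch, eps_start, eps_target, hold=10_000):
    """Sketch of the epsilon schedule: keep eps_start for the first
    `hold` epochs, then switch to the target value eps_target."""
    return eps_start if epoch < hold else eps_target
```

A smoother alternative would interpolate between `eps_start` and `eps_target` after the hold phase; the key property is only that the threshold never tightens faster than the generator can follow.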
4. Numerical Study: Newton’s Equations of Motion
Newton’s equation of motion under gravity is expressed as
$$\boldsymbol{x}(t) = \boldsymbol{x}_0 + \boldsymbol{v}_0 t + \tfrac{1}{2}\boldsymbol{g}\,t^2,$$
where $\boldsymbol{x}_0$, $\boldsymbol{g}$, $\boldsymbol{v}_0$, and $t$ denote the coordinates of an initial point, the gravitational acceleration, the initial velocity vector, and time, respectively. The physical equation $P$ is formulated as the residual of this relation,
$$P(\boldsymbol{x}(t)) = \boldsymbol{x}(t) - \left(\boldsymbol{x}_0 + \boldsymbol{v}_0 t + \tfrac{1}{2}\boldsymbol{g}\,t^2\right),$$
with $r = \|P\|$. The task is to output a sequence of coordinates along the trajectory when the trajectory parameters are given as the label. A dataset for pre-training was first prepared from trajectories satisfying the equation of motion; the total number of training datapoints was 9000. In the pre-training, the GAN model was trained using this dataset for 10,000 epochs. Then, PG-GAN training was carried out, with the threshold $\epsilon$ defined as a function of the number of epochs $e$: it was held constant for the first epochs and then reduced, as described in Section 3.2. The trained PG-GAN model was then used to output coordinate data.
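For this example, the residual of a generated trajectory can be evaluated against the closed-form solution. The sketch below is one-dimensional with illustrative names and a mean-absolute-deviation residual; the paper's exact form of $P$ and its norm are not reproduced:

```python
def trajectory_residual(xs, ts, x0, v0, g=-9.81):
    """Mean absolute deviation of generated coordinates xs at times ts
    from x(t) = x0 + v0*t + 0.5*g*t**2 (1-D sketch of the residual r).
    g defaults to Earth's gravitational acceleration with the axis
    pointing upward."""
    errs = [abs(x - (x0 + v0 * t + 0.5 * g * t * t)) for x, t in zip(xs, ts)]
    return sum(errs) / len(errs)
```

An exact trajectory gives a residual of zero, so this function also serves as the black-box referee for `label_with_physics`-style true/fake labeling.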
After the training was conducted, the coordinates were obtained using the generators. To compare the accuracy of the models, the residual $r$ of the physical equation was calculated for the output coordinates.
An ordinary GAN, a physics-informed GAN (PI-GAN), and a PG-PI-GAN were also trained and compared. Each model was trained and evaluated three times separately.
Figure 4 shows boxplots of the residuals of each model. Data outside 1.5 times the interquartile range (IQR) from the first and third quartiles were treated as outliers in the boxplots.
Table 1 shows the median, first quartile, and interquartile range of the residuals. The physics-informed GAN featured the same GAN architecture, except that the residual $r$ was added to the loss function. All network structures were the same in all models. The PG-GAN was characterized by lower median and first quartile values than the PI-GAN. The PG-PI-GAN presented median and first quartile values similar to those of the PG-GAN, but its IQR was lower than that of the PG-GAN. These results show that the PG-GAN effectively reduces the median value, but does not reduce the IQR. This difference comes from the loss functions of the two models. The residual is added to the loss function of the PI-GAN, and hence the residuals of all generated data are reduced. By contrast, the PG-GAN considers no residual in its loss function; the magnitude of the residuals of undesired data does not affect the loss function, and hence the residuals of the undesired data tend to become large. The proposed PG-PI-GAN, which couples a PI-GAN and a PG-GAN, successfully reduces the median, first quartile, and IQR.
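The summary statistics of the kind reported in Table 1 can be computed from a list of residuals with the standard library. The quantile method below is an assumption (boxplot conventions vary between tools), so the exact quartile values may differ slightly from the paper's:

```python
from statistics import quantiles

def residual_summary(residuals):
    """Median, first quartile, and IQR of the residuals, the three
    statistics compared across models in Table 1."""
    q1, med, q3 = quantiles(residuals, n=4)  # default 'exclusive' method
    return {"median": med, "q1": q1, "iqr": q3 - q1}
```

Running this once per trained model (three training runs each, as in the study) yields the values needed for a boxplot-style comparison.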
5. Conclusions
This short note describes the concept of the PG-GAN. The PG-GAN uses arbitrary physical models, regardless of differentiability and smoothness, to guide neural networks toward physically reasonable solutions. One advantage of the proposed PG-GAN is that the physical model is outside the neural network computation graph; back-propagation is not conducted through the physical model. Hence, any physical model can be utilized. For example, commercial software could be used, and one does not need to implement the physical model. Existing PINN models require the physics equations to be implemented inside the computation graph; hence, arbitrary physics equations cannot be used (e.g., commercial software cannot be used). The proposed PG-GAN network does not need training data: the generator creates data, and the physical model judges whether the output is reasonable or not. However, the PG-GAN model is pre-trained using an ordinary GAN model with training data.
The proposed PG-GAN model was tested using Newton’s equation of motion. The PG-GAN and PG-PI-GAN featured lower median values of residuals. When the PG-GAN was coupled with the PI-GAN, the IQR value also decreased.