1 Introduction

The goal of MOR techniques [1,2,3], which are particularly suited for the real-time computations and many-query context, is to obtain efficient and reliable approximations of solutions of high dimensional systems of partial differential equations (PDEs). Let us consider the approximate solution \(u(t; \mu )\in L^2(\Omega )\) of a parametrized PDE, with \(\Omega \subset {\mathbb {R}}^d\), with the parameter \(\mu \in {\mathcal {P}}_{\text {phys}}\subset {\mathbb {R}}^s\) and with time \(t\in [0,t_f]\): for the spatial discretization one can consider, for instance, the Finite Volume (FV) discretization. We introduce the solution manifold related to this parametric PDE: \({\mathcal {M}} = \{u(t; \mu )\in V_{{\mathcal {N}}},\, \mu \in {{\mathcal {P}}_{\text {phys}}}, \,t\in [0, t_{f}]\}, \) where \(V_{{\mathcal {N}}}\) is a suitable functional space defined by the chosen spatial discretization. The key idea behind MOR is to represent \({\mathcal {M}}\) with a finite dimensional linear space \(V_N\), such that \(N\ll {\mathcal {N}}\), where \(N = \text {dim}V_N\) and \({\mathcal {N}}=\text {dim}V_{{\mathcal {N}}}\). To find the lower dimensional space \(V_N\), one can use the well known Proper Orthogonal Decomposition (POD) strategy that, given in input a set of discrete solutions (obtained, for example, with the FV method), is able to extract a set of small cardinality N, which contains the so-called reduced basis functions that best approximate the manifold. A pivotal aspect for the efficiency of the MOR is the ability of the POD of compressing the discrete solution manifold \({\mathcal {M}}\): this concept is strictly related to the definition of Kolmogorov N-width of \({\mathcal {M}}\) and, ultimately, to the reducibility of the problem of interest.

The Kolmogorov N-width \(d_N\) of \({\mathcal {M}}\) is defined as

$$\begin{aligned} d_N({\mathcal {M}}, V_{{\mathcal {N}}}) = \inf _{\begin{array}{c} V_N\subset V_{{\mathcal {N}}}\\ \text {dim}V_N=N \end{array}}\sup _{f\in {\mathcal {M}}}\inf _{g\in V_N}||f-g||, \end{aligned}$$
(1)

where \(||\cdot ||\) is a suitable norm in \(V_{{\mathcal {N}}}\). Definition (1) describes in a rigorous mathematical setting the capability of finite dimensional linear subspaces \(V_N\subseteq V\) of reproducing any element in \({\mathcal {M}}\), that is, any discrete solution of the problem of interest. Therefore, the faster \(d_N\) decays the more efficient a linear MOR will be for such a problem, as N grows. Some rigorous bounds for \(d_N\), for particular classes of problems, are available in literature [4, 5]; as an alternative, one can look at the rate of decay of the eigenvalues \(\lambda _k\) returned by the POD on \({\mathcal {M}}\).

Despite the capability of standard MOR techniques to handle a vast number of applications, problems advecting local structures still represent a challenge for the MOR community. Indeed, for such problems the decay of the Kolmogorov N-width is slow, see for example [6]. As a result, standard MOR struggles to suitably reproduce steep features, such as solutions with (multiple) travelling shocks. For this reason, in the last decade a great number of works appeared in the literature, offering numerous approaches to deal with advection dominated problems. We mention the method of freezing [7], the shifted POD [8] (also in combination with the use of neural networks [9]), the generation of advection modes by means of an optimal mass transport problem [10,11,12,13], \(L^1\) minimization [14], the calibration method introduced in [15, 16], Lagrangian based MOR techniques [17, 18], the preprocessing of the snapshots used in [19, 20], the registration method [21,22,23], adaptive basis methods [24], implicit feature tracking [25] and displacement interpolation [26, 27]. Next to these more classical techniques, some nonlinear approaches have been lately studied starting from convolutional autoencoders neural network based approaches for learning the solution manifold [28, 29], passing through graph neural networks autoencoders [30] to graph neural network to perform the limit to vanishing viscosity [31] and entropy-stable rational quadratic manifolds [32]. Motivated by the interest that MOR for transport dominated problems sparks in the applied mathematics community, the goal of this work is to propose a calibration based reduced order algorithm that can be used to gain significant speedup in the simulations of solutions of hyperbolic PDEs. The proposed methodology will avoid the use of implicit feature tracking [25], with the goal of not increasing the computational costs of the offline phase that is performed explicitly. The framework is very similar to the calibration/registration approach proposed in [15, 21, 22] with novelties in the calibration process and in the range of applicability of the method. Indeed, our methodology has a more physical intuition as it is based on the interpolation of some calibration points rather than on more abstract interpolation functions. Moreover, the proposed applications range on a novel paradigm, where parametric time-dependent solutions widely vary, with large deformations of the solutions structures. To the knowledge of the authors, this is the first time that solutions with such a varying behavior were successfully tackled with a model order reduction technique.

In this work, we will focus on time-dependent hyperbolic problems, whose solutions are (quasi) self-similar: the formal definition of (quasi) self-similar solution will be given in Sect. 1.1. In particular, we want to study problems where multiple structures travel along the domain with different speeds. This is typical for hyperbolic problems, where shocks, rarefactions and other discontinuities are generated and travel along the domain. The novelties of this work, in comparison to the state of the art [22], lie in two key aspects. First, our optimization process operates independently of the solution structure, eliminating the need for shock detectors or similar tools. Secondly, our method demonstrates a broader range of applicability, encompassing problems featuring multiple shocks whose positions undergo significant variation, sometimes nearly colliding with each other.

The rest of the manuscript is structured as follows. In Sect. 1.1, we introduce the problems of interest and the definition of self-similarity. In Sect. 2, we define the calibration procedure and the optimization algorithms. In Sect. 3, we present some geometrical transformations that interpolate the calibrated points and allow to define the original problem onto a reference domain where all the structures do not move. In Sect. 4, we describe the combination of classical MOR techniques with the calibration process. In Sect. 5, we show the good performances of the proposed reduced order model (ROM) onto one and two dimensional parametric time–dependent problems and, in Sect. 6, we draw some conclusions. To improve the readibility of the manuscript, a table with the list of all the mathematical symbols used is provided at the end of the paper.

1.1 Motivation

We begin by introducing the problem of interest that we will tackle in this manuscript: we focus here on hyperbolic time–dependent conservation laws. As an example, we turn our attention to Euler equations, but the same framework can be applied to other conservation and balance laws. Let \(\Omega \subset {\mathbb {R}}^d\), \(d\geqslant 1\), be our physical domain. We restrict ourselves to rectangular domains of the type \(\Omega =[a^1,b^1]\times \dots \times [a^d, b^d]\), with \(a^i, b^i\in {\mathbb {R}}\) for \(i=1,\dots , d\). The generalization for more complex domains can be performed as in [33]. Let \([0,t_f]\subset {\mathbb {R}}\) be the time span of the problem and let \(\varvec{\mu }\in {\mathcal {P}}\subset {\mathbb {R}}^{s+1}\), \(s\geqslant 0\), be the collection of all parameters (including time). From now on, we will assume that \(s=0\) in the non parametric regime, i.e., \(\varvec{\mu }= t\) and \({\mathcal {P}}=[0, t_f]\), or \(s\geqslant 1\) in the parametric regime, i.e., \(\varvec{\mu }= (\mu ,t)\) and \({\mathcal {P}}={\mathcal {P}}_{\text {phys}}\times [0, t_f]\). The parameteric Euler equations of gas dynamics, in conservative form, read as follows: find the density \(\rho :{\mathcal {P}}\times \Omega \mapsto {\mathbb {R}}\), the momentum \({\varvec{m}} :{\mathcal {P}}\times \Omega \mapsto {\mathbb {R}}^d\) and the total energy \(E:{\mathcal {P}}\times \Omega \mapsto {\mathbb {R}}\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t\rho + \nabla _{\varvec{x}} \cdot {\varvec{m}} = 0 & \text { in }{\mathcal {P}}\times \Omega ,\\ \partial _t{\varvec{m}} + \nabla _{\varvec{x}} \cdot \left( \frac{{\varvec{m}} \otimes {\varvec{m}}}{\rho } + p\,I\right) = 0 & \text { in }{\mathcal {P}}\times \Omega ,\\ \partial _tE + \nabla _{\varvec{x}} \cdot \left( \frac{{\varvec{m}}}{\rho }(E + p)\right) = 0 & \text { in }{\mathcal {P}}\times \Omega , \end{array}\right. } \end{aligned}$$
(2)

where \(\nabla _{\varvec{x}}\cdot \) is the divergence with respect to \(\varvec{x}\in \Omega \), \(I\in {\mathbb {R}}^{d\times d}\) is the identity matrix and the pressure p is defined through the following equation of state \( p = (\gamma -1)(E-0.5 |{\varvec{m}}|^2/\rho )\), with \(\gamma = 1.4\) being the adiabatic constant. System (2) is then completed by some proper initial conditions (IC) and boundary conditions (BC). We will consider as IC some Riemann problems both in one and two dimensional problems: the Sod shock tube problem [34], the 2D double Mach reflection problem [35] and the triple point problem. In these examples, the solution of (2) turns out to be self-similar, with features as shocks, contact discontinuities and rarefaction waves traveling in the physical domain.

Definition 1

Let \({\mathcal {R}}\subset {\mathbb {R}}^d\) be a reference domain, which is time (and parameter) independent. We call self–similar a solution manifold \({\mathcal {M}}\) for which there exists a reference solution \({\bar{u}}:{\mathcal {R}}\rightarrow {\mathbb {R}}^{d}\) and a transformation \(T^{-1}[\varvec{\mu }](\cdot ): \Omega \rightarrow {\mathcal {R}}\) such that we have \(u(\varvec{\mu })(T^{-1}[\varvec{\mu }](\varvec{x})) \approx {\bar{u}}(\varvec{x})\) for all \(\varvec{\mu }\in {\mathcal {P}},\) \(\forall \varvec{x}\in {\mathcal {R}}\). When this condition is not satisfied, but still all solutions in \({\mathcal {M}} = \lbrace u(\varvec{\mu }) \rbrace _{\varvec{\mu }\in {\mathcal {P}}}\) present the same features, with different values of the solution in between these features, we will call such manifold quasi–self–similar. More precisely, \({\mathcal {M}}\) is quasi–self–similar if there exists a transformation T such that the transformed solution manifold \(\hat{{\mathcal {M}}}:=\lbrace u(\varvec{\mu })(T^{-1}[\varvec{\mu }](\cdot )) \rbrace \) has a fast decay of the Kolmogorov N-width.

We start by considering a simple 1D Sod shock tube problem, in the non–parametric regime. Here, \(\Omega = [0,1]\), and the following initial data is considered:

$$\begin{aligned} \left[ {\begin{array}{*{20}c} \rho & u & p \\ \end{array} } \right] (t = 0) = \left\{ {\begin{array}{ll} {[\begin{array}{lll}1\quad & 0\quad & 1\end{array}]^{T} } & \quad {{\text {if }} x < 0.5,} \\ {[\begin{array}{lll}0.1\quad & 0\quad & 0.125\end{array}]^{T} } & \quad {{\text {if }} x > 0.5,} \\ \end{array} } \right. \end{aligned}$$

where \(u:\Omega \mapsto {\mathbb {R}}^d\) is the velocity. The initial conditions in conservative variables m and E can be derived using \(m=\rho u \) and \(E = p/(\gamma -1) + 1/2 |m|^2/\rho \). Figure 1 shows the density \(\rho (\varvec{\mu })\) for the Sod shock tube problem (left), and the corresponding modes (right), obtained by running a POD on the solution manifold \({\mathcal {M}}_{\rho }\): the solution of the Sod shock tube problem is exact, and its analytical expression has been taken from [34]. The density presents a shock, a contact discontinuity and a rarefaction wave that travel in the domain: as a consequence, the POD modes exhibit an highly oscillatory behavior, struggling to correctly capture the position of the moving features. Still, the solutions are self-similar as we just need to transport each feature onto reference positions to make the solutions essentially linearly dependent. Indeed, as observed in [11], the optimal transport for the density of the Sod 1D problem would lead to the exact solution, without the need for further ROM techniques. Nevertheless, we are interested also in higher dimensional problems with more complicated discontinuity structures. Hence, we will not proceed in the optimal transport direction.

2 Calibration of the Snapshots

We now present the calibration technique that we use to align the different features of our snapshots, to obtain a solution manifold with a faster decaying Kolmogorov N-width. The key of the proposed calibration is that it can be used to align different travelling features (shocks, contact discontinuities, rarefaction waves), without the need to know explicitly the exact location of these features, as opposed to, for example, what is assumed in [20, 22]. Moreover, we assume to calibrate the density \(\rho \) of the Euler system (2), other scalar quantities depending on the system unknowns, e.g., the entropy, can be equivalently used.

Fig. 1
figure 1

The density \(\rho \), solution of the Sod shock tube problem at different timesteps, and the corresponding POD modes (right)

Let \({\mathcal {R}}\) be the reference domain, similarly to what is used in the Arbitrary Lagrangian Eulerian (ALE) formalism [36], and let \(\Omega \) be the physical domain. For every \(\varvec{\mu }\in {\mathcal {P}}\subset {\mathbb {R}}^{s+1}\), we introduce a grid of \(M=\prod _{i=1} ^d M_i\) control points that are collected in the vector \(\varvec{w}(\varvec{\mu })\in \Omega ^M\) that we use for the calibration. These control points should lead the transformation to align the different features at different \(\varvec{\mu }\). Let \(\varvec{\alpha }=(\alpha _1,\dots ,\alpha _d)\in {\mathbb {N}}^{d}\) be a multi-index with \(\alpha _i =1,\dots , M_i\). We consider the control point \(\varvec{w}_{\varvec{\alpha }}\) to be tensor product of points in the interval \([a^i,b^i]\) for every dimension \(i=1,\dots ,d\), for example in 2D: \(\varvec{w}_{\varvec{\alpha }= (\alpha _1,\alpha _2)} = (\varvec{w}^1_{\alpha _1,\alpha _2},\varvec{w}^2_{\alpha _1, \alpha _2})\), see Fig. 2. Each control point \(\varvec{w}(\varvec{\mu })_{\varvec{\alpha }}\) can either belong to the physical domain, i.e., \(\varvec{w}(\varvec{\mu })_{\varvec{\alpha }}\in \mathring{\Omega }\), or to the boundary of the domain, namely \(\varvec{w}(\varvec{\mu })_{\varvec{\alpha }}\in \partial \Omega \). If \(\varvec{w}(\varvec{\mu })_{\varvec{\alpha }}\in \partial \Omega \), then this point is constrained for all \(\varvec{\mu }\) to the boundary hyperplanes where it belongs: this, in turn, sets a constraint on one (or both) of the coordinates \((\varvec{w}^1_{\alpha _1,\alpha _2},\varvec{w}^2_{\alpha _1, \alpha _2})\). The constrained coordinates are, hence, not to be optimized and we can discard them in the set of coordinates that we will optimize. Motivated by this, we introduce \(\varvec{\theta }(\varvec{w}(\varvec{\mu }))\in {\mathbb {R}}^{Q}\), that is the row vector of the Q free coordinates of the control points, with \(Q\leqslant d\times M\). This vector represents all the coordinates of the control points that are free to move during the calibration. In Fig. 2, the free coordinates of the control points are highlighted in red. We remark that there is a bijection between \(\varvec{w}\) and \(\varvec{\theta }\), by definition.

In order to align different features of our set of snapshots, we look for a geometrical transformation map \(T:\Omega ^M \times {\mathcal {R}}\mapsto \Omega \), such that the following properties hold true:

  • \(T[\cdot ] \in {\mathcal {C}}^1(\Omega ^M, {\mathcal {C}}^1({\mathcal {R}}, \Omega ))\);

  • \(\forall \varvec{\mu }\in {\mathcal {P}}\) and \(\forall \varvec{w}(\varvec{\mu })\in {\Omega ^M}\), \(\exists T^{-1}[\varvec{w}(\varvec{\mu })]:\Omega \mapsto {\mathcal {R}}\) such that

    $$\begin{aligned}&T^{-1}[\varvec{w}(\varvec{\mu })](T[\varvec{w}(\varvec{\mu })]({\hat{\varvec{x}}})) = {\hat{\varvec{x}}}\quad \forall {\hat{\varvec{x}}}\in {\mathcal {R}},\\&T[\varvec{w}(\varvec{\mu })](T^{-1}[\varvec{w}(\varvec{\mu })](\varvec{x})) = \varvec{x},\text { for all }\varvec{x}\in \Omega ; \end{aligned}$$
  • \(T^{-1}[\cdot ]\in {\mathcal {C}}^1(\Omega ^M, {\mathcal {C}}^1(\Omega ; {\mathcal {R}}))\).

The properties are imposed to setup an ALE formulation [21, 36]. Some possibilities for the geometrical transformation T have been presented in the literature over the past years: among the others we mention here translation maps, dilatation maps, polynomials and Gordon-Hall maps, see [15, 21, 36]. We will use some transformations based on Piecewise Cubic Hermite Interpolation Polynomials (PCHIPs) carefully described in Sect. 3. In order to use the transformation map T, one needs to find the calibration map \(\varvec{w}:{\mathcal {P}}\mapsto \Omega ^M\), such that:

  • \(\varvec{w}(\cdot )\in {\mathcal {C}}^1({\mathcal {P}}; \Omega ^M)\);

  • \(\rho (\varvec{\mu })( T[\varvec{w}(\varvec{\mu })]({\hat{\varvec{x}}})) {\approx } \overline{\rho }({\hat{\varvec{x}}})\), for all \({\hat{\varvec{x}}}\in {\mathcal {R}}\), \(t\in [0, t_{f}]\) and for all \(\varvec{\mu }\in {\mathcal {P}}\), where \(\overline{\rho }(\cdot )\) is a reference solution of choice.

Fig. 2
figure 2

Example of a control point grid in 2D with \(M_1=3\) and \(M_2=4\) on the reference domain (left) and on the physical domain (right). Note that the coordinates of the reference control points are the tensor product of unidirectional control points. We highlight in red the free coordinates of these control points

We are now ready to present a general calibration technique: the ultimate goal is to transform the solution manifold \({\mathcal {M}}_{\rho }\) so that the POD, applied to the transformed manifold, is more efficient. In order to achieve this goal, we need to perform the following steps. First of all, we introduce the reference density \(\overline{\rho }\), namely a solution of problem (2) for some \(\overline{\varvec{\mu }}\in {\mathcal {P}}\). Once \(\overline{\rho }\) has been chosen, we select M control points \(\overline{\varvec{w}}\in {\mathcal {R}}^M\). Now, for any \(\varvec{\mu }\in {\mathcal {P}}\) and \(\varvec{w}(\varvec{\mu })\in \Omega ^M\), we can define a geometrical transformation map \(T[\varvec{w}(\varvec{\mu })]:{\mathcal {R}}\mapsto \Omega \) that maps the reference control points in the reference domain onto the (parametric) control points in the physical domain, i.e.,

$$\begin{aligned} \begin{aligned} T[\varvec{w}(\varvec{\mu })] (\overline{\varvec{w}}_{\varvec{\alpha }}) = \varvec{w}_{\varvec{\alpha }}(\varvec{\mu }) \quad \forall \, \varvec{\alpha }. \end{aligned} \end{aligned}$$

Once \(T[\varvec{w}(\varvec{\mu })]\) has been defined, we can introduce the calibrated snapshot, which is the pullback of a solution \(\rho (\varvec{\mu })\) to the reference domain, i.e.,

$$\begin{aligned} {\hat{\rho }}(\varvec{\mu }; {\varvec{w}(\varvec{\mu })})(\cdot ):=\rho (\varvec{\mu })(T[\varvec{w}(\varvec{\mu })](\cdot )):{\mathcal {R}}\rightarrow {\mathbb {R}}. \end{aligned}$$

We want to stress that, numerically, we rely on two meshes, one on the physical and one on the reference domain. In our simulations, these two domains will coincide; this does not mean that each transformation map leads to a one-to-one correspondence between the degrees of freedom on the two domains. Hence, we will perform an interpolation of \(\rho \) to evaluate \(\hat{\rho }\) at its degree of freedom. In the numerical simulations, this procedure will bring an error of the first order of accuracy (as we will use FV approximations).

The map \(T[\varvec{w}(\varvec{\mu })]\) is identified by the control points \(\varvec{w}(\varvec{\mu })\), which are sought in order to minimize the following residual function:

$$\begin{aligned} \begin{aligned} R(\rho (\varvec{\mu }),\varvec{\theta }(\varvec{w}(\varvec{\mu })), \overline{\rho }) =&||{\hat{\rho }}(\varvec{\mu }; {\varvec{w}(\varvec{\mu })}) - \overline{\rho }||^2_{L^2({\mathcal {R}})} + \frac{\delta }{2}\left\Vert \partial _{\varvec{\mu }}\varvec{w}(\varvec{\mu })\right\Vert ^2_{\ell ^2({\mathcal {P}})}\\&+\frac{\alpha }{2} \max _{\varvec{x}\in \Omega } \left( \max \left\{ \left\Vert \nabla T[\varvec{w}(\varvec{\mu })](\hat{\varvec{x}})\right\Vert , \left\Vert \nabla T^{-1}[\varvec{w}(\varvec{\mu })](\varvec{x})\right\Vert \right\} \right) , \end{aligned} \end{aligned}$$
(3)

where \(\alpha \) and \(\delta \) are two penalty parameters user defined, and \(\partial _{\varvec{\mu }}\varvec{w}(\varvec{\mu })\) will be defined more in detail in the algorithmic section. The first term is the one we aim at minimizing, while the other two are regularization terms that penalizes discontinuities in time of the calibration points and that regularize the geometrical transformations. We are now ready to present the calibration technique in two different cases: the self-similar setting, and the quasi self-similar one.

2.1 Calibration in the Self–Similar Setting

To keep the presentation as general as possible, we consider here the parametric case, hence \(\varvec{\mu }=(\mu , t)\in {\mathcal {P}}={\mathcal {P}}_{\text {phys}}\times [0, t_f]\). We select a training set of physical parameters \(\lbrace \mu _1,\dots , \mu _{N_{\text {train}}}\rbrace = {\mathcal {P}}_{\text {phys}}^{\text {train}} \subset {\mathcal {P}}_{\text {phys}}\) and a training set of times \(\lbrace t_1,\dots , t_{N_\mu }\rbrace \) for each \(\mu \in {\mathcal {P}}_{\text {phys}}^{\text {train}}\). We then denote \({\mathcal {P}}^{\text {train}}:=\lbrace \varvec{\mu }= (\mu ,t): \mu \in {\mathcal {P}}^{\text {train}}_{\text {phys}} \text { and } t \in [t_1,\dots , t_{N_\mu }] \rbrace \subset {\mathcal {P}}\). For each \(\varvec{\mu }\in {\mathcal {P}}^{\text {train}}\), we compute the full order model solution \(\rho (\varvec{\mu })\). We then choose as reference solution \(\overline{\rho }=\rho (\bar{\varvec{\mu }})\): in our numerical tests we will choose \(\bar{\varvec{\mu }}~= (\mu _{N_{\text {train}}},t_f)\). In addition to this, we also fix the M reference control points \(\overline{\varvec{w}}\in {\mathcal {R}}^M\) as specified above. Before introducing the constrained minimization problem, let us recall that there exist a bijection between \(\varvec{\theta }(\varvec{w}(\varvec{\mu }))\) and \(\varvec{w}(\varvec{\mu })\). For this reason, in order to keep the notation light, from now on we will denote with \(\varvec{\theta }(\varvec{\mu })\) the free coordinates \(\varvec{\theta }(\varvec{w}(\varvec{\mu }))\) of the control points \(\varvec{w}(\varvec{\mu })\).

Now, for all \(\varvec{\mu }\) in the training set, we solve the following constrained minimization problem:

$$\begin{aligned} \varvec{\theta }^{\text {opt}}(\varvec{\mu }):=\min _{\varvec{\theta }\in {\mathbb {R}}^Q} R(\rho (\varvec{\mu }), \varvec{\theta }; \overline{\rho }), \end{aligned}$$
(4)

subject to the following constraints:

  • all the control points are within our physical domain: \(\varvec{w}_{\varvec{\alpha }}(\varvec{\mu })\in \Omega \) for all \(\varvec{\alpha }\) and for all training parameters \(\varvec{\mu }\in {\mathcal {P}}^{\text {train}}\);

  • \(\det J[\varvec{w}(\varvec{\mu })]>0\), where \(J[\varvec{w}(\varvec{\mu })]\) is the Jacobian of the map \(T[\varvec{w}(\varvec{\mu })]\). This constraint must be checked on a quite fine grid of the physical domain \(\Omega \), we have used the mesh grid;

  • for \(i=1,\dots , d\): if \(\alpha _j=\beta _j\) for all \(j\ne i\) and \(\alpha _i < \beta _i\), then \(\varvec{w}_{\varvec{\alpha }}^i<\varvec{w}_{\varvec{\beta }}^i\) (see Fig. 2), i.e., we never switch the order of the control points on each grid line.

We approximate \(\partial _{\varvec{\mu }}\) in (3) with the discrete derivative \(D_{\varvec{\mu }}\varvec{w}(\varvec{\mu })\) defined as:

$$\begin{aligned} D_{\varvec{\mu }}\varvec{w}^i_{\varvec{\alpha }}(\varvec{\mu }) = {\left\{ \begin{array}{ll} 0 & \text { if }\varvec{w}^i_{\varvec{\alpha }}(\varvec{\mu }) \text { is not a free coordinate of }\varvec{w}(\varvec{\mu }); \\ \frac{\varvec{\theta }_q(\varvec{\mu })-\varvec{\theta }^{\text {opt}}_q(\varvec{\mu }_{\text {neigh}}(\varvec{\mu }))}{\varvec{\mu }- \varvec{\mu }_{\text {neigh}}(\varvec{\mu })} & \text {with } q \text { s.t. } \varvec{w}^i_{\varvec{\alpha }}(\varvec{\mu }) = \varvec{\theta }_q(\varvec{\mu }). \end{array}\right. } \end{aligned}$$

In the previous equation, the neighboring parameter \(\varvec{\mu }_{\text {neigh}}\) is defined as follows:

$$\begin{aligned} \varvec{\mu }_{\text {neigh}}(\varvec{\mu }, {\mathcal {S}}) = \arg \min _{\varvec{\nu }\in {\mathcal {S}}} ||\varvec{\mu }-\varvec{\nu }||_{\ell ^2({\mathbb {R}}^{p+1})}, \end{aligned}$$

where \({\mathcal {S}} \subset {\mathcal {P}}^{\text {train}}\) will contain the parameters for which we have already computed the optimal \(\varvec{\theta }\). If the minimum is not unique, we take one of the minimizers. The definition of the discrete spatial gradient \(\nabla T\) in (3) will be specified in Sect. 3 for the specific transformation map we use.

Problem (4) is solved with the Sequential Least SQuares Programming (SLSQP) method that is available within the scipy.optimize.minimize library. We solve Problem (4) for the physical parameter \(\mu \in {\mathcal {P}}^{\text {train}}_{\text {phys}}\) for which the solution has more developed structures (the last in our tests): we start from the final timestep \(t_{N_{\mu }}\) and we proceed backwards in time. We then move onto solving Problem (4) for the neighboring parameters. In both physical and temporal parameters, the rationale is the following: we proceed form the solutions where the structures that we want calibrate are more developed and we proceed with nearest parameters until we solve the problem for all the training set \({\mathcal {P}}^{\text {train}}_{\text {phys}}\). The initial guess \(\varvec{\theta }^{(0)}(\varvec{\mu }_{(\ell ,j)})\) for \(\varvec{\mu }_{(\ell ,j)} = (\mu _{\ell }, t_j) \in {\mathcal {P}}^{\text {train}}\) is the optimal output of the minimization for the closest parameter already performed. In our tests it will be defined as:

$$\begin{aligned} \varvec{\theta }^{(0)}(\varvec{\mu }_{(\ell ,j)})= {\left\{ \begin{array}{ll} \varvec{\theta }^{\text {opt}}(\varvec{\mu }_{(\ell ,j+1)}) \text { if }j=1,\dots , N_{\mu _{\ell }}-1,\\ \varvec{\theta }^{\text {opt}}(\varvec{\mu }_{(\ell +1,j)}) \text { if }j=N_{\mu _{\ell }}. \end{array}\right. } \end{aligned}$$
(5)

Algorithm 1 shows the details of the procedure.

Algorithm 1
figure a

Calibration for Self–Similar Solution Manifold

The chosen strategy is not trivially favorable with respect to other ones and it is quite arbitrary. We have performed different trials in the tests that we have carried out, and we noticed differences in the sensitivity to this choice. Overall, taking as initial guess the optimal solution of a parameter close to the one we want to optimize is always a good strategy, but open questions remain on which reference configuration \(\overline{\rho }\) to choose and on how spread can the parameter of the training set be.

For the 1D test cases, the change in the choice of the initial guess was less relevant than for the 2D cases. Indeed, we could use initial guesses that are quite far away from the exact one and still obtain very accurate results. For the 2D cases, the choice of the initial guess, and how spread the training space can be is more relevant. For example, the timesteps used for the calibration and, hence, the distance between two consecutive parameters and the initial guess, had a strong impact. We therefore had to calibrate on timesteps that are not too far away one from the other, in order to obtain smooth and continuous results.

For the 1D test, we perform an analytical study of this choice where the exact solution is available: the outcome of this study is presented in Sect. 5.1.1.

2.2 Calibration in the Quasi–Self–Similar Setting

Fig. 3
figure 3

The density \(\rho (t=0)\) for three different values of the parameter \(\mu =[0.8943, 0.1075, 1.0906, 0.0572]\) (blue), \(\mu =[1.0286, 0.1031, 0.7358, 0.0706]\) (orange), and \(\mu =[1.2253, 0.1157, 1.1172, 0.1094]\) (green)

To motivate the need of a different algorithm for quasi–self–similar solutions, we focus now on the parametric version of the 1D Sod shock tube problem (2). In this example, the physical parameter \(\mu =(\mu ^0,\dots ,\mu ^3)\in {\mathcal {P}}_{\text {phys}}\subset {\mathbb {R}}^4\) represents the IC for the Euler problem:

$$\begin{aligned} \begin{bmatrix} \rho&u&p \end{bmatrix}^T(t=0; \mu ) = {\left\{ \begin{array}{ll} \begin{bmatrix} \mu ^0 & 0 & \mu ^1 \end{bmatrix}^T & \text { if } x< 0.5,\\ \begin{bmatrix} \mu ^2 & 0 & \mu ^3 \end{bmatrix}^T & \text { if } x> 0.5.\\ \end{array}\right. } \end{aligned}$$

In Fig. 3, we show some analytical solutions for the parametric 1D Sod shock tube, for different values of \(\mu \). Looking at Fig. 3 it is clear there is no straightforward choice for the reference solution \(\overline{\rho }\) that would lead to reasonable minimization problems, similar to the one presented in Algorithm 1. Indeed, we would minimize the \(L^2\) error between two solutions (\(\overline{\rho }\) and \({\hat{\rho }}(\varvec{\mu })\)) that may have very different heights at the boundaries. When different boundaries behaviors are present, the calibration procedure needs to be reformulated in a different way.

Instead of fixing a reference solution \(\overline{\rho }\), we consider a suitable linear reduced space \(V_{\text {POD}}\), thus introducing a new residual for the minimization problem:

$$\begin{aligned} \begin{aligned}&\min _{\varvec{\theta }({\varvec{\mu }})\in {\mathbb {R}}^{Q}} \Vert {\hat{\rho }}({\varvec{\mu }}; {\varvec{w}(\varvec{\mu }})) - \Pi _{V_{\text {POD}}}{\hat{\rho }}({\varvec{\mu }};{\varvec{w}(\varvec{\mu }})) \Vert _{L^2({\mathcal {R}})} + \frac{\delta }{2}\Vert \partial _{\varvec{\mu }}\varvec{w}(\varvec{\mu }) \Vert ^2_{\ell ^2({\mathcal {P}})}\\&\quad +\frac{\alpha }{2} \max _{\varvec{x}\in \Omega } \left( \max \left\{ \left\Vert \nabla T[\varvec{w}(\varvec{\mu })](\hat{\varvec{x}})\right\Vert , \left\Vert \nabla T^{-1}[\varvec{w}(\varvec{\mu })](\varvec{x})\right\Vert \right\} \right) , \end{aligned} \end{aligned}$$
(6)

where we recall that \(\varvec{\theta }(\varvec{\mu })=\varvec{\theta }(\varvec{w}(\varvec{\mu }))\in {\mathbb {R}}^Q\) is the vector of the free coordinates of the control points, and \(Q\leqslant d\times M\) is the total number of free coordinates in the calibration step. In Problem (6), \(\varvec{\theta }\) is constrained as in the previous section and \(\Pi _{V_{\text {POD}}}\) is the orthogonal projection onto a linear space \(V_{\text {POD}}\) obtained through a preliminary procedure that we will describe now shortly.

This minimization allows to overcome the issue of quasi–self–similar solutions thanks to the projection onto a reduced space. To carry out this projection, we first need to build a priori a suitable linear space \(V_{\text {POD}}\), which needs to capture the minimal amount of information on the solution manifold: we want to be able to approximate the solution manifold with a linear space of very low dimension.

We therefore introduce the matrix of the free coordinates \(\overrightarrow{\theta }\in {\mathbb {R}}^{N_{\text {few}}\times Q}\):

$$\begin{aligned} \overrightarrow{\theta }[i,:] = \varvec{\theta }(\tilde{\varvec{\mu }}_i), \quad i=1, \dots , N_{\text {few}}, \end{aligned}$$

where we recall once again that \(\varvec{\theta }(\tilde{\varvec{\mu }}_i)\) is the vector of the free coordinates of the control points \(\varvec{w}(\tilde{\varvec{\mu }}_i)\) associated to the parameter \(\tilde{\varvec{\mu }}_i\). The free coordinates can be selected through another optimization process carried out in the same spirit of (6) on the whole matrix \(\overrightarrow{\theta }\), minimizing the projection error over all the parameters \(\tilde{\varvec{\mu }}_1, \dots , \tilde{\varvec{\mu }}_{N_{\text {few}}}\), while updating the space \(V_{\text {POD}}(\overrightarrow{\theta })\) obtained by compressing the solutions on the reference domain for these parameters transformed using the calibration points given by \(\overrightarrow{\theta }\). Therefore, we solve the following constrained minimization problem:

$$\begin{aligned} \min _{\overrightarrow{\theta }\in {\mathbb {R}}^{N_{\text {few}}\times Q}} \sum _{i=1}^{N_{\text {few}}}\Vert {\hat{\rho }}(\tilde{\varvec{\mu }}_i; {\varvec{w}(\tilde{\varvec{\mu }}_i)}) - \Pi _{V_{\text {POD}}(\overrightarrow{\theta })}{\hat{\rho }}(\tilde{\varvec{\mu }}_i; {\varvec{w}(\tilde{\varvec{\mu }}_i)}) \Vert _{L^2({\mathcal {R}})}, \end{aligned}$$
(7)

with \(V_{\text {POD}}:=\text {POD}(\lbrace {\hat{\rho }}(\tilde{\varvec{\mu }}_i) \rbrace _{i=1}^{N_{\text {few}}}, N^{\text {POD}}_{\text {few}})\). This optimization process is of larger dimension with respect to the previous ones and it requires, for each residual evaluation, the computation of a POD over the \(N_{\text {few}}\) calibrated snapshots \({\hat{\rho }}(\tilde{\varvec{\mu }}_i; {\varvec{w}(\tilde{\varvec{\mu }}_i)})\), with a user defined number of modes \(N_{\text {few}}^{\text {POD}}\). The choice of \(N_{\text {few}}^{\text {POD}}\) is, in general, not simple. In our numerical tests, we adopted the following heuristics: we want to take into account all the different travelling features. For this reason, in the Sod test case, we have chosen \(N_{\text {few}}^{\text {POD}}=3\) to keep into account the different values that there might be between the rarefaction and the contact discontinuity and between the contact and the shock. \(N_{\text {few}}^{\text {POD}}\) should be kept as low as possible not to overload the minimization process. In the Sod test case, we performed a few tests with increasing \(N_{\text {few}}^{\text {POD}}\) and 3 was the first value where the minimization process was giving successful results. While this is an heuristic argument, it still provides a valid starting point for the numerical simulations. A more in depth analysis on the role of \(N_{\text {few}}^{\text {POD}}\) and on how to choose it in an automated way is envisioned as a future research direction. Again, we solve problem (7) with SLSQP using the same constraints defined in the previous section. This extra optimization step can be skipped when other techniques to detect interesting features can be used [22]. One possibility is the use of classical shock detection procedures to find the steepest points of the solutions and calibrating them [25]. We summarize the steps of the whole procedure in Algorithm (2).

Algorithm 2
figure b

Calibration for quasi–self–similar solution manifold

As an alternative to the whole projection onto \(V_{\text {POD}}\) we would also like to remark that, whenever shock detection techniques can be used, the quasi-self-similar setting can be efficiently handled by the convex displacement interpolation (CDI) [37].

3 The Geometrical Transformation Map

We now present the geometrical transformation map \(T[\varvec{w}^{\text {opt}}(\varvec{\mu })]:{\mathcal {R}}\mapsto \Omega \) used to define the calibrated snapshots. We start with the simpler case, namely the 1D setting, and later we consider the 2D case.

3.1 The 1D Setting

Let \(\varvec{w}^{\text {opt}}(\varvec{\mu })\) be the control points whose free coordinates are the solution to Problem (4): to interpolate the values \(\{(\overline{\varvec{w}}_{\varvec{\alpha }}(\varvec{\mu }), \varvec{w}^{\text {opt}}_{\varvec{\alpha }}(\varvec{\mu }))\}_{\varvec{\alpha }}\), we use monotone cubic \(C^1\) splines. These are the so-called PCHIPs (Piecewise Cubic Hermite Interpolating Polynomials) interpolators, available in the scipy Python library under the interpolate classes and the built-in function is called PchipInterpolator. By employing monotone cubic splines, we obtain a transformation function that preserves the monotonicity of the calibration map \(\varvec{w}(\cdot )\), guaranteeing its bijectivity and \(C^1\) smoothness if the calibration points are in the “right order”, as prescribed in Sect. 2. In Fig. 4, we can see an example of a PCHIP transformation applied to one of the snapshot calibrated on the detected features. On the right of the figure, the transformation is depicted and we can observe that it interpolates the points, it is \({\mathcal {C}}^1\) and it is very close to the identity on the boundaries, because we introduce two extra interpolation points outside the boundaries of the domain. This helps to keep the regularity of the transformation in the ALE formulation [21].

Fig. 4
figure 4

Example of PCHIP calibration: solution on physical domain with calibration control points (left), solution on reference domain with reference control points (center) and PCHIP transformation with all control points (right)

In this work, we do not focus on the ALE formulation on \({\mathcal {R}}\), nevertheless the PCHIPs allow to easily compute all the necessary ingredients. Indeed, they are polynomials and their derivatives and the inverse of their derivative is easy to compute. Moreover, the inverse of the transformation exists and it is unique in each point, hence, with a simple Newton method, we can easily recast the inverse function.

3.2 The 2D Setting

In the 2D setting we use tensor product of one dimensional PCHIPs, in order to exploit their properties for Cartesian geometries. We refer again to Fig. 2 to better understand the transformation map. We need to compute \(T[\varvec{w}^{\text {opt}}(\varvec{\mu })]:{\mathcal {R}}\mapsto \Omega \), such that

$$\begin{aligned} T[\varvec{w}^{\text {opt}}(\varvec{\mu })](\overline{\varvec{w}}^1_{\alpha _1,\alpha _2}, \overline{\varvec{w}}^2_{\alpha _1,\alpha _2})=(\varvec{w}^{1, \text {opt}}_{\alpha _1,\alpha _2}(\varvec{\mu }), \varvec{w}^{2, \text {opt}}_{\alpha _1,\alpha _2}(\varvec{\mu }))\quad \text { for } \alpha _i=1,\dots , M_i, \, i=1,2. \end{aligned}$$

Let \({\hat{\varvec{x}}}\in {\mathcal {R}}\) be a point with coordinates \({\hat{\varvec{x}}}=({\hat{x}}, {\hat{y}})\). We define the map:

$$\begin{aligned} T[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}})&:= (T^x[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}}), T^y[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}})), \text { with }\\ T^x[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}})&:= \sum _{\ell =1}^{M_2}{\gamma }^y_{\ell }({\hat{y}})P^x_{\ell }({\hat{x}}), \qquad T^y[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}}) := \sum _{k=1}^{M_1}{\gamma }^x_k({\hat{x}})P^y_k({\hat{y}}). \end{aligned}$$

In the previous equations, we made use of the following quantities:

  1. 1.

    \(P^x_{\ell }\) is a PCHIP interpolating the points \(\lbrace \overline{\varvec{w}}^1_{\alpha _1, \ell }, \varvec{w}^{1,\text {opt}}_{\alpha _1,\ell }(\varvec{\mu }) \rbrace _{\alpha _1=1}^{M_1}\), where the control points \(\overline{\varvec{w}}^1_{\alpha _1, \ell } \) for \(\alpha _1=1,\dots ,M_1\) are on horizontal lines \({\hat{y}}= {\bar{y}}_\ell \) in the reference domain, see Fig. 2, namely for all \(\alpha _1=1,\dots ,M_1\) we have \(P^x_{\ell }(\overline{\varvec{w}}^1_{\alpha _1,\ell }) = \varvec{w}^{1,\text {opt}}_{\alpha _1,\ell }(\varvec{\mu });\)

  2. 2.

    \(P^y_k\) is a PCHIP interpolating the points \(\lbrace \overline{\varvec{w}}^2_{k,\alpha _2}, \varvec{w}^{2,\text {opt}}_{k,\alpha _2}(\varvec{\mu }) \rbrace _{\alpha _2=1}^{M_2}\), where the control points \( \overline{\varvec{w}}^2_{k,\alpha _2} \) for \(\alpha _1=1,\dots ,M_1\) are on vertical lines \({\hat{x}}= {\bar{x}}_k\) in the reference domain, see Fig. 2, namely for all \(\alpha _2=1,\dots ,M_2\) we have \(P^y_k(\overline{\varvec{w}}^2_{k,\alpha _2}(\varvec{\mu })) = \varvec{w}^{2,\text {opt}}_{k,\alpha _2}(\varvec{\mu });\)

  3. 3.

    \({\gamma }^y_{\ell }(\cdot )\) is a PCHIP interpolating the points \(\{\overline{\varvec{w}}^2_{\alpha _1,\alpha _2}, \delta _{\alpha _2,\ell }\}_{\alpha _2=1}^{M_2}\), being \(\delta _{\alpha _2,\ell }\) the Kronecker delta. By doing so, we obtain that \(T^x[\varvec{w}^{\text {opt}}(\varvec{\mu })]\) is a convex combination of the \(\lbrace P^x_{\ell }(\cdot ) \rbrace _{\ell =1}^{M_2}\) such that \(T^x[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}, {\hat{y}}=\overline{\varvec{w}}^2_{\alpha _1, \alpha _2}) = P^x_{\alpha _2}({\hat{x}})\);

  4. 4.

    \({\gamma }^x_k(\cdot )\) is a PCHIP interpolating the points \(\{\overline{\varvec{w}}^1_{\alpha _1,\alpha _2}, \delta _{\alpha _1,k}\}_{\alpha _1=1}^{M_1}\), as before, leading to the property \(T^y[\varvec{w}^{\text {opt}}(\varvec{\mu })]({\hat{x}}=\overline{\varvec{w}}^1_{\alpha _1,\alpha _2}, {\hat{y}}) = P^y_{\alpha _1}({\hat{y}})\).

Notice that, ultimately, it holds that

$$\begin{aligned} T[\varvec{w}^{\text {opt}}(\varvec{\mu })](\overline{\varvec{w}}^1_{\alpha _1,\alpha _2}, \overline{\varvec{w}}^2_{\alpha _1,\alpha _2})= (P_{\alpha _2}^x(\overline{\varvec{w}}^1_{\alpha _1,\alpha _2}),P_{\alpha _1}^y(\overline{\varvec{w}}^2_{\alpha _1,\alpha _2})) = \varvec{w}^{\text {opt}}_{\alpha _1,\alpha _2}(\varvec{\mu }). \end{aligned}$$

Also in the 2D case, the Jacobian of the transformation, which is needed in the ALE formulation, is easily accessible, since all the terms are polynomials. Similarly to the 1D case, we can compute the Jacobian of the inverse of the transformation, using the inverse of the Jacobian of the transformation, provided that we can invert the map T.

Fig. 5
figure 5

Calibration points (\(M_1=5, M_2=4\)) on the physical domain \(\Omega =[0,4]\times [0,1]\) (left) and the transformation of horizontal and vertical lines on the physical domain (right). On the top calibration points that lead to an invertible map, on the bottom calibration points that respect the monotonicity in each line (left), but whose transformed horizontal and vertical lines cross multiple times (right) and, hence, whose transformation is not invertible

3.2.1 Invertibilty of T

The computation of the inverse of T is fundamental in many aspect of the algorithm: to display the function on the physical domain, to compute quantities and errors on the physical domain and to compute the inverse of the Jacobian for the ALE formulation.

Unfortunately, the so defined map T cannot be proven to be invertible, as it might happen that some of the vertical and horizontal lines cross each other multiple times in the physical domain, see Fig. 5 second line. To avoid this, we impose, in the optimization procedure for the calibration, to have positive determinant of the Jacobian of the transformation on the meshpoints of the reference domain, see the constraints in Sect. 2.1. This typically guarantees invertibility. We recall that, being PCHIPs polynomials, the computation of the Jacobian can be performed explicitly in each point. To compute the inverse of the transformation, we perform the following steps. We first apply the transformation map T to the elements of the Cartesian meshgrid of \({\mathcal {R}}\): these will be mapped to quadrilateral elements in \(\Omega \). Now, given a point \(\varvec{x}\in \Omega \), we can easily find to which of these quadrilaterals it belongs and with a Projective transformation (see the python module transform of the package skimage) we pull it back onto \({\mathcal {R}}\). Then, we use the found point as initial guess to find the solution of \(T[\varvec{w}^{\text {opt}}(\varvec{\mu })](\hat{\varvec{x}}) = \varvec{x}\) through a Newton type nonlinear solver (scipy.root).

4 Model Order Reduction with Calibration

We are now ready to perform the model order reduction step. In what follows the procedure is similar for the non parametric and the parametric setting: we will therefore present it for the latter case, for the sake of generality.

4.1 Learning the Calibration Map

Once the calibration procedure presented in Sect. 2 has been carried out, the map \(\varvec{w}:{\mathcal {P}}\mapsto \Theta \) is known only through the sample values \(\varvec{w}^{\text {opt}}(\varvec{\mu })\), for \(\varvec{\mu }\in {\mathcal {P}}^{\text {train}}\). To learn the calibration map \(\varvec{w}(\cdot )\) for any parameter \(\varvec{\mu }\), we employ an Artificial Neural Network (ANN) composed by several layers: an input layer \(\varvec{\mu }\in {\mathcal {Y}}^{(0)}={\mathcal {P}}\) where we pass the parameters of the problem of interest, L hidden layers \({\mathcal {Y}}^{(j)}\), \(j=1,\dots ,L\), and an output layer \({\mathcal {Y}}^{(L+1)}\), see Fig. 6. As output layer, we would like to obtain \(\varvec{\theta }\in {\mathbb {R}}^Q\), from which we can extract the calibration points \(\varvec{w}\). Keeping in mind the monotonicity constraints applied to the control points in the constrained minimization problems (4) and (6), we try to enforce this constraint in the ANN. It is not easy to strongly enforce such constraints, but we can force the output to be positive, using as final activation function a Softplus. Hence, we take as output layer of our ANN not directly \(\varvec{\theta }\), but the vector \(\varvec{v}\in {\mathbb {R}}^Q\) of the differences of the free coordinates with the previous ones (in 2D it is referred to the same line). Doing so, the positivity of \(\varvec{v}\) is equivalent to the monotonicity in each line of points. In one dimension, it is defined as \(v_i = w_i-w_{i-1}\) for \(i=2,\dots ,M_1-1\), while in two dimensions this operation is done in each horizontal or vertical line.

Fig. 6
figure 6

Example of the architecture of an ANN

Each layer \({\mathcal {Y}}^{(j)}\), \(j=1,\dots ,L\) is connected to the next and to the previous ones through affine maps \(\delta ^{(j)}:{\mathcal {Y}}^{(j)}\mapsto {\mathcal {Y}}^{(j+1)}\) and at every node a nonlinear activation function \(\zeta ^{(j)}:{\mathbb {R}}\mapsto {\mathbb {R}}\) is applied component–wise. We used the hyperbolic tangent in all the ANN except in the output layer where the Softplus function is used for the positivity of the outputs. On a training set, the learning process changes the weights \(\delta ^{(k)}\) minimizing the error between the output and the optimal calibration points.

To build and train the ANN, we used a Python based library EZyRB [38] that uses Adam method [39] to perform the minimization. More details on the architecture of the ANN for the different test cases will be provided in the numerical section.

4.2 POD-NN

In Sects. 2.1 and 2.2, we presented the calibration algorithm in the self-similar and in the quasi self-similar setting. Thanks to the calibration of the snapshots, we obtain the calibrated manifold \(\hat{{\mathcal {M}}}_\rho = \{{\hat{\rho }}(\varvec{\mu })(\cdot ), \quad \varvec{\mu }\in {\mathcal {P}}_{\text {train}}\}\), where \({\hat{\rho }}(\varvec{\mu })(\cdot )\) is the calibrated snapshot defined in Sect. 2. We now proceed with the compression of \(\hat{{\mathcal {M}}}_\rho \) by means of the Proper Orthogonal Decomposition (POD): we refer to [2] for more details. The dimension of the reduced basis can be chosen either setting a maximum number of basis \(N^{\text {max}}_{\text {POD}}\) or choosing the most relevant modes such that the discarded energy is smaller than a certain tolerance \(\tau _{\text {POD}}\). Once the POD has been carried out, we obtain a linear space \(V^{\rho }_N\) spanned by the N orthonormal reduced basis functions \(\{\hat{\Phi }_i\}_{i=1}^N\) on the reference domain \({\mathcal {R}}\). \(V^{\rho }_N\) should now represent with a good accuracy any element of the calibrated solution manifold \(\hat{{\mathcal {M}}}_{\rho }\).

4.3 Online Solution by Means of ANN

In this work, we mainly focus on the calibration procedure and the offline phase. Hence, we will use a non–intrusive approach for the reconstruction of the online solutions. Let \(\varvec{\mu }\in {\mathcal {P}}\) be a parameter of choice: the goal is to construct a linear approximation \(\rho _N(\varvec{\mu })\) of the snapshot \(\rho (\varvec{\mu })\). It should be clear by now that, since we are dealing with advection dominated problems, this is not a simple task within the standard MOR setting. However, in Sects. 2 and 3, we presented a calibration technique that allows us to obtain a linear space \(V^{\rho }_N\) that approximates with good accuracy the calibrated manifold \(\hat{{\mathcal {M}}}_\rho \): we are therefore able to construct a linear approximation of the calibrated solution of interest \({\hat{\rho }}(\varvec{\mu })\) in the reference domain \({\mathcal {R}}\). This means that we can approximate \({\hat{\rho }}(\varvec{\mu })\) with

$$\begin{aligned} {\hat{\rho }}(\varvec{\mu })\approx {\hat{\rho }}_N(\varvec{\mu }):=\sum _{i=1}^N\underline{{\hat{\rho }}}_N^i(\varvec{\mu })\hat{\Phi }_i, \end{aligned}$$

where \(\underline{{\hat{\rho }}}_N^i(\varvec{\mu }) = ({\hat{\rho }}(\varvec{\mu }), \hat{\Phi }_i)_{L^2({\mathcal {R}})}\) for \(i=1,\dots , N\), being \((\cdot ,\cdot )_{L^2({\mathcal {R}})}\) the \(L^2\) scalar product on the reference domain. In order to find the vector \(\underline{{\hat{\rho }}}_N(\varvec{\mu })\) one can adopt two alternative ways: an intrusive approach, by means of a Galerkin projection of the high order algebraic system onto the reduced space, or a non-intrusive approach by means of an ANN. If the standard Galerkin projection setting is adopted, the online system and the reconstruction of the online solution is carried out within an ALE formalism [21, 36, 40]: the original problem of interest, formulated over \(\Omega \), has to be re-written into a problem formulated over the reference domain \({\mathcal {R}}\). In this approach, a hyper-reduction procedure [41, 42] will be necessary to tackle the nonlinearities of the problem and of the transformation map. This approach is currently under investigation, and it will be part of a future extensions.

An alternative to the intrusive approach is represented by the use of ANN, in the spirit of Sect. 4.1. We consider the map \(\Pi _N:{\mathcal {P}} \mapsto {\mathbb {R}}^N\)

$$\begin{aligned} \begin{aligned} \Pi _N(\varvec{\mu }):= \underline{{\hat{\rho }}}_N(\varvec{\mu }) = [\underline{{\hat{\rho }}}_N^1(\varvec{\mu }),\dots ,\underline{{\hat{\rho }}}_N^N(\varvec{\mu })]^T. \end{aligned} \end{aligned}$$

We make use of an ANN to learn the \(L^2\) projection map \(\Pi _N\): to train the map, we employ the set of input samples \(\varvec{\mu }\) for \(\varvec{\mu }\in {\mathcal {P}}^{\text {train}}\), and the set of output samples \(\{\underline{{\hat{\rho }}}_N(\varvec{\mu })\text { for }\varvec{\mu }\in {\mathcal {P}}^{\text {train}}\}\). In this algorithm, we do not use the optimal calibrated points obtained with the optimization process, but we use the ones predicted by the ANN of Sect. 4.1. Doing so, any systematic error in the online calibration should be already taken into account while performing the \(L^2\) projection and automatically corrected by this approach. Moreover, it is also possible to use different training sets for the calibration optimization procedure and the model order reduction ones. More details on the architecture of the ANN employed will be provided in the numerical section.

4.4 Offline–Online Splitting

We summarize in this section the operations that should be performed in the offline part (i.e. only once for every new parametric problem one wishes to consider) and the online operations that should be performed to predict the solution for a new parameter.

Offline phase: FOM simulations for all parameters in the training set, choice of the reference calibration points, optimization procedure to find the calibration points for a training set, learning of the regression map for the calibration points for all parameters, transformation of the FOM solutions onto the reference domain, computation of the RB, learning of the regression for the coefficients-to-(reduced coefficient) map.

Online phase: given a new parameter evaluate the regression of the calibration points to know the geometrical transformation, evaluate the regression for the RB coefficients to get the solution on the reference domain, combine the two maps to get the solution on the physical domain.

5 Numerical Results

We now present some numerical results to validate the proposed methodology. We will consider different time dependent test cases: the Sod shock tube problem in 1D, in the non parametric and in the parametric setting, already introduced in Sects. 2.1 and 2.2, respectively. To further test the performance of our methodology, we subsequently consider a 2D problem, namely the double Mach reflection (DMR) problem, again in the non parametric and in the parametric setting. To conclude, some numerical results for the non parametric triple point problem will be shown. For all presented tests, the FOM consists of high order solutions obtained with an explicit Finite Volume discretization with WENO reconstruction of order 5, with explicit time discretization given by the optimal SSPRK54, with CFL coefficient 0.8 and Rusanov numerical flux.

5.1 Non-Parametric Sod Shock Tube Problem in 1D

We consider Problem (2) introduced in Sect. 1.1, where the physical domain is \(\Omega ={\mathcal {R}}=[0,1]\). The number of spatial degrees of freedom is 1500 and this leads to computational costs of around 2 min using a Fortran code [43] on a Intel(R) Xeon(R) CPU E3-1245 v5 @ 3.50GHz. In both cases, we use Dirichlet BC as the waves do not exit the domain before the final time. The details of all the relevant quantities are presented in Table 1.

Table 1 Test details for the 1D Sod shock tube problem, non parametric (Nonparam) and parametric (Param) setting
Fig. 7
figure 7

FOM simulation of Sod 1D problem non parameteric at times 0.04 (green), 0.1 (orange) and 0.16 (blue), on the original domain (top) and calibrated on the reference domain (bottom)

Figure 7 shows some snapshots for the density \(\rho \), the moment m and the energy E at three different times of the simulation: it is clear that the structures of all three the components of the solution present discontinuities that travel in the domain at the same locations.

We carry out the calibration technique proposed in Algorithm 1: we choose as reference solution \(\overline{\rho }\) the density \(\rho (t=0.16)\). We then choose \(M=4\) control points \(\overline{\varvec{w}}_1=0.2,\,\overline{\varvec{w}}_2=0.4,\,\overline{\varvec{w}}_3=0.6,\,\overline{\varvec{w}}_4=0.8\) equispaced in the reference domain \({\mathcal {R}}=[0,1]\). The calibrated solutions (using the ANN to forecast the calibration points) are shown in Fig. 7: the main features of the solutions, namely the shock, the contact discontinuity and the rarefaction wave are well aligned with the reference solution. The details of the quantities required to carry out the calibration step are listed in Table 2. We point out that the calibration step and the training of the ANN have been carried out on the training set \({\mathcal {P}}^{\text {train}}\subset [0.01, 0.16]\subset [0, 0.2]\) sampled with 25 equispaced parameters. The reason for excluding the first timesteps from the training is that the minimization is tricky during the first timesteps: indeed we have a transition phase, during which all the features are in the same point, leading to non invertible maps. An alternative way to overcome this difficulty could be to restore to local reduced basis spaces [44], or to use a FOM approach for the first timesteps. The final times are excluded to test the extrapolatory performances of the ROM.

Fig. 8
figure 8

Sod 1D: Eigenvalue decay of the PODs (normalized to have \(\lambda _1=1\)) non parametric (left) and parametric (right)

Table 2 Calibration of the 1D Sod shock tube problem, non parametric and parametric setting
Table 3 Architecture and details of the calibration-NN and the POD-NN in both non parametric and parametric setting
Fig. 9
figure 9

Sod 1D non parametric: The first modes obtained by compressing the original manifolds (Eulerian on the top) and the calibrated manifolds (ALE on the bottom) for \(\rho \) (left), m (center) and E (right), with POD with \(\tau _{\text {POD}}=10^{-4}\) and \(N^{\text {max}}_{\text {POD}}=7\)

Fig. 10
figure 10

Sod 1D non parametric: Online approximation of the density \(\rho \) at times 0.04, 0.12 and 0.2 (left to right). Top row: Eulerian ROM simulations with \(N=7\) modes on \(\Omega \). Central row: ALE ROM simulations on \(\Omega \) with \(N=3\) modes. Bottom row: ALE ROM simulations on \({\mathcal {R}}\) with \(N=3\) modes

Figure 8 (left) shows the eigenvalue decay of the POD for both calibrated (ALE, in blue) and original (Eulerian, in red) approaches. We can see that, differently from the Eulerian approach, for the calibrated approach the first eigenvalue is much more relevant than all the others and the Kolmogorov n–width decay is much faster. Figure 9 shows the behavior of the first modes obtained by compressing with a POD the non-calibrated manifolds, for the three conservative variables \(\rho \), m and E. We remark that also here we consider the FOM solutions for \(t\in {\mathcal {P}}^{\text {train}}\), thus excluding the initial and the final times from the compression. The modes are highly oscillatory, because they struggle to correctly represent the positions of the three discontinuities in the domain. Figure 9 also shows the POD modes obtained by running a POD on the calibrated manifolds: after the calibration the oscillations in the modes are much milder and focused on the discontinuity locations. Figure 10 shows some POD-NN results with \(N=7\), without calibration, for the density \(\rho \) at different times t (including the extrapolatory regime at \(t=0.2\)). The comparison is made between the FOM solution \(\rho \), its \(L^2\) projection onto the reduced space generated by the first \(N=7\) modes of the non-calibrated manifold (Eulerian modes) and the online reconstruction obtained using a linear combination of said modes, with coefficients that are predicted by an ANN. We can see that the Eulerian approach fails to correctly reproduce the calibrated FOM solutions: in particular, the standard MOR struggles to capture the correct position of the discontinuities, and it shows some oscillations in the online approximation that are most likely due the highly oscillating nature of the Eulerian modes themselves. Figure 10 also shows some POD-NN simulations obtained after the calibration procedure (computational cost of prediction of both calibration points and ROM coefficients below 0.001s); here we use \(N=3\) modes as the POD algorithm stops earlier for the imposed tolerance. We can see that we obtain a very good approximation of the calibrated snapshots in the reference configuration \({\mathcal {R}}\), i.e., \({\hat{\rho }}\), and in the physical domain \(\Omega \), i.e. \(\rho \), with the main features correctly reproduced by the online solution. We stress the fact that \(t=0.2\) is outside the training interval \({\mathcal {P}}^{\text {train}}\): in this case, the positions of the shock, the contact and the rarefaction wave have been slightly misplaced by the online model, hence, the approximation is not as great as in the interpolatory regime. All the details on the architecture of the ANN used to learn the calibration map and to predict the online solution are summarized in Table 3.

5.1.1 Validation of the Calibration Strategy

Fig. 11
figure 11

Calibration error for Sod 1D at different times (parameter \(\varvec{\mu }\)) using different initial guess times (parameter \(\bar{\varvec{\mu }}\)), for the FV solutions. Left: \(\overline{\rho }=\rho (\bar{\varvec{\mu }})\), \(\varvec{w}^{(0)}(\varvec{\mu })=\varvec{w}^{ex}(\bar{\varvec{\mu }})\), error measure (8). Right: \(\overline{\rho }=\rho (\bar{\varvec{\mu }})\) and \(\bar{\varvec{w}}=\varvec{w}^{(0)}(\varvec{\mu }) = \lbrace 0.2,0.4,0.6,0.8\rbrace \) for all \(\varvec{\mu }\), error measure (9)

Fig. 12
figure 12

Calibration error for Sod 1D at different times (parameters) using different calibration strategies, for the FV solutions. Left: exact waves calibration as initial guess and right for equispaced initial guess calibration points. Error measure: (8) (left) and (9) (right)

In this section, we focus on the two following aspects: the initial guess considered in the calibration, for each parameter, and the order of the parameters for the calibration. To validate our methodology, we perform two types of tests, described in the following paragraphs. We recall that, since we are in the non-parametric case here, the parameter \(\varvec{\mu }\) represents only the time t. Again, we also remark that we focus here on the density \(\rho \).

Reference solution choice in the calibration

We are interested in performing the calibration for every parameter, by changing the reference calibration points \(\bar{\varvec{w}}\) through the free coordinates \(\bar{\varvec{\theta }}\), the reference solution \(\bar{\rho } = \rho (\bar{\varvec{\mu }})\) and the initial guess \(\varvec{\theta }^{(0)}(\varvec{\mu })\) in the minimization problem (4). We then evaluate the calibration error: the chosen measures for estimating this error will be clarified later on in this paragraph.

Exact characteristics calibration reference points. For this test, we know the exact solution of the problem (2), and therefore the exact location of the rarefaction and the discontinuities of the density function. To perform the calibration, we choose a reference parameter \(\bar{\varvec{\mu }}\in {\mathcal {P}}\), and we consider \(\overline{\rho }=\rho (\bar{\varvec{\mu }})\) and \(\bar{\varvec{\theta }}=\varvec{\theta }^{ex}(\bar{\varvec{\mu }})\), where \(\varvec{\theta }^{ex}(\bar{\varvec{\mu }})\in {\mathbb {R}}^4\) is the vector of the exact locations of the beginning and end of the rarefaction wave, the contact and the shock discontinuities for \(\rho (\bar{\varvec{\mu }})\). We then carry on the calibration, for every \(\varvec{\mu }\in {\mathcal {P}}^{train}\): the initial guess \(\varvec{\theta }^{(0)}(\varvec{\mu })\) will be \(\varvec{\theta }^{ex}(\bar{\varvec{\mu }})\). We study what happens if we change \(\bar{\varvec{\mu }}\): the measure that we consider to estimate the calibration error in this case is the 1-norm of the difference between \(\varvec{\theta }^{ex}(\varvec{\mu })\) (the vector of exact locations of beginning and end of the rarefaction wave, the contact and the shock discontinuities for the \(\rho (\varvec{\mu })\)) and \(\varvec{\theta }^{opt}(\varvec{\mu })\) (the output of the calibration for the parameter \(\varvec{\mu }\)), namely

$$\begin{aligned} || \varvec{\theta }^{opt}(\varvec{\mu })-\varvec{\theta }^{ex}(\varvec{\mu })||_1. \end{aligned}$$
(8)

This choice allows us to measure how far our output is from the actual location of the discontinuities and rarefaction of the solution. The result is reported in Fig. 11 (left). Using the initial guess close to the final time results in the largest errors when calibrating early time solutions, while the opposite seem less problematic and optimally one could choose a reference solution close to the initial time to have the least amount of error for all the considered optimized parameters. It is also clear from this test that the best choice would be to choose the initial guesses close to the parameter that we are calibrating minimizes the error, i.e. staying close to the diagonal. And this is what we are actually doing in presented strategy in Sect. 2.1 in (5).

Equispaced calibration reference points. For this second test, we assume we do not know the exact solution, nor the wave locations. In this case, we choose as reference calibration points \(\bar{\varvec{\theta }} = \lbrace 0.2,0.4,0.6,0.8 \rbrace \) equispaced points in [0, 1]: these points will be fixed for all the reference parameters \(\bar{\varvec{\mu }}\) we will use and they will also be used as initial guess of the calibration process. We then choose \(\overline{\rho }=\rho (\bar{\varvec{\mu }})\), and we analyse the results of the calibration when we change \(\bar{\varvec{\mu }}\). The measure chosen to study the error of the calibration (since we do not know \(\varvec{\theta }^{ex}(\varvec{\mu })\)) is the following:

$$\begin{aligned} ||{\hat{\rho }}(\varvec{\mu }, \varvec{w}(\varvec{\theta }^{opt}(\varvec{\mu }))) - \Pi _{\overline{\rho }}{\hat{\rho }}(\varvec{\mu }, \varvec{w}(\varvec{\theta }^{opt}(\varvec{\mu })))||_2^2, \end{aligned}$$
(9)

where \(\Pi _{\overline{\rho }}\) is the projection onto the linear space spanned by the chosen reference solution. The results are reported in Fig. 11: again, we can see that choosing a reference solution close to the parameter of interest leads to smaller errors and that using one reference solution the optimal choice would be something close to \(\bar{\varvec{\mu }} \approx 0.1\).

Order in which we perform the calibration

Table 4 Details for the order calibration test, where we analyze the calibration output according to the order (in time) in which we perform it

Motivated by the results shown in Fig. 11, we now use as initial guess the calibration points found for the closest parameter already computed. We will compare the different orders described in Table 4. In particular, we will try to start from the final time or the initial one and we will use either all 100 timesteps we have in the training set or only one every 10 of these. Finally, we also compare it with a fixed initial guess strategy. As we can observe in Fig. 12, starting with the final solution as reference solution, we obtain considerably lower errors all along the time domain, almost independently on how many parameters we sample (this will not be true for more complicated tests in 2D), with respect to using the initial solution as a reference one. The authors are anyway surprised by how the error using the initial solution as reference is not too large. Also the fixed initial guess at the final time produces interesting results.

In 2D tests, we cannot perform similar comparison, as no exact calibration could be performed. We can undoubtedly say that in 2D being close to the initial guess is of more importance than in 1D, hence, in general, it is better to have a large enough training set.

5.2 Parametric Sod Shock Tube Problem in 1D

Fig. 13
figure 13

Sod 1D parametric: The first modes obtained by compressing the original manifolds (Eulerian on the top) and the calibrated manifolds (ALE on the bottom) for \(\rho \) (left), m (center) and E (right), with POD with \(\tau _{\text {POD}}=10^{-4}\) and \(N^{\text {max}}_{\text {POD}}=7\)

Fig. 14
figure 14

Sod 1D parametric: Online approximation of the density \(\overline{\rho }\) at times 0.04, 0.12 and 0.2 (left to right), for \(\rho _L=1.047937,\, \rho _R=0.122810,\, p_L=1.203980,\, p_R=0.144468\). Top row: Eulerian ROM simulations on \(\Omega \) with \(N=7\) modes. Central row: ALE ROM simulations on \(\Omega \) with \(N=4\) modes. Bottom row: ALE ROM simulations on \({\mathcal {R}}\) with \(N=4\) modes

Fig. 15
figure 15

Sod 1D parametric: Online approximation of the density \(\overline{\rho }\) at times 0.04, 0.12 and 0.2 (left to right), for \(\rho _L=1.08827,\, \rho _R=0.149654,\, p_L=1.193154,\, p_R=0.078459\). Top row: Eulerian ROM simulations on \(\Omega \) with \(N=7\) modes. Central row: ALE ROM simulations on \(\Omega \) with \(N=4\) modes. Bottom row: ALE ROM simulations on \({\mathcal {R}}\) with \(N=4\) modes

We now consider the parametric version of the Sod problem, already introduced in Sect. 2.2. We recall that we consider \(\mu =(\rho _L, \rho _R, p_L, p_R)\in {\mathcal {P}}_{\text {phys}}\subset {\mathbb {R}}^4\). All the details for the numerical simulations are provided in Table 1. Also in this case, the FOM solutions have been obtained employing the same FV discretization. In the parametric setting, we generate the training space \({\mathcal {P}}_{\text {phys}}^{\text {train}}\) using \(N_{\text {train}} = 16\) randomly selected parameters \(\mu \) from \({\mathcal {P}}_{\text {phys}}\). Again, we consider the training time interval [0.01, 0.16] discretized with around 45 times for each physical parameter.

Figure 13 shows the first modes for the three components \(\rho \), m and E, without calibration (Eulerian approach) and with calibration (ALE approach), respectively. Also in the parametric case, we can notice that the Eulerian modes are highly oscillating, similarly to the non-parametric test case: the calibration helps to significantly mitigate this phenomenon. To further validate this, we show in Fig. 8 (right) a comparison between the rate of decay of the eigenvalues obtained with a POD on the non-calibrated (red) and calibrated (blue) manifolds. The calibration results in an improvement in the rate of decay and we clearly observe that, in comparison to the non parametric case, the decay is slower and we need more basis functions to represent our solution manifold. All the details for the numerical implementation of the calibration procedure are summarized in Table 2.

Figures 14 and  15 represent the FOM, the \(L^2\) projection on the reduced space and the POD–NN online approximation for \(\rho \), for two parameters in the test set. We plot both the Eulerian ROM and the ALE one, for the latter both in the physical \(\Omega \) and in the reference domain \({\mathcal {R}}\). The online approximations are obtained with \(N=4\) modes. As we can see, in both cases the Eulerian ROM is struggling to correctly capture the positions of the discontinuities, and it provides an approximated solution that exhibits some non-negligible oscillations, most likely due to the oscillating nature of the Eulerian modes themselves. The results provided with the calibration are much more accurate, since the MOR is now able to correctly represent the positions of the discontinuities, and it does not present any oscillations in the approximations. There are minor flaws in the extrapolatory regime and in the early times, still keeping the quality of the reduced solution very high.

All the details on the architecture of the ANN used to learn the calibration map and to predict the online solution are summarized in Table 3.

5.3 Non-Parametric DMR Problem in 2D

We now consider a 2D test case, namely the Double Mach Reflection (DMR) problem [35]. Let \(\Omega = [0, 4]\times [0,1]\): we consider the Euler Eq. (2), in the time interval [0, 0.25], with the following IC

$$\begin{aligned}&{\left\{ \begin{array}{ll} (\rho _L, u_L, v_L, p_L) = ( 8, 8.25 \cos (\beta ) , - 8.25 \sin (\beta ), 116.5 ) & \varvec{x}\in \Omega _L(\beta ,t=0)\\ (\rho _R, u_R, v_R, p_R) = ( 1.4, 0 , 0, 1 ) & \varvec{x}\in \Omega _R(\beta ,t=0) \end{array}\right. } \end{aligned}$$
(10)
$$\begin{aligned}&\Omega _L(\beta ,t) = \left\{ \varvec{x}\in \Omega : x<\frac{1}{6} + \tan (\beta ) y + \frac{10}{\cos (\beta )}t \right\} , \qquad \Omega _R(\beta ,t)=\Omega \setminus \Omega _L(\beta ,t), \end{aligned}$$
(11)

with \(\beta =\frac{\pi }{6}\). The BCs are assigned through ghost cells as

$$\begin{aligned} (\rho , u,v,p)={\left\{ \begin{array}{ll} (\rho _L, u_L,v_L,p_L), & \text {if }x=0 \text { or }(y=1 \text { and } \varvec{x}\in \Omega _L(\beta ,t)),\\ (\rho _R, u_R,v_R,p_R), & \text {if } y=1 \text { and } \varvec{x}\in \Omega _R(\beta ,t),\\ (\rho _{in}, u_{in},-v_{in},p_{in}),& \text {if }y=0 \text { (wall BC)},\\ (\rho _{in}, u_{in},v_{in},p_{in}),& \text {if }x=4 \text { (outflow)}, \end{array}\right. } \end{aligned}$$
(12)

where \(\cdot _{in}\) denotes the value inside the domain at the corresponding boundary cell.

Table 5 Calibration of the 2D DMR problem, non parametric and parametric setting
Fig. 16
figure 16

DMR: Eigenvalue decay of the PODs (normalized to have \(\lambda _1=1\)), non parametric on the left, parametric on the right

Fig. 17
figure 17

FOM solution for \(\rho \) for DMR non parametric at times 0.05 (top), 0.15 (center) and 0.25 (bottom) in the physical domain \(\Omega \) (left) and, after calibration, in the reference configuration \({\mathcal {R}}\) (right). We mark on the plots the control points and the Cartesian grid that links them in the reference domain and its image through T on the physical one. We plot in white 20 contour lines at equispaced values between 1 and 25

Figure 17 (left column) represents the FOM snapshots for the density \(\rho \) at three different times of the numerical simulation; here, the same FV scheme has been employed at the FOM level on a mesh of \(2400\times 600\) cells (computational time of 5 days), then downsampled to a mesh of \(240\times 60\) cells to perform the offline phase (including calibration) in reasonable computational times. We retain 500 time samples in [0, 0.25] of which we include 100 in the training set of the calibration procedure (every \(\Delta t =0.0025\)) and 45 in the training set of the reduced algorithms all in [0.02, 0.2].

Fig. 18
figure 18

DMR non parametric: Error in time of reduced methods with different number N of modes

Figure 16 (left) shows the rate of decay of the eigenvalues returned by the POD on \({\mathcal {M}}_{\rho }\) (red): also for this test case we have a solution manifold with a slowly decaying Kolmogorov n–width, due to the fact that the shock moves inside \(\Omega \). We therefore perform a calibration procedure, using Algorithm 1 and the 2D geometrical transformation map T introduced in Sect. 3.2: all the details for the calibration step are summarized in Table 5. Figure 17 (right column) shows the outcome of the calibration for the density \(\rho \), at different times: here the snapshots are represented in the reference configuration \({\mathcal {R}}\) (computational time for forecasting the calibration points around 0.05s). With the calibration procedure, we obtain an improvement in the rate of decay of the eigenvalues, as it is shown in Fig. 16 (blue line). To conclude, in Fig. 18 we show the behavior in time of the approximation error, between the FOM solution and the online solution (computational time to evaluate the NN for the ROM coefficients 0.001s), with or without calibration, according to the number N of modes used. Both errors have been computed in the physical domain \(\Omega \) and are defined as:

$$\begin{aligned} \frac{||\rho (t) - \rho _N(t)||_{L^2(\Omega )}}{||\rho (t)||_{L^2(\Omega )}}\quad \text { and } \quad \frac{||\rho (t) - {\hat{\rho }}_N(t)\circ T^{-1}[\varvec{w}^{\text {opt}}(t)]||_{L^2(\Omega )}}{||\rho (t)||_{L^2(\Omega )}}. \end{aligned}$$
(13)

As we see in Fig. 18, the error of the online approximation (with calibration) does not go below a certain lower bound, even increasing the number of bases. We recall that, in the calibrated setting, we are interpolating the solutions to perform the transformations: for this reason, we believe that, after a certain number N of modes, the interpolation error dominates the global error, and this leads to the plateau that one can observe in the figure. We notice that with around 30 basis for the POD in the Eulerian framework we achieve errors that are comparable with the ALE solutions.

Fig. 19
figure 19

DMR non parametric: ROM solutions for \(\rho \) in \(\Omega \) at times 0.05 (top), 0.15 (center) and 0.25 (bottom). Left column: ALE ROM solution with \(N=2\). Right column: Eulerian ROM solution with \(N=30\). We plot in white 20 contour lines at equispaced values between 1 and 25

Nevertheless, the qualitatively comparison of the Eulerian and ALE approach at the ROM level with \(N=2\) for the ALE approach and \(N=30\) for the Eulerian approach depicted in Fig. 19 is still in favor of the ALE approach. Indeed, the Eulerian ROM shows an oscillatory behavior that deteriorates the shape of the solution, the shock position and the flat areas, which are not anymore flat. On the contrary, the ALE ROM solutions are very similar to the FOM ones and they preserve all the original features even with a much smaller reduced basis. So, even if the \(L^2\) errors of the two approaches are comparable, the quality of the two solutions is very different.

5.4 Parametric DMR Problem

In this section, we consider the parametric version of the 2D DMR problem: the physical parameter is the angle \(\beta \) introduced in Sect. 5.3; the physical parameter interval is \({\mathcal {P}}_{\text {phys}}=[0.1,0.675]\), and the time interval is [0, 0.2]. Also in this case, the FOM snapshots have been obtained with the same FV scheme, on a mesh \(600\times 150\) (computational time of 2 h each) then downsampled to \(200\times 50\) for reduction of computational time of the offline phase. In the training set, we include for each physical parameter 51 snapshots every \(\Delta t=0.004\), and we use all the snapshots for the calibration.

Fig. 20
figure 20

DMR parametric FOM solution for \(\rho \) at times \(t=0.096\), \(t=0.148\) and \(t=0.2\) (top to bottom) in the physical domain \(\Omega \) (left) and, after calibration, in the reference configuration \({\mathcal {R}}\) (right). We mark on the plots the control points and the Cartesian grid that links them. Parameter values \(\beta =0.225\) (top) and \(\beta =0.675\) (bottom). We plot in white 20 contour lines at equispaced values between 1 and 25

Fig. 21
figure 21

DMR parametric: Error in time of reduced methods with different number N of modes. Parameter in test set \(\beta =0.675\)

Figure 20 shows some snapshots, for two different values of \(\beta \) and at different times, before and after the calibration: all the details for the calibration procedure are summarized in Table 5. We also depicted the control points grid on the reference domain and its transformation onto the physical one, showing how the tracking of the interesting point is done and how much distortion we can get with such transformations. As we can see from Fig. 16, also in the parametric case the calibration procedure improves significantly the rate of the decay of the eigenvalues returned by the POD and hence, ultimately, the Kolmogorov n-width of the problem under consideration. In Fig. 21, we plot the behavior of the relative error on the physical domain, as explained in Sect. 5.3 in (13), varying time and for different number of modes used in the reduced spaces. On the left, we plot the error between the FOM solution and the \(L^2\) projection onto the reduced space; on the right, we have the error obtained using the POD-NN to predict the online solution. We see that both the Eulerian and the ALE projection errors improve as we increment the number of POD modes, with the Eulerian being always much larger. In the POD-NN error, on the other side, the decay of the error is slower and it seems to stagnate at some bottleneck values, in particular for the ALE case. That is why we aim at extending this work in the future with a hyper-reduced Galerkin projection approach, to reintroduce some mathematical rigorousness hoping to decrease the online error. Finally, in Fig. 22 we represent the online solutions for \(\beta =0.225\) and \(\beta =0.675\), both with the Eulerian and the ALE approach with \(N=6\). Similarly to what happens in the non parametric test case, the Eulerian approach struggles to reproduce the FOM solution, providing an approximation that sometimes even loses the main features (the shape of the solution, the shocks, the flat areas). On the contrary, with the ALE approach, the online approximation preserves all these features. The two parameters shown validate the ability of this ROM approach to work in strongly nonlinear parametric context, where the parameters changes the solution’s feature geometry, the values of the solution and vaguely the structure of the features. On the other hand, we remark that this approach works only for quasi-self-similar solutions, where we can recognize a similar structure along the parameter domain.

Fig. 22
figure 22

POD-NN solutions with \(N=6\) of \(\rho \) for DMR parametric at times \(t=0.096\), \(t=0.148\) and \(t=0.2\) (top to bottom) in the physical domain \(\Omega \). Left column: with the calibration of the manifold. Right column: without calibration. Parameter values \(\beta =0.225\) (top) and \(\beta =0.675\) (bottom). We plot in white 20 contour lines at equispaced values between 1 and 25

5.5 Triple Point Non Parametric

The applications of the proposed algorithm are various, we have showed some self-similar solutions, but this class includes a huge amount of test problems: for example, many hypersonic problems where a shock or multiple shocks are present in the solution as airfoil simulations, water waves propagation or acoustic waves. We want to solve another test here that involves a more complicated solution structure, which is a triple point shock interaction test. We consider a physical domain \(\Omega =[0,7]\times [0,3]\). The initial conditions are:

$$\begin{aligned} {\left\{ \begin{array}{ll} (\rho _W, u_W, v_W, p_W) = (1,20,0,1) & \varvec{x}\in [0,1]\times [0,3],\\ (\rho _{NE}, u_{NE}, v_{NE}, p_{NE}) = (0.125,0,0,0.1) & \varvec{x}\in [1,7]\times [1.5,3], \\ (\rho _{SE}, u_{SE}, v_{SE}, p_{SE}) = (1,0,0,0.1) & \varvec{x}\in [1,7]\times [0,1.5]. \end{array}\right. } \end{aligned}$$
(14)

Boundary conditions are transmissive on the right, Dirichlet with state \((\rho _W, u_W, v_W, p_W)\) on the left and reflective at the top and bottom of the domain. Final time is set to \(t_f=0.25\). We solve the problem on a grid \(2800\times 1200\) and then we downsample it to \(280\times 120\) for the reduction of computational time in the offline phase. We have used in the training set of both calibration and reduced algorithms \(N_\mu =100\) snapshots at \(\Delta t=0.0025\).

Fig. 23
figure 23

FOM (first and second line), ALE POD-NN (third line) and Eulerian POD-NN (bottom) solutions with \(N=7\) of \(\rho \) at times \(t=0.1275\) (left) and \(t=0.25\) (right) in the physical domain \(\Omega \). The first line shows also the calibration points. We plot in white 20 contour lines at equispaced values between 1 and 15

In Fig. 23, we show the solutions of the calibration points, FOM, ALE ROM and Eulerian ROM at times \(t=0.1275\) and \(t=0.25\). The ROM solutions are obtained with a reduced basis of dimension \(N=7\). In the first FOM plots, we also plot the \(M_1\cdot M_2=8\cdot 6\) optimal calibration points (the one at the final time are essentially the reference one) that vaguely surround the most dynamical area. Again, the ALE ROM performs much better than the Eulerian ROM in catching the right position of the waves and to sharply represent them. On the other side, there is a slight mistake in the calibration in catching the most right shock that is represented by a vertical contour line, which is not perfectly represented in the ALE ROM.

Fig. 24
figure 24

Triple point non parametric: Error in time of reduced methods with different number N of modes

In Fig. 24, we observe that also here the error of the ALE approach is much smaller than the Eulerian, but that the interpolation error is probably a lower bound for the ALE ROM that keeps it from decreasing with the increasing of the reduced basis dimension. On the other side, for the Eulerian ROM the error decays increasing the reduced basis dimensions, even if with irregularity due to the spurious oscillations. All the parameters not specified are as in the DMR non-parametric case.

6 Conclusions

We presented a novel, optimization-based calibration technique suited for hyperbolic conservation laws with (quasi) self-similar solutions that present multiple travelling structures, such as discontinuities. We combined the calibration technique with an ANN based Model Order Reduction, in order to obtain a non intrusive approximation setting that is able to provide satisfying results both in the non parametric and in the parametric framework, without the use of implicit shock tracking techniques, which additionally translates into a much more limited computational effort during the offline phase. To test the proposed methodology and to show its broad range of applicability, we considered three time-dependent problems of interest: the 1D Sod shock tube problem (non parametric and parametric), the 2D DMR problem (non parametric and parametric) and the non parametric triple point problem. In all of our tests, we have shown the benefits of using the proposed calibration based MOR: this is confirmed not only by the comparison on the rate of decay of the eigenvalues returned by the PODs, but also by the behavior in time of the relative \(L^2\)-errors (in the physical domain \(\Omega \)) obtained with the two approaches. To conclude, a qualitative comparison on the FOM solutions and the ROM solutions (with and without the calibration approach) is provided, in order to highlight that, by using a smaller number of modes, our proposed methodology is able to correctly capture all the important features of the full order solutions. Indeed, classical ROMs produce oscillations, smear the shocks and cannot preserve flat areas, while the presented calibrated version does, even in the context of multiple intersecting shocks and waves (such as in the triple point test).

We also showed the robustness of the calibration algorithm with respect to the choice of the reference solution \(\overline{\rho }\), the initial guess \(\varvec{\theta }^{(0)}(\varvec{\mu })\) and the order with which we perform the calibration: all the tests have been performed for the 1D problem, for which an analytical solution is available. The results show that the calibration algorithm provides good results, almost independently on the choice of the reference control points. Nevertheless, we are aware that a more in depth study has to be carried out on the number of control points to choose: we envision this as a future development of the proposed work. The replacement of the Neural Networks with a purely ALE approach for the online system is a work in progress and a future extension of this present work. At the time being, the strongest limitations of our method to get more physically complicated solutions are two: the fact that the domain \(\Omega \) needs to be rectangular, and the fact that the configuration of the features should not vary too much, in a topological sense. We intend to address both points in the future, introducing techniques that can map non-polyhedral shapes into rectangles [33], and using local ROMs for different parameter/time zones.

The proposed approximation setting is based on the use of piecewise cubic Hermite interpolating polynomials (or on some tensorial product of them), and works well with rectangular domains and Cartesian meshes: the extension of this approach to more complex geometries and other kinds of meshes (i.e. triangular ones) is envisioned as another future direction of this work. We also remark that, so far, we only worked with FV approximations of the full order solution. We expect to generalize the whole methodology to other discretization techniques.