1. Introduction
Large-scale aluminum alloy workpieces are key components in equipment manufacturing for the aerospace, new energy (such as wind power generation), national defense, and military industries. They are commonly employed in aircraft keel and fuselage load-bearing frames, wind turbine generator frameworks, and end rings of missiles and rockets. These applications impose strict quality control and performance requirements on aluminum alloy materials. Large-scale aluminum alloy workpieces that have undergone effective quenching thermal treatment exhibit significantly enhanced hardness, strength, toughness, and corrosion resistance, which allows them to meet the requirements of designers and engineers for high-performance, lightweight, and environmentally friendly products. Therefore, research on large-scale aluminum alloy workpieces is significant across multiple industries.
The quenching thermal treatment process is a critical step to improve the performance of aluminum alloy workpieces and ensure their safety and reliability [
1]. The large-scale vertical quenching furnace is the key equipment in the thermal treatment process. To achieve qualified product quality, the uniformity of temperature distribution in the quenching furnace must be maintained within ±3 °C during the thermal treatment process, which is a great challenge to the control of temperature uniformity in large-scale temperature fields. The real-time temperature field distribution of aluminum alloy workpieces, as the most critical feedback information for the furnace temperature control system, is the core problem of quenching thermal treatment process research.
The large-scale vertical quenching furnace is 31 m in height and 3.5 m in radius, as shown in
Figure 1. From the outside to the inside, the furnace comprises the furnace wall, heating chamber, working chamber, and aluminum alloy workpiece. The workpiece is suspended in the center of the working chamber via a hook. The furnace wall consists of slag wool and other insulation materials to reduce heat loss. The heating chamber, equipped with electric heaters, evenly surrounds the lateral wall of the working chamber from top to bottom. This so-called multi-zone heating method can effectively increase temperature uniformity. The wall of the working chamber, made of stainless steel, aims to further improve the uniformity of the workpiece temperature. Moreover, two high-power ventilators are symmetrically installed at the bottom of the furnace to force air circulation, which enhances convective heat transfer and accelerates the temperature rise. From the structural analysis of the quenching furnace, obtaining the real temperature field of the workpiece presents several challenges: (1) The workpiece is suspended in the center of the furnace, and with only a limited number of temperature sensors installed on the inner wall of the working chamber, there is a significant discrepancy between the measured temperature and the actual workpiece temperature; (2) Multi-zone heating and forced air circulation are employed in the furnace, which give rise to the coexistence of different heat exchange modes in the thermal treatment process, leading to strong coupling within the furnace; (3) As the temperature increases, hot air in the furnace gradually escapes, causing continuous changes in the furnace’s atmosphere.
Temperature uniformity is a crucial factor in determining the product quality of thermal treatment processes. Numerous research efforts have focused on developing methods for achieving temperature uniformity during thermal treatment processes [
2,
3]. A temperature prediction model for aluminum alloy workpieces, based on a simplified zone method, has been developed to achieve temperature uniformity distribution in the quenching furnace [
4]. In conjunction with the zone method, a heat conduction model for billets affected by radiation and convection was constructed, adopting a suitable convective heat transfer empirical coefficient. This method significantly improves the uniform temperature rise of billet in a large reheating furnace [
5]. Peck [
6] presented a novel reduced-order transfer-function-based mathematical model for temperature uniformity of a gas powered industrial box furnace, demonstrating that fine adjustments to the controller can reduce temperature rise time while maintaining compliance with Temperature Uniformity Survey (TUS) standards. Xu [
7] established a three-dimensional turbulent flow fluid–structure interaction heat transfer model for pit furnaces. By analyzing the flow field and temperature field, the study provides a wall function method to improve the annealing temperature uniformity. Han [
8] introduced a comprehensive full-scale industrial reheating furnace model, simulating fuel combustion, flow field, temperature distribution, and slab heating process while emphasizing the significance of skid buttons and dislocated skids. Model predictions align with industrial measurements, demonstrating its applicability for analyzing furnace temperature uniformity distribution. Based on the zone method of radiation analysis, two- and three-dimensional mathematical models for the simulation of the transient thermal behavior in a large bloom reheating furnace were established. The study confirms the feasibility and practicality of zone modeling for integration into a model-based furnace temperature uniformity control system [
9]. These studies have significantly contributed to our understanding of temperature uniformity research.
To obtain the workpiece temperature in the thermal treatment process by constructing a mechanism model of the temperature field in the furnace, partial differential equations (PDEs) that describe the thermal treatment process can be solved by numerical methods [
10,
11,
12,
13]. The finite element method (FEM) is a numerical technique for solving complex problems in engineering and physics by discretizing large domains into smaller elements. It approximates solutions for partial differential equations (PDEs) by considering the behavior of each element and assembling them into a global system, thus facilitating the analysis of structural, thermal, and fluid dynamics phenomena with increased accuracy and flexibility. Ifis [
14] developed an annealing thermal treatment temperature field model for billets and employed FEM for temperature prediction. To analyze the influence of flow field on transient temperature field and smoke emissions, the FEM was employed to explain the dynamic heat transfer effects in various furnace atmospheres influenced by burner angle, preheating temperature, and air–gas ratio [
15]. In Refs. [
16,
17], dynamic boundary conditions within the furnace were described using UDF (User-Defined Function) in the FLUENT platform, enabling the calculation of the entire temperature field and providing guidance for energy-saving production in the thermal treatment process. However, to ensure calculation accuracy, numerical simulation methods can be time-consuming [
18,
19]. To address this issue, the extrapolation method has been applied to the finite element solution of the transient temperature field model, resulting in significantly improved solution speed and accuracy [
20]. The aforementioned methods for solving the PDEs of thermal treatment processes discretize the solution domain into numerous grid cells; coarse grids yield quicker yet less accurate results, while fine grids provide greater accuracy but are slower. Complex PDE systems, such as large-scale thermal treatment processes with stringent temperature uniformity requirements, typically require very fine discretization. As a result, solving them with traditional solvers is challenging and time-consuming.
In recent years, the rapid advancement of machine learning has spurred research into utilizing neural network models for solving PDEs [
21,
22,
23]. One approach is data-driven PDE solutions. By using labeled data as input and the exact PDE solution as the target output, a neural network can capture the nonlinear relationships in the labeled data, thereby constructing a data-driven model representing the PDE. Long proposed a data-driven feed-forward neural network called PDE-NET, which uses Euler discretization for time derivative terms and approximates differential operators via constrained convolution kernels in order to approximate partial differential equation systems. The authors then upgraded PDE-NET to version 2.0, which is capable of revealing the PDE’s analytical form with minimal prior knowledge while predicting long-term dynamic behavior [
24]. Liu [
25] discussed the application of a fully connected neural network in function approximation and proposed a universal solver for basic differential equations, which mainly used automatic differentiation to solve the initial and boundary conditions of PDEs. Chen [
26] conducted learning and modeling in physical space, using measurement data as nodal values for a deep neural network to approximate unknown PDEs. E and Han [
27] introduced a novel deep learning method for solving high-dimensional PDEs that employs deep learning to approximate gradient operators based on PDE discrete schemes. However, data-driven PDE solution methods are limited by labeled data availability, posing challenges in obtaining sufficient data in some production processes. Transfer learning is a machine learning technique in which a model leverages knowledge gained from one domain to another related domain. By doing so, the model can efficiently learn and adapt to the target task more quickly, benefiting from the experience gained in the source task [
28]. In process engineering, transfer learning can be applied to soft sensor design [
29]. In Ref. [
30], the authors present a domain adaptation extreme learning machine (DAELM) inspired by transfer learning to create a simple soft sensor model for multi-grade chemical processes. An efficient model selection strategy is also developed to refine parameters. By transferring information across operating conditions, DAELM improves prediction accuracy and expands the prediction domain. A new online learning method called JITLTT-MWtr for soft sensor design in modern process industries was proposed in Ref. [
31]. By combining task transferred just-in-time learning (JITL) and a moving window (MW) learner within a transductive learning setting, the approach leverages transfer learning to effectively address drifts in operating conditions and process characteristics. Easy to implement and robust, JITLTT-MWtr demonstrates high prediction accuracy on multiple datasets, showcasing its potential for industrial applications. Transfer learning reduces training time and dependency on large amounts of labeled data, making it increasingly popular in the industrial application of machine learning techniques in the process engineering domain.
The second approach for solving PDEs involves physics-driven neural networks. This method, which is not constrained by labeled data, exhibits strong generalization capabilities. Sun [
32] developed a physics-constrained, data-free deep neural network (DNN) solution for PDEs. By incorporating governing PDEs into the DNN loss and enforcing the initial and boundary conditions through “hard” boundary enforcement, this method effectively enhances the intelligent solution of PDEs under physical constraints. Cai [
33] designed an unsupervised deep learning-based numerical approach for approximating PDE solutions via compositional construction, utilizing a first-order system least squares as the loss function to optimize the parameters of the developed neural network. Nonetheless, existing physics-driven methods without labeled data struggle to effectively address unsteady state and source–sink issues.
Utilizing structured prior information to construct a neural network model that incorporates both data and physical information, a physics-informed solution method for PDEs is presented [
34]. Physics-informed neural networks (PINNs) have emerged as a powerful paradigm for solving real-life PDE problems [
35]. In Ref. [
36], a temperature field reconstruction algorithm is proposed based on a PINN-based temperature field inversion method. This method accurately reconstructed the temperature field with limited observations. Combined with a transfer learning strategy and a coefficient matrix condition number based position selection of observations method, the training process was accelerated, and the robustness of the reconstructed model was improved. Zobeiry [
37] developed a PINN to solve the heat transfer PDE and proposed an adaptive normalizing scheme to reduce the errors in all loss terms simultaneously. This method is capable of near real-time simulation of problems with any given boundary conditions and is able to predict heat transfer even outside its training zone. In Refs. [
38,
39], PINN is used to estimate the overall thermal distribution of a transformer. However, the model is one dimensional along the transformer height. In Ref. [
40], a PINN is applied to solve the broader aspects of the boundary layer two dimensional flow equations with unbounded domains. For higher-dimensional problems, PINNs are extended to realize three-dimensional fluid temperature and fin temperature prediction without solving the energy equation [
41]. Kissas [
42] also employed PINNs to recover the entire three-dimensional velocity flow field given four-dimensional flow magnetic resonance imaging data. These studies demonstrate PINN’s potential for solving high-dimensional thermal behavior problems. However, the loss function of PINNs is a fixed weighted combination of the PDE residual term, boundary loss term, and observed data loss term, and training is sensitive to how these competing loss terms are weighted [
43]. To improve the accuracy and generalization performance of PINN, it is crucial to optimize loss weight settings.
Based on an analysis of the heat transfer mechanism in the quenching furnace, this paper establishes a 3D transient temperature field model of a large aluminum alloy workpiece and develops an MCO-PINN to realize soft sensing of the established 3D transient large-scale temperature field model. Firstly, an MLP network is constructed as a surrogate model for solving the PDE of the temperature field model, and a Gaussian activation function is chosen to ensure a Gaussian distribution in the output of the MCO-PINN. Secondly, the residual of the PDE, the boundary and initial conditions, and certain measured data are encoded into the individual loss functions of the model. By establishing a Gaussian probability distribution model of each loss function, combined with maximum likelihood estimation, consistency optimization of the weight of each loss is realized, and the reliability of model prediction is further improved. Thirdly, to improve the convergence speed of the network and expedite training, this study introduces an AIV-ECC algorithm, which rapidly determines the parameters of the MCO-PINN activation function, enhancing the generalization performance of the network, reducing sensitivity to initial values, and increasing the approximation capability of the network.
The remainder of the paper is structured as follows. Section 2 describes the MCO-PINN-based soft sensing method for the 3D temperature field of aluminum alloy workpieces. Section 3 demonstrates the validity and accuracy of the proposed soft sensing method using an industrial experiment conducted on a 31 m large-scale vertical quenching furnace. The discussion and conclusions are given in Section 4 and Section 5.
2. Methods
This section presents the proposed MCO-PINN-based soft sensing framework for the 3D transient temperature field of a large-scale aluminum alloy workpiece, which consists of the establishment of the 3D transient temperature field model of the workpiece, the construction of the MCO-PINN, and the AIV-ECC algorithm for training optimization.
2.1. 3D Transient Temperature Field Modeling for Large-Scale Aluminum Alloy Workpiece
During the thermal treatment process, the workpiece is first loaded through the furnace door within 3 to 5 min. Then, the thermal treatment process begins. The thermal treatment process for aluminum alloy workpieces represents an unsteady state heat transfer process, encompassing the internal heat conduction within the workpiece, heat convection between the hot air inside the furnace and the workpiece surface, and heat radiation between the inner wall of the working chamber and the workpiece surface.
Figure 2 depicts the schematic diagram of heat exchange in the furnace. In different periods of the thermal treatment process, the heat transfer mechanism is mainly described as
- (1)
Temperature rising period: During the initial stage of heating, the multi-zone electric heaters operate at full power, rapidly increasing the temperature of the working chamber wall through heat conduction. At this time, there is a significant amount of air inside the furnace, and the ventilators force the air to circulate in the furnace. The heated air transfers heat to the working chamber wall and the workpiece surface, resulting in intense convective heat transfer. Simultaneously, the inner wall of the heating chamber radiates heat onto the working chamber wall, which then radiates heat onto the workpiece surface. After receiving the radiant heat from the inner wall of the working chamber and the convective heat from the heated air, the temperature of the workpiece surface rises, and the heat is transferred to the interior through heat conduction, thus raising the overall temperature of the workpiece.
- (2)
Transition period: In the late stage of the temperature rising period, when the furnace temperature approaches the set value, the temperatures of the chamber walls and the workpiece surfaces gradually rise; consequently, the heat radiation effect rapidly intensifies. Simultaneously, the pressure inside the furnace rises, causing hot air to leak continuously from the furnace door, which results in a substantial decrease in air density and a rapid decline in the heat convection effect.
- (3)
Temperature holding period: When the temperature in the furnace reaches the set value, the heat treatment enters the temperature holding period. The workpiece temperature is controlled to fluctuate around the set temperature for hours to ensure complete phase change. Generally, the axial temperature distribution uniformity inside the furnace should be within ±3 °C. The ventilators idle during this period, making the convective heat exchange nearly negligible. The surface of the workpiece therefore receives only radiative heat from the working chamber wall, which is then transferred inward through heat conduction.
Therefore, during the temperature rising and transition periods, the hot air in the furnace transfers heat to the workpiece surface through convection, while the inner wall of the working chamber transfers heat to the workpiece surface through radiation. The heat is then conducted into the interior of the workpiece. In the temperature holding period, the inner wall of the working chamber transfers heat to the workpiece surface through radiation, and the heat is then conducted from the workpiece surface into its interior.
To accurately characterize this process, taking into account the cylindrical geometric structure of both the furnace and the workpiece, a 3D transient temperature field model of a large aluminum alloy workpiece is established as
where
,
,
, and
are the 3D temperature field, density, specific heat capacity, and thermal conductivity of the aluminum alloy workpiece, respectively.
,
and
represent the radial, circumferential, and axial directions of the workpiece, respectively.
is the heat flux impinging on the workpiece surface.
describes the boundary surface of the workpiece.
and
are the convection and radiation heat fluxes impinging on the workpiece surface
.
and
are the initial temperature and height of the workpiece.
is the computational domain occupied by the workpiece. By solving Equation (1), the temperature values of any position and any time on the exterior or interior of the workpiece during the thermal treatment process can be obtained.
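For orientation, the textbook form of transient heat conduction in cylindrical coordinates with a surface flux boundary condition is sketched below. It is the generic form that a model of this kind usually takes and is not a transcription of Equation (1). The symbols T, ρ, c_p, λ, q, q_c, q_r, Γ, T_0, and Ω are assumed names for the quantities listed above, a constant thermal conductivity is assumed, and n denotes the outward normal of the workpiece surface.

```latex
\begin{aligned}
\rho c_p \frac{\partial T}{\partial t}
  &= \lambda \left[
       \frac{1}{r}\frac{\partial}{\partial r}\!\left( r \frac{\partial T}{\partial r} \right)
     + \frac{1}{r^{2}} \frac{\partial^{2} T}{\partial \theta^{2}}
     + \frac{\partial^{2} T}{\partial z^{2}}
     \right],
  && (r,\theta,z) \in \Omega,\; t > 0, \\
\lambda \left. \frac{\partial T}{\partial n} \right|_{\Gamma}
  &= q = q_c + q_r,
  && t > 0, \\
T(r,\theta,z,0) &= T_0,
  && (r,\theta,z) \in \Omega .
\end{aligned}
```

With q defined as the flux entering the workpiece, the boundary condition states that the heat conducted inward from the surface equals the sum of the convective and radiative fluxes received there.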
Considering the pronounced regional characteristics of the multi-zone heating method employed in the quenching furnace, wherein the heat fluxes of each heating zone significantly affect the temperature of the workpiece surface closest to them, the outer boundary surface of the workpiece is subdivided into nine sub-surfaces corresponding to the seven thermocouple installation zones on the inner wall of the working chamber. As illustrated in
Figure 3, the nine sub-surfaces comprise the top end surface
, the seven cylindrical toruses
to
, and the bottom end surface
, respectively. The boundary surface of the workpiece
is described as
where
is the heat flux impinging on the
th sub-surface, which consists of the convective heat flux
and the radiative heat flux
.
represents the average measurement temperatures of seven thermocouples.
is the upper bound temperature when the heat flux impinging on the workpiece surface is provided only by convection heat transfer, and
represents the lower bound temperature when the heat flux impinging on the workpiece surface is provided only by radiation heat transfer. The convective heat flux
of the
th zone is calculated by the Dittus–Boelter equation, and is expressed by
where
and
.
is the convection coefficient of the
th sub-surface. For the
th zone,
is the average temperature of the inner wall of this zone,
is the Prandtl number, and
is the radius of the working chamber.
,
,
,
,
, and
represent the thermal conductivity, density, dynamic viscosity, specific heat capacity, Reynolds number, and average flow velocity of the air in the
th cylindrical zone. According to the technique setting,
. The nonlinear relationship of physical parameters with temperature is obtained through a five-degree polynomial fitting and can be described by Equation (4) [
44].
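As a rough illustration of the convective term, the sketch below evaluates a convection coefficient with the Dittus–Boelter correlation in its common textbook form, Nu = 0.023 Re^0.8 Pr^n, and converts it to a heat flux. The function names, the single characteristic length, the exponent n, and all numerical values are assumptions made for illustration; they are not taken from Equations (3) and (4).

```python
def dittus_boelter_h(k_air, rho, mu, cp, velocity, length, n=0.3):
    """Convection coefficient h [W/(m^2 K)] from Nu = 0.023 * Re^0.8 * Pr^n.

    n is commonly taken as 0.4 when the fluid is heated and 0.3 when it is
    cooled; which value suits the furnace air is an assumption here.
    """
    reynolds = rho * velocity * length / mu   # Re = rho * v * L / mu
    prandtl = mu * cp / k_air                 # Pr = mu * cp / k
    nusselt = 0.023 * reynolds**0.8 * prandtl**n
    return nusselt * k_air / length           # h = Nu * k / L


def convective_flux(h, t_air, t_surface):
    """Convective heat flux [W/m^2] received by the surface from the air."""
    return h * (t_air - t_surface)


# Hypothetical inputs: approximate air properties near 500 degC and an
# assumed flow velocity and characteristic length.
h = dittus_boelter_h(k_air=0.056, rho=0.456, mu=3.6e-5, cp=1093.0,
                     velocity=10.0, length=1.0)
q_conv = convective_flux(h, t_air=500.0, t_surface=450.0)
print(f"h = {h:.2f} W/(m^2 K), q_conv = {q_conv:.1f} W/m^2")
```

In a zone-wise model the same function would be called once per heating zone, with the air properties re-evaluated at each zone temperature.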
The radiation heat flux of the
th sub-surface is
, which is obtained by
and
are the area and net radiation heat flux of the
th cylindrical torus.
can be calculated by
where
is the area vector of the torus,
is the surface emissivity,
describes the blackbody emissive power, and
is the net radiation flux on the lateral surface of the workpiece.
and
are the direct exchange area and total exchange area between two surfaces, where the centers of the two infinitesimal sections on the surfaces are
and
.
and
are the angles between the ray and the normals to the surface sections.
and
are the heights of the infinitesimal surfaces. According to Equation (6), the total exchange area
is the key to obtaining the net radiative heat flux
.
depends only on the geometry of the furnace and is independent of temperature, so it needs to be calculated only once.
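The role of the total exchange areas can be illustrated with the generic zone-method balance, in which the net radiative heat received by surface i is the sum over j of S_iS_j (E_b,j - E_b,i), with E_b = σT^4. The sketch below is a minimal illustration under that textbook relation and is not a transcription of Equations (5) and (6); the function names, the two-surface exchange matrix, the areas, and the temperatures are hypothetical.

```python
import numpy as np

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant [W/(m^2 K^4)]


def blackbody_power(t_kelvin):
    """Blackbody emissive power E_b = sigma * T^4 [W/m^2]."""
    return SIGMA * np.asarray(t_kelvin, dtype=float) ** 4


def net_radiative_flux(total_exchange, areas, t_kelvin):
    """Net radiative flux received by each surface [W/m^2].

    total_exchange[i, j] is the total exchange area between surfaces i and j
    [m^2]. It is passed in precomputed, so it can be reused at every time step.
    """
    eb = blackbody_power(t_kelvin)
    # Q_i = sum_j S_iS_j * (E_b,j - E_b,i)
    q_net = total_exchange @ eb - total_exchange.sum(axis=1) * eb
    return q_net / areas


# Hypothetical two-surface enclosure: chamber wall (index 0), workpiece (index 1).
exchange = np.array([[0.0, 1.5],
                     [1.5, 0.0]])
areas = np.array([2.0, 1.0])
temps = np.array([823.0, 773.0])  # kelvin
flux = net_radiative_flux(exchange, areas, temps)
print(flux)
```

Because the exchange matrix is symmetric, the heat gained by the cooler surface equals the heat lost by the hotter one, which gives a quick consistency check on any precomputed exchange areas.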
2.2. Solution of 3D Transient Temperature Field Model of Workpiece Based on MCO-PINN
The proposed framework for MCO-PINN is depicted in
Figure 4. Firstly, a multilayer perceptron (MLP) is employed to construct a deep neural network with an output
, which serves as a surrogate model for the solution of the 3D transient temperature field model for the workpiece. The input
of MCO-PINN is the spatial and temporal information (
,
,
,
), while the output
is the temperature information
(
,
,
,
). The residual of the PDE, boundary and initial conditions, and certain measured data are set as constraints and encoded into the loss functions
,
,
, and
of the deep neural network for training. During training, a Gaussian probability distribution model is established for each loss function, combined with the maximum likelihood estimation method, and the weight of each loss (
and
) is adaptively adjusted. This approach transforms the global minimization of the network’s loss objective into a process of synchronously and consistently minimizing multiple losses, thereby balancing the backpropagated gradient magnitudes of the residual term, boundary loss term, and data loss term in the loss function. As a result, vanishing and exploding gradients are mitigated, and the MCO-PINN solution accuracy is improved. Finally, the nonlinear black-box approximation, large-scale data fitting, and fast calculation abilities of deep neural networks are used to address the problems that traditional numerical methods face in solving high-dimensional PDE models, such as low solution accuracy, difficult inversion of model parameters, high data sensitivity, poor adaptation to new data, and time-consuming solution in large-scale domains. Specifically:
The solution of the PDE model (1) is
in
, where
. Then, the PDE and boundary conditions shown in Equation (1) can be rewritten into the general form as
The MLP model of
fully connected layers is constructed as
and
where
and
are the weights and biases at the
th layer. Let the set of all weight matrices and bias vectors be the following
Apparently, the size and depth of the deep neural network are and .
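A forward pass through such a surrogate network can be sketched as follows, assuming that each layer applies Gaussian radial basis activations (the activation adopted in Section 2.3) followed by an affine map. The layer layout, the function names, and all numerical values are illustrative assumptions rather than the exact architecture of Equation (8).

```python
import numpy as np


def rbf_activations(x, centers, widths):
    """Gaussian radial basis activations.

    x: (n, d) inputs, centers: (m, d), widths: (m,). Returns an (n, m) array
    with entries exp(-||x - c||^2 / (2 * width^2)).
    """
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * widths ** 2))


def surrogate_forward(x, layers):
    """Apply each (centers, widths, W, b) layer in turn and return the output."""
    h = x
    for centers, widths, W, b in layers:
        h = rbf_activations(h, centers, widths) @ W + b
    return h


# Hypothetical network: inputs are (r, theta, z, t); one hidden layer with
# three RBF nodes and a single temperature output.
rng = np.random.default_rng(0)
layers = [(rng.uniform(0.0, 1.0, size=(3, 4)),   # node centers
           np.full(3, 0.5),                      # node widths
           rng.normal(size=(3, 1)),              # output weights
           np.zeros(1))]                         # output bias
x = rng.uniform(0.0, 1.0, size=(5, 4))
t_hat = surrogate_forward(x, layers)
print(t_hat.shape)
```

In the approach described in this paper the centers and widths would come from clustering rather than random initialization; random values are used here only to keep the sketch self-contained.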
To evaluate the deviation between the output
of the deep neural network and the constraints of the PDE model, a loss function is defined as
with
where
, and
describe the weight of each loss.
are the numbers of residual points for the initial, boundary, and experimentally measured values of the PDE model, respectively.
represents a set of predefined points to measure the matching degree between the output
of the deep network and the PDE model.
is the parameter vector to be identified in the model.
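For reference, a weighted composite loss of this kind is commonly written in the generic form below. This is a sketch under assumed notation, not a transcription of Equations (10) and (11): the weights ω, the point counts N, the residual operator F, the boundary operator B, and the collocation points x are assumed symbols for the PDE residual (f), boundary (b), initial (0), and measured-data (d) terms.

```latex
\begin{aligned}
\mathcal{L}(\Theta) &= \omega_f \mathcal{L}_f + \omega_b \mathcal{L}_b
                     + \omega_0 \mathcal{L}_0 + \omega_d \mathcal{L}_d, \\
\mathcal{L}_f &= \frac{1}{N_f} \sum_{i=1}^{N_f}
                 \bigl| \mathcal{F}[\hat{T}](x_f^{i}) \bigr|^{2}, \qquad
\mathcal{L}_b = \frac{1}{N_b} \sum_{i=1}^{N_b}
                 \bigl| \mathcal{B}[\hat{T}](x_b^{i}) \bigr|^{2}, \\
\mathcal{L}_0 &= \frac{1}{N_0} \sum_{i=1}^{N_0}
                 \bigl| \hat{T}(x_0^{i}) - T_0 \bigr|^{2}, \qquad
\mathcal{L}_d = \frac{1}{N_d} \sum_{i=1}^{N_d}
                 \bigl| \hat{T}(x_d^{i}) - T_d^{i} \bigr|^{2}.
\end{aligned}
```

Each term is a mean squared mismatch evaluated on its own point set, so the relative size of the weights determines which constraint dominates the gradient during training.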
To achieve multi-loss consistency optimization of the MCO-PINN, assume that the multiple outputs of the network conform to a Gaussian distribution, with the mean value of maximum likelihood estimation being
.
is an approximation of the true value
of the solution to the PDE model. The variance of maximum likelihood estimation is
which is an uncertainty parameter and described as
The maximum likelihood function is used to optimize the uncertain parameter, yielding
Taking into account that the mean of the maximum likelihood estimation
is affected by the boundary and initial values and the residual of the PDE model, Equation (15) can be extended to a multi-output neural network and is expressed as
Therefore, the optimization objective of loss function of MCO-PINN can be written as follows:
where
represents the adaptive weight of each loss term.
with
denotes the total weights of the loss terms. The goal of MCO-PINN training is to find the best model parameters
and appropriate adaptive weights
to minimize the loss
.
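The effect of likelihood-based weighting can be illustrated with the widely used Gaussian uncertainty-weighting form, in which each loss L_i contributes L_i / (2σ_i²) + log σ_i to the objective. The sketch below assumes that form, holds the individual loss values fixed, and updates only the log-variances by gradient descent; the function names and loss values are hypothetical, and the expression may differ in detail from Equation (17).

```python
import numpy as np


def weighted_total_loss(losses, log_var):
    """Sum of L_i / (2 * sigma_i^2) + log(sigma_i), with log_var = log(sigma_i^2)."""
    return float(np.sum(0.5 * np.exp(-log_var) * losses + 0.5 * log_var))


def update_log_var(losses, log_var, lr=0.1):
    """One gradient-descent step on the log-variances."""
    grad = -0.5 * np.exp(-log_var) * losses + 0.5
    return log_var - lr * grad


# Hypothetical values of the residual, boundary, initial, and data losses.
losses = np.array([4.0, 0.25, 1.0, 0.01])
log_var = np.zeros(4)
for _ in range(2000):
    log_var = update_log_var(losses, log_var)

weights = 0.5 * np.exp(-log_var)  # effective weight of each loss term
print(np.exp(log_var))            # fitted variances
print(weights)
```

With fixed losses, the fitted variance of each term converges to the loss value itself, so larger losses receive smaller weights and the weighted terms end up with comparable magnitude. In actual training the losses and the variances change together, which this sketch does not capture.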
2.3. Training Optimization of MCO-PINN
To achieve multi-loss consistency optimization in MCO-PINNs, an assumption must be satisfied; that is, the various network outputs should conform to Gaussian distribution requirements. To comply with this stringent assumption, the radial basis activation function has been chosen as the activation function of the network. This selection ensures that the output of MCO-PINN exhibits a Gaussian distribution, which is achieved through the linear superposition of Gaussian functions.
represents the activation function of the
th hidden node in the
th hidden layer which is described as
The center and width of the activation function serve as structural parameters of MCO-PINN, which can be predetermined through pre-training. Generally, high-precision clustering algorithms can efficiently uncover the intrinsic relationships within training sample data, enabling the determination of MCO-PINN activation function parameters and effectively enhancing the network’s generalization performance. However, traditional distance-based clustering algorithms necessitate multiple iterations, leading to reduced efficiency and heightened sensitivity to initial values. Consequently, the AIV-ECC algorithm, outlined as follows, is proposed.
An AIV-ECC algorithm is introduced to explore the inherent clustering features and correlation of input samples, which is designed to enhance the network’s capacity for accurately representing the fundamental characteristics of the underlying data. By determining a clustering number
, the input sample matrix
is represented as
where
is the rotation matrix.
is a
matrix which represents the samples in the
th cluster, and
is the number of samples in the
th cluster. The correlation square sum cost function of the partition
of the matrix
is expressed as
where
is the average of the samples in the
th cluster. Introducing a vector
, and the Hilbert–Schmidt norm
, Equation (20) is rewritten as
is the projection matrix. We set an orthogonal matrix
of
as
Equation (21) is simplified as
It is evident that the optimal approach for partitioning
into
clusters can be achieved by employing the partition strategy, which entails minimizing the correlation square sum cost function
in Equation (23). Given that
is determined by the sample space, the problem of
is considered as an equivalent optimization problem described as
According to the Ky Fan theorem, when dealing with a real and symmetric matrix
with eigenvalues
and eigenvectors
, Equation (25) holds under the constraint of
.
The optimal matrix is , and is any orthogonal matrix.
To quickly obtain
and
, we can assume that each submatrix
corresponds to an optimal class. Given this assumption, Equation (26) holds.
The eigenvector associated with the largest eigenvalue of
is set as
, and
is constructed as
According to the
theorem, by ignoring the higher order term, Equation (28) can be deduced and expressed as
is an
orthogonal matrix. Expanding Equation (28) as the following
Equation (28) indicates that the optimal categorization approach for input sample matrix
can be determined through the orthogonal linear transformation of matrix
, which consists of the first
maximum eigenvectors of matrix
. The QR decomposition of matrix
can be used to facilitate the orthogonal transformation of matrix
to
.
where
represents a permutation matrix,
is an
orthogonal matrix, and
is a
upper triangular matrix. Based on the calculation of
, the cluster discrimination matrix
is calculated by
The cluster category for each data vector can be determined by identifying the index of the element with the largest absolute value in the corresponding column of matrix
. Subsequently, we can obtain the sample set
for any category
and compute the cluster center vector
using the following equation:
where
represents the number of elements in set
. The width of the cluster
is defined as the dispersion degree of each cluster sample relative to the cluster center vector and is calculated by
The detailed steps of AIV-ECC algorithm are as follows:
Step 1: For the existing samples
, randomly re-sample N times to form the bootstrap samples
and calculate the sum of within-class distances of each bootstrap sample according to the following equation. The resulting N bootstrap statistics are then sorted in ascending order.
where
. Initialize
;
Step 2: Since the N bootstrap statistics are normally distributed, the confidence interval under the confidence degree can be obtained ;
Step 3: The data sample X is clustered by ECC clustering algorithm to obtain the clusters of data sample, the clustering center vector matrix , and the clustering width vector at this time;
Step 4: Given the confidence level , if of the clustered samples under classes is within the confidence interval of , then end the AIV-ECC algorithm; otherwise, .
Step 5: Repeat Step 3 and Step 4.
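The eigenvector and QR-based clustering used in Step 3 can be sketched as follows, under several assumptions: samples are stored as columns of the input matrix, the leading eigenvectors are obtained from a singular value decomposition, the pivoted QR factorization is a small hand-written Gram-Schmidt routine rather than a library call, and the cluster width is taken as the root-mean-square distance to the cluster center. The function names and the synthetic data are hypothetical, and the bootstrap stopping rule of Steps 1, 2, and 4 is not included.

```python
import numpy as np


def pivoted_qr(A):
    """Column-pivoted QR by modified Gram-Schmidt, so that A[:, piv] = Q @ R."""
    k, n = A.shape
    work = np.array(A, dtype=float)
    Q = np.zeros((k, k))
    piv = np.arange(n)
    for j in range(k):
        # Choose the remaining column with the largest residual norm.
        p = j + int(np.argmax(np.linalg.norm(work[:, j:], axis=0)))
        work[:, [j, p]] = work[:, [p, j]]
        piv[[j, p]] = piv[[p, j]]
        Q[:, j] = work[:, j] / np.linalg.norm(work[:, j])
        # Remove the new direction from the columns not yet chosen.
        work[:, j + 1:] -= np.outer(Q[:, j], Q[:, j] @ work[:, j + 1:])
    return Q, Q.T @ A[:, piv], piv


def spectral_qr_labels(X, k):
    """Cluster index per sample; X holds one sample per column."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    Yt = vt[:k, :]                       # top-k eigenvectors of X^T X, as rows
    _, R, piv = pivoted_qr(Yt)
    R_hat_perm = np.linalg.solve(R[:, :k], R)   # equals [I, R11^{-1} R12]
    R_hat = np.empty_like(R_hat_perm)
    R_hat[:, piv] = R_hat_perm           # undo the column permutation
    # Each sample joins the cluster with the largest absolute entry in its column.
    return np.argmax(np.abs(R_hat), axis=0)


def centers_and_widths(X, labels, k):
    """Cluster centers (as columns) and RMS distance of members to their center."""
    centers = np.stack([X[:, labels == j].mean(axis=1) for j in range(k)], axis=1)
    widths = np.array([
        np.sqrt(np.mean(np.sum((X[:, labels == j] - centers[:, [j]]) ** 2, axis=0)))
        for j in range(k)
    ])
    return centers, widths


# Synthetic demonstration: two well-separated groups of 2D samples.
rng = np.random.default_rng(0)
group_a = np.array([[5.0], [0.0]]) + 0.1 * rng.standard_normal((2, 20))
group_b = np.array([[0.0], [5.0]]) + 0.1 * rng.standard_normal((2, 20))
X = np.hstack([group_a, group_b])
labels = spectral_qr_labels(X, 2)
centers, widths = centers_and_widths(X, labels, 2)
print(labels)
```

The resulting centers and widths are the quantities that would parameterize the radial basis activation functions. Unlike iterative distance-based clustering, this procedure needs no initial cluster assignment.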
The pseudo code for MCO-PINN training solution is as follows in Algorithm 1:
Algorithm 1 Train MCO-PINN
1: Data: Historical training sample set ; Training sample sets satisfy the true value, initial value, boundary value, and parameter identification value of the 3D transient workpiece temperature field model ;
2: Initialize network depth: Cluster the sample set with the AIV-ECC algorithm and determine the depth of the network as the clustering number ; Assign the training samples as ;
3: Initialize network size: Cluster with the AIV-ECC algorithm and determine the number of hidden nodes in the th layer as the number of clusters , the clustering center vector matrix and the cluster width vector ; Initialize the radial basis activation functions for all hidden nodes;
4: Initialize training parameters: Randomly initialize network parameters and weights , and ;
5: Input: Input current sample vector ;
6: Construct Gaussian probability model: Construct Gaussian probabilistic models with mean given by the output of PINNs and the adaptive weight collection ;
7: Train: Use K gradient descent iterations to update the parameters and as
   For
   (a) (Equation (17)) based on the maximum likelihood estimation.
   (b) Tune the adaptive weight via Adam optimizer to maximize the probability of meeting constraints.
   (c) Update the network weight via Adam optimizer.
   End for
8: Output: The best model with parameters and the final adaptive weight .
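The idea behind steps 6 and 7(b) can be sketched as follows. Equation (17) is not reproduced in this section, so the sketch assumes a common Gaussian-likelihood form in which each loss term has its own learnable log-variance, and it uses plain gradient descent in place of the Adam optimizer named in Algorithm 1. The loss values are hypothetical and held fixed, whereas in actual training they change with the network parameters. All function and variable names are illustrative.

```python
import math


def weighted_nll(losses, log_vars):
    """Negative log-likelihood (up to a constant) with one Gaussian variance per loss term."""
    return sum(0.5 * math.exp(-s) * l + 0.5 * s for l, s in zip(losses, log_vars))


def tune_log_variances(losses, steps=3000, lr=0.1):
    """Plain gradient descent on the log-variances; the loss terms are held fixed."""
    log_vars = [0.0] * len(losses)
    for _ in range(steps):
        # d/ds [0.5 * exp(-s) * l + 0.5 * s] = 0.5 * (1 - l * exp(-s))
        grads = [0.5 * (1.0 - l * math.exp(-s)) for l, s in zip(losses, log_vars)]
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


# Hypothetical loss magnitudes for the PDE, IC, BC, and data terms.
example_losses = [1e-2, 1e-4, 1e-3, 1.0]
tuned = tune_log_variances(example_losses)
# Effective weight on each loss term under the assumed likelihood.
adaptive_weights = [0.5 * math.exp(-s) for s in tuned]
```

Under this assumed form, each learned variance converges to the size of its loss term, so terms with smaller residuals receive larger weights and no single term dominates the total loss.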
3. Results
Industrial experiments were conducted in a 31 m large-scale vertical quenching furnace. The specification of the experimental aluminum alloy workpiece is 360X35, with the alloy state 6061T6511 and batch number CJ2025; the gradient temperature setpoints are 465 °C, 500 °C, and 530 °C. The physical structure of the workpiece is 0.3 m in radius and 30 m in length. To obtain the surface and interior workpiece temperature, thermocouples were installed at different positions on the surface and inside of the experimental workpiece. The sketch of the installation location of the thermocouples is shown in
Figure 5. To collect more temperature information from the workpiece using a limited number of thermocouples, seven thermocouples from Zone 1 to Zone 7 were installed in the industrial experiment. In the sketch, these zones are abbreviated as Z1–Z7. Thermocouples Z1, Z2, Z3, and Z5 are situated on the workpiece surface, while Z4 and Z6 are positioned at half of the workpiece radius. Z7 is placed at the center of the workpiece. The cross-sectional diagram depicts the distance of each thermocouple from its corresponding location to the center of the workpiece.
In this experiment, 3030 datasets were collected at a sampling interval of 10 s. Temperature data for parameter identification of the 3D temperature field model were obtained by installing thermocouples on the experimental workpiece, and there are 510 sets of measured temperature data. Therefore, the training set for is the 3030 sets of internal thermocouple measurements of the workpiece, and each is represented as . is the initial workpiece temperature and it is a fixed temperature set. The training set for is the 3030 sets of thermocouple measurements of the workpiece surface, and each is represented as . is used to identify the parameters . Therefore, the datasets specially used for model parameter identification in the experiment are 510 sets of thermocouple measurements, and each data vector is represented as .
The simulations were run on a machine with an NVIDIA RTX 4090 GPU running Windows, and the MCO-PINN was implemented in Python. The MCO-PINN contains six hidden layers with 128, 128, 64, 64, 32, and 32 nodes, respectively. A total of 3540 training samples were used, divided into 295 batches with a batch size of 12; training for 30 epochs therefore took 8850 iterations. To ensure that the model generalizes well, we used five-fold cross-validation.
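The sample and iteration counts above can be checked with a few lines of arithmetic. The figures come from the text; reading the 3540 training samples as the 3030 collected sets plus the 510 identification sets is an assumption that happens to match, and the variable names are illustrative.

```python
import math

collected_sets = 3030        # measurement sets collected at a 10 s interval
identification_sets = 510    # sets used for parameter identification
training_samples = 3540      # total training samples reported
batch_size = 12
epochs = 30

# Batches per epoch and total optimizer iterations over all epochs.
batches_per_epoch = math.ceil(training_samples / batch_size)
total_iterations = batches_per_epoch * epochs

# Duration of data collection implied by the sampling interval, in hours.
duration_hours = collected_sets * 10 / 3600
```

This gives 295 batches per epoch, 8850 iterations in total, and about 8.4 h of data collection.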
Figure 6 compares the iteration convergence curves of the error functions for both PINN and MCO-PINN. During model training, for the PINN, we manually assigned a fixed weight vector
, which remained constant throughout the training process. For the MCO-PINN, because the weights are adjusted adaptively, we specified only an initial value for the weight vector as
. As the weights adjusted adaptively, the training loss stabilized once training exceeded 8000 iterations. The fluctuation ranges of the individual weight values were as follows:
,
,
, and
. As observed in
Figure 6a, during the first 1000 iterations for the PINN, the PDE residual loss and initial condition loss increase while boundary loss and data loss decline. As the number of iterations increases, reaching 8000 steps, the PDE loss, IC loss, BC loss, and data loss drop to (10^−2, 10^−4, 10^−3, 10^0), respectively.
Figure 6b illustrates that employing the MCO-PINN method results in consistent declines for all loss functions due to the proposed consistency optimization method. It is particularly notable after 3000 iterations when rapid, consistent convergence is achieved, with PDE loss, IC loss, and BC loss all converging to levels below 10^−5. Data loss rapidly converges initially. As the iteration step increases, it does not display rapid declines, ensuring the structural stability of the PDE. Based on the iteration convergence curves presented in both
Figure 6a,b, the MCO-PINN not only converges faster than the PINN, which speeds up the solution of the network, but also achieves higher accuracy.
To verify the effectiveness of MCO-PINN in soft sensing of 3D temperature field, MCO-PINN, PINN, and FEM are employed to solve the 3D temperature field model of the workpiece, and the workpiece temperature curves are shown in
Figure 7. As the highly uniform distribution of the large-scale temperature field inside the furnace is a critical factor for ensuring the effectiveness of quenching thermal treatment, the temperature distribution of the seven measurement points in the workpiece exhibits strong consistency, with only minor deviations. To provide a more intuitive and clear comparison of the results, the paper divides the seven zones of the workpiece into three representative positions (upper, middle, and lower) for regional comparison. The temperature curves during the rapid heating and transition periods are illustrated in
Figure 7.
Observing the enlarged images during the temperature rising period in
Figure 7a,b, it is evident that the furnace temperature increases rapidly by 15 °C within 50 s, and noticeable temperature gradients emerge across the zones of the workpiece. During this period, abundant hot air circulates within the furnace, causing intense convective heat exchange, which leads to rapid heating. As the air heats up, it rises and accumulates in the upper and middle parts of the furnace. Although the ventilators at the furnace bottom force air circulation, the quenching furnace is 31 m high, with Zone 1 close to the top of the furnace and Zone 7 in the lower part, close to the bottom door; as a result, the temperature declines in a gradient from Zone 1 to Zone 7. During this period, MCO-PINN accurately predicts the workpiece temperature, whereas both the PINN and FEM methods exhibit larger prediction errors. The FEM method in particular struggles to provide effective predictions during non-steady-state thermal processes, resulting in substantial deviations from the measured temperatures. However, as the temperature rises and furnace pressure increases, heated air gradually overflows from the bottom furnace door and the top crane suspension hole, slowing down the temperature rise. As the furnace approaches a near-vacuum state, the process transitions from rapid heating to the temperature holding period. The temperature profiles for each zone during the temperature transition period are also depicted in
Figure 7a,b, where the maximum temperature increase within 100 s is less than 5 °C. Although the prediction errors of PINN and FEM decrease slightly as the rate of furnace temperature change falls, the MCO-PINN method still achieves the highest temperature prediction accuracy. The experimental results demonstrate that the proposed MCO-PINN method maintains high temperature prediction accuracy under complex heat exchange conditions, showing that its performance is robust.
Figure 8 presents the temperature profiles of the aluminum alloy workpiece calculated by different models during the gradient temperature holding period. According to the quenching process, to fully integrate the alloy components into the aluminum matrix during the temperature holding period, gradient temperature setpoints are set as 465 °C, 500 °C, and 530 °C for this specific type of workpiece. As shown in
Figure 8, during this temperature holding period the furnace is primarily characterized by radiant heat exchange, with a relatively uniform heat exchange mode and stable furnace conditions. The maximum temperature difference across the zones within the furnace is maintained within ±3 °C. The MCO-PINN method continues to accurately track the measured workpiece temperatures. Although the FEM method is more accurate here than under non-steady heat transfer, it is still less accurate than the PINN method, and the MCO-PINN method remains the most accurate.
To clearly illustrate the prediction accuracy of different solution methods,
Figure 9 displays the absolute error for each method. It is observed from
Figure 9 that the prediction error of MCO-PINN is maintained within ±1.5 °C, satisfying the quenching process requirement of a ±3 °C uniform temperature distribution inside the workpiece. The prediction error of FEM ranges from −7.9 °C to 7.5 °C, and that of the PINN method from −7.7 °C to 4.9 °C; both fail to meet the ±3 °C requirement. Moreover, large error fluctuations are observed during the rapid heating at 100 s and during the gradient heating at 1700 s and 2400 s within the temperature holding period. This result indicates that the FEM method is highly susceptible to changes in heat exchange conditions, resulting in unstable prediction accuracy. The PINN method also exhibits significant errors under unstable heat exchange conditions. However, the MCO-PINN method demonstrates smaller fluctuations under these conditions. Therefore, the accuracy and stability of the MCO-PINN method are notably superior to those of the PINN and FEM methods.
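The comparison against the ±3 °C requirement can be written as a simple check. The error ranges are those reported in the text; the helper name and dictionary layout are illustrative assumptions.

```python
def within_tolerance(error_range, tolerance):
    """True if the whole (low, high) error range lies inside ±tolerance."""
    low, high = error_range
    return -tolerance <= low and high <= tolerance


# Prediction error ranges in °C, as reported for each method.
reported_error_ranges = {
    "FEM": (-7.9, 7.5),
    "PINN": (-7.7, 4.9),
    "MCO-PINN": (-1.5, 1.5),
}

# Compare each method against the ±3 °C process requirement.
meets_requirement = {
    name: within_tolerance(error_range, 3.0)
    for name, error_range in reported_error_ranges.items()
}
```

Only the MCO-PINN range falls inside the ±3 °C band.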
Figure 10 presents boxplots illustrating the temperature prediction accuracies of MCO-PINN, PINN, and FEM methods. The temperature accuracy is evaluated by the R² index. R² (R-squared) measures the proportion of the variance in the dependent variable that is explained by the independent variables of a statistical model. The line within the box indicates the median of the data, while the box itself encloses 50% of the dataset, reflecting the fluctuation in prediction accuracy for each method. The upper and lower edges of the box represent the maximum and minimum values of the dataset, respectively. As depicted in
Figure 10, the temperature prediction accuracy of MCO-PINN fluctuates between 0.98 and 1 across the seven zones, substantially outperforming the other methods. The MCO-PINN box is relatively short, centered around 0.9, with the smallest difference between the upper and lower edges, indicating that its overall prediction accuracy is relatively stable. The FEM method exhibits the largest difference between its box edges, suggesting greater fluctuations in temperature prediction accuracy. Therefore, compared to the PINN and FEM methods, the MCO-PINN method demonstrates superior accuracy and stability. It maintains high computational accuracy throughout the entire thermal treatment process, accurately predicting even complex states with dramatic heat exchanges. This provides rapid and reliable feedback information for controlling uniform temperature distribution in the quenching furnace, contributing to the improvement of aluminum alloy workpiece quality.
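For reference, the R² index can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses this standard definition; the temperature values are hypothetical and the function name is illustrative.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted temperatures in °C, for illustration only.
measured = [460.0, 462.0, 464.0, 466.0]
predicted = [460.5, 461.5, 464.5, 465.5]
score = r_squared(measured, predicted)
```

A value close to 1 means the predictions follow the measured temperatures closely; a perfect prediction gives exactly 1.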
To further validate the performance of MCO-PINN in solving transient 3D temperature field models in the time dimension,
Figure 11 displays the temperature curves of seven thermocouple measurement points on the workpiece over time, calculated using various methods. The workpiece extends 30 m in length, with the axis denoting position; 0 represents the installation position of Zone 1 thermocouple at the furnace top, and 30 corresponds to the installation position of Zone 7 thermocouple at the furnace bottom. The temperature distributions at different time steps for each thermocouple, calculated using MCO-PINN, PINN, and FEM, are shown in
Figure 11. As illustrated in
Figure 11, the MCO-PINN method exhibits high temperature prediction accuracy throughout the entire thermal treatment process, displaying the best alignment with the temperature values at the seven measurement points. The PINN and FEM methods show noticeable fluctuations in prediction accuracy during the temperature rising period, such as 200 s, 400 s, and 600 s, with the most significant temperature variations observed in Zones 1 and 7. Moreover, during high-temperature stages when the temperature exceeds 450 °C, both MCO-PINN and FEM methods achieve high-precision workpiece temperature predictions due to the stable heat exchange, with improved prediction accuracy compared to that of low-temperature stages. However, when temperatures fall below 450 °C, significant deviations between FEM predictions and actual temperature measurements are observed at the upper parts of the workpiece, such as positions 0 m and 6 m, as well as the lower sections at 21 m and 30 m. These locations, which are close to the top and bottom of the furnace, are more susceptible to external environmental influences. Nevertheless, MCO-PINN maintains high-precision temperature predictions for the workpiece, remaining virtually unaffected by the complexity of heat exchanges within the furnace and external environmental factors.
To comprehensively compare the computational performance of different solution methods,
Table 1 provides statistical indicators of the performance obtained using FEM, PINN, and MCO-PINN. MRE and RRMSE denote the mean relative error and the relative root-mean-squared error, respectively. The 1.5% hit rate and 0.2% hit rate denote the proportion of temperature predictions whose relative error is below 1.5% during the temperature rising period and below 0.2% during the holding period, respectively. The MRE ranges from 0.277 to 0.655 for FEM, from 0.151 to 0.332 for PINN, and from 0.028 to 0.141 for MCO-PINN. Regarding RRMSE, the minimum value is 0.005 for FEM and 0.002 for PINN, whereas MCO-PINN has the lowest minimum value of 0.001 and a maximum of only 0.003. The 1.5% hit rate and 0.2% hit rate for MCO-PINN reach 0.999 and 0.998, respectively, surpassing those of PINN and FEM. The number of iterations also indicates that MCO-PINN converges considerably faster than PINN and FEM. These data suggest that MCO-PINN not only has the highest computational accuracy but also the fastest computation speed. Therefore, MCO-PINN can facilitate real-time, high-precision predictions of large-scale 3D temperature fields.
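The indicators in Table 1 can be sketched as follows. The exact formulas are not given in this section, so the definitions below are assumptions: MRE is the mean of the pointwise relative errors, RRMSE is the root mean square of the pointwise relative errors, and the hit rate is the fraction of predictions whose relative error is below a threshold. The temperature values are hypothetical.

```python
import math


def relative_errors(measured, predicted):
    """Pointwise relative error |predicted - measured| / |measured|."""
    return [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]


def mean_relative_error(measured, predicted):
    errors = relative_errors(measured, predicted)
    return sum(errors) / len(errors)


def relative_rmse(measured, predicted):
    """Root mean square of the relative errors (assumed definition of RRMSE)."""
    errors = relative_errors(measured, predicted)
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def hit_rate(measured, predicted, threshold):
    """Fraction of predictions whose relative error is below the threshold."""
    errors = relative_errors(measured, predicted)
    return sum(1 for e in errors if e < threshold) / len(errors)


# Hypothetical measured and predicted temperatures in °C, for illustration only.
measured_temps = [400.0, 450.0, 500.0, 500.0]
predicted_temps = [404.0, 450.0, 495.0, 500.5]

mre = mean_relative_error(measured_temps, predicted_temps)
rrmse = relative_rmse(measured_temps, predicted_temps)
hit_rate_1_5 = hit_rate(measured_temps, predicted_temps, 0.015)
hit_rate_0_2 = hit_rate(measured_temps, predicted_temps, 0.002)
```

With these sample values, all four predictions fall within the 1.5% band, while only two fall within the 0.2% band.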