1. Introduction
As one of the most widely used construction materials, concrete is essential to the structural integrity and durability of buildings [1]. However, its limited tensile strength and inherent brittleness often lead to crack formation, a significant drawback in engineering applications [
2]. Hence, steel fiber reinforced concrete (SFRC) has emerged as a robust alternative. By incorporating short steel fibers into conventional concrete, SFRC markedly improves tensile strength, flexural toughness, and impact resistance, addressing conventional concrete's inherent brittleness and poor crack resistance, which can compromise structural durability and safety [
3]. The steel fibers, uniformly distributed within the matrix, effectively connect and inhibit crack propagation, thereby significantly enhancing the mechanical properties and toughness of the concrete. Particularly well-suited to structures that are exposed to high-stress situations and dynamic loading conditions, SFRC finds extensive application in bridge engineering, thereby bolstering these critical infrastructures’ structural integrity and resilience [
4].
In civil engineering, scrutinizing and evaluating concrete structure damage poses intricate challenges. Among various diagnostic approaches, acoustic emission (AE) technology has become an increasingly preferred non-destructive testing method, noted for its exceptional sensitivity and real-time assessment capability. This technique identifies micro-cracks and other defects in concrete by recording the acoustic signals emitted under stress and structural failure [
5]. Consequently, this approach establishes a robust foundation for continuous structural health monitoring and accurate damage evaluation, thereby enhancing the reliability of assessments related to structural integrity.
In practical applications, AE signals are often compromised by various noise interferences during transmission, including environmental noise, electromagnetic disturbances, and instrumental noise from measuring equipment [
6]. Such noise markedly reduces the signals’ quality and analytic precision. Consequently, effectively eliminating noise components becomes a pivotal challenge in processing AE signals. Conventional signal processing techniques include wavelet packet analysis [
7], the Hilbert–Huang transform (HHT) [
8], and empirical mode decomposition (EMD) [
9]. For instance, wavelet packet analysis effectively distinguishes between the signal and noise across varied frequency bands through a multi-scale analysis [
10]. Donoso et al. observed that this method successfully extracts signals within specific frequency bands [
11].
Nonetheless, under conditions of complex noise, the performance of wavelet packet analysis can be less than optimal, and it is susceptible to parameter selection and adjustment [
12]. The HHT is an effective methodology for addressing non-linear and non-stationary signals. Huang et al. demonstrated its superiority in managing random and non-linear signals by integrating EMD with Hilbert spectral analysis [
13]. However, the practical deployment of this technology requires significant expertise and experience and is characterized by substantial computational complexity. Furthermore, Hussain et al. confirmed its potential in addressing non-linear and non-stationary signals in their study on applying HHT to diagnose mechanical faults [
14]. The method of empirical mode decomposition makes the signal processing more versatile by adaptively decomposing signals. Wu and Huang noted that while EMD presents certain advantages in noise reduction and feature extraction, its performance remains constrained in high-noise environments [
15]. Meanwhile, the ensemble empirical mode decomposition (EEMD) was subsequently introduced, which enhances the decomposition results by adding white noise. Despite this advance, the methodology still presents computational challenges [
16].
Although these traditional methods exhibit some effectiveness under specific conditions and contexts, they show limited robustness in multi-noise environments and are characterized by a lower level of automation. Consequently, these methodologies struggle to satisfy the stringent requirements of high precision and real-time signal analysis required by complex engineering applications.
In recent years, with the advancement of artificial intelligence and machine learning technologies, deep learning (DL) methodologies have shown significant potential in signal processing [
17]. Deep learning autonomously learns features within signals through multiple layers of non-linear transformations, enabling a more accurate capture and extraction of pertinent information from signals than traditional methods. Furthermore, deep learning approaches offer increased automation, significantly reducing the reliance on manually set parameters and expert knowledge. They can adaptively handle complex and variable noise environments, enhancing the accuracy and robustness of signal processing.
In other signal-processing domains, deep learning methods have achieved considerable success. For instance, in image processing, convolutional neural networks (CNNs) effectively extract and recognize image features through local receptive fields and shared weights. In a seminal work, Krizhevsky et al. demonstrated that their AlexNet model, with its deep stack of convolutional layers, substantially enhanced the precision and efficiency of image recognition [
18]. Furthermore, the introduction of deep residual networks (ResNet) by He et al. extended the depth of convolutional neural networks, significantly enhancing their performance in image recognition tasks [
19].
In the domain of speech processing, recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTMs), have shown robust capabilities in managing temporal data, particularly in speech recognition and text generation. Hannun et al. introduced a significant speech recognition model that incorporated LSTM structures, significantly enhancing both the performance and robustness of speech recognition systems [
20]. Similarly, Graves et al. implemented a recurrent neural network-based connectionist temporal classification (CTC) method in speech recognition, substantially improving the recognition accuracy [
21]. In natural language processing, the transformer model, based on the attention mechanism, employs self-attention to analyze and understand large-scale textual data efficiently. Introduced by Vaswani et al., the transformer model, with its entirely attention-based architecture, has shown an exceptional performance in tasks like machine translation and text generation [
22]. Furthermore, the BERT model (bidirectional encoder representations from transformers), proposed by Devlin et al. (2018), leverages a bidirectional encoder framework to enhance the performance of natural language processing tasks dramatically [
23].
In addition, deep learning has enjoyed numerous successful deployments in signal denoising. By employing deep learning-based autoencoder technologies, which meld multiple non-linear modules, input data are transformed into higher-order, abstract representations. This complex process automatically captures the intricate intrinsic relationships among data features, facilitating robust denoising [
24]. Autoencoders, a classic algorithm within deep learning, excel at isolating and extracting valuable signals from data saturated with noise [
25,
26,
27]. Typically comprising two principal components—the encoder and the decoder—autoencoders function by the encoder learning to represent the features of valuable signals within the input data and mapping these to an abstract feature space. Subsequently, the decoder reconstructs these abstract features and projects them back into the original data space, effectively attenuating the noise [
28]. Within seismic data processing, researchers such as Chen [
29] have developed an unsupervised sparse autoencoder that is specifically tailored for diminishing random noise in stacked seismic datasets. This sparse autoencoder has demonstrated exceptional denoising capabilities, which is particularly evident in its deployment across certain seismic datasets.
Similarly, Saad and Chen [
30] proposed a deep denoising autoencoder model that initially leverages synthetic data for pre-training, followed by unsupervised random noise attenuation in real-field data. Song et al. developed a deep convolutional autoencoder neural network for denoising purposes, which is structured into distinct encoding and decoding frameworks. The encoding framework, composed of convolutional and pooling layers, is adept at extracting features from seismic data while concurrently mitigating noise [
31]. Huang et al. introduced a novel denoising methodology termed A-SK22, which is an encoder–decoder network structure based on a hybrid attention mechanism with symmetric skip connections [
32].
Consequently, this manuscript presents a deep learning-based approach for denoising AE waveform signals in concrete, leveraging the autonomous feature extraction and strong fitting capabilities inherent in deep learning models. This approach seeks to enhance the noise reduction performance for AE signals and, ultimately, to provide more accurate and reliable data support for the health monitoring of concrete structures.
2. Denoising Models
2.1. Traditional Denoising Models
- (1) Wavelet Packet Model (WPM)
Wavelet packet denoising is a sophisticated signal-processing technique that is used to analyze non-stationary signals through multi-level decomposition. This technique comprises three main steps: signal decomposition, thresholding, and reconstruction.
The process begins by decomposing the signal f(t) into wavelet coefficients using the wavelet transform, which adapts the mother wavelet function ψ(t) with scaling (a) and shifting (b) factors:

W(a, b) = (1/√|a|) ∫ f(t) ψ*((t − b)/a) dt
These coefficients are organized hierarchically through wavelet packet decomposition, distributing the signal information across frequency bands. This arrangement allows for a detailed analysis by dividing the signal’s energy in both the time and frequency dimensions. By choosing suitable wavelet basis functions and decomposition levels, the technique distinguishes valuable information from noise, enhancing the signal clarity.
Once the decomposition is complete, thresholding selectively eliminates or reduces the wavelet coefficients to achieve denoising. Smaller coefficients often represent noise, while larger ones contain crucial signal information. Standard techniques include soft and hard thresholding:
Soft Thresholding: This method sets the coefficients below the threshold λ to zero and shrinks those above it toward zero, ŵ = sign(w)·max(|w| − λ, 0), providing smooth noise attenuation.
Hard Thresholding: This technique sets the coefficients below the threshold to zero while leaving the others unchanged, ŵ = w if |w| > λ and 0 otherwise, which is effective for impulsive noise.
Adaptive thresholding methods dynamically adjust the thresholds for optimal noise reduction, enhancing the technique’s effectiveness across varying signal conditions.
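The two thresholding rules can be sketched in a few lines of NumPy (a minimal illustration; in practice the threshold λ would be estimated from the noise level, e.g. via the universal threshold):

```python
import numpy as np

def soft_threshold(coeffs, lam):
    # Shrink coefficients toward zero; anything with magnitude below lam is removed.
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)

def hard_threshold(coeffs, lam):
    # Keep coefficients above lam unchanged; zero out the rest.
    return np.where(np.abs(coeffs) > lam, coeffs, 0.0)

coeffs = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])
soft = soft_threshold(coeffs, 0.5)
hard = hard_threshold(coeffs, 0.5)
```

Soft thresholding biases the retained coefficients toward zero but avoids the discontinuity at ±λ that makes hard thresholding prone to artifacts.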
The final step is reconstructing the signal by applying an inverse transformation to the processed wavelet coefficients:
This reconstruction reassembles the coefficients to approximate the original signal, minimizing new noise or distortion.
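The full decompose–threshold–reconstruct pipeline can be sketched with the PyWavelets package (an illustrative example, assuming `pywt` is available; the 'db3' basis at four levels with a hard threshold mirrors the settings used later in Section 4.2.1, while the universal-threshold noise estimate is an assumption of this sketch):

```python
import numpy as np
import pywt  # PyWavelets

def wp_denoise(x, wavelet="db3", level=4):
    """Wavelet-packet denoising: decompose, hard-threshold, reconstruct."""
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode="symmetric",
                            maxlevel=level)
    # Noise level estimated from the level-1 detail band (robust MAD estimate),
    # converted to the universal threshold sigma * sqrt(2 ln N).
    sigma = np.median(np.abs(wp["d"].data)) / 0.6745
    lam = sigma * np.sqrt(2.0 * np.log(len(x)))
    for node in wp.get_level(level, order="natural"):
        node.data = pywt.threshold(node.data, lam, mode="hard")
    return wp.reconstruct(update=False)[: len(x)]

# Toy example: a 50 Hz tone plus white noise, sampled at 1 kHz.
rng = np.random.default_rng(0)
t = np.arange(1024) / 1000.0
clean = np.sin(2 * np.pi * 50 * t)
noisy = clean + 0.4 * rng.standard_normal(t.size)
denoised = wp_denoise(noisy)
```
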
- (2) Hilbert Transform Model (HTM)
The Hilbert transform is an effective method for noise reduction, extending a real signal across the frequency spectrum to extract phase information and enhance denoising. By applying the Hilbert transform H[·], a real signal x(t) becomes an analytic signal z(t), calculated as:

z(t) = x(t) + jH[x(t)]
Here, j is the imaginary unit, which expands the signal's spectrum and facilitates the extraction of the instantaneous amplitude and frequency; this is particularly beneficial for signals with narrow frequency components. Next, thresholding manipulates these instantaneous characteristics. The instantaneous amplitude A(t) and frequency ω(t) are calculated as:

A(t) = |z(t)|,  ω(t) = dϕ(t)/dt,

where ϕ(t) is the phase of z(t).
Thresholding is applied to adjust these characteristics, reducing noise by setting limits on the amplitude and frequency values:
The processed signal is reconstructed using the inverse Hilbert transform, aiming to retain the original signal information while minimizing new distortions. The primary advantage of this method is preserving the signal’s information while analyzing its instantaneous amplitude and frequency. However, the Hilbert transform struggles with complex signals containing multiple frequency components, limiting its effectiveness in broadband signals.
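The analytic-signal construction and the instantaneous attributes above can be computed with `scipy.signal.hilbert`, which returns z(t) directly (a minimal sketch; the sampling rate and test tone are illustrative):

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000.0                     # sampling rate in Hz (illustrative)
t = np.arange(1000) / fs        # exactly 80 cycles of an 80 Hz tone
x = np.sin(2 * np.pi * 80 * t)

z = hilbert(x)                  # analytic signal z(t) = x(t) + j*H[x](t)
amplitude = np.abs(z)           # instantaneous amplitude A(t)
phase = np.unwrap(np.angle(z))  # instantaneous phase phi(t)
inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency, Hz
```

For this pure tone the envelope is constant and the instantaneous frequency recovers 80 Hz; for broadband signals with several components, these quantities become ambiguous, which is the limitation noted above.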
- (3) CEEMDAN Model
The complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) model is an advanced signal processing method [16]. It enhances the original empirical mode decomposition (EMD) by incorporating white noise, improving the stability and precision of the decomposition and producing consistent results across multiple runs. CEEMDAN is particularly effective in handling non-stationary and non-linear signals through a process involving signal decomposition, thresholding, and reconstruction.
Initially, CEEMDAN decomposes a given signal x(t) by adding a series of white noise sequences w_i(t). For the i-th noise realization, the decomposition is expressed as:

x(t) + ε·w_i(t) = Σ_j IMF_j^i(t) + r_i(t),

where IMF_j^i(t) is the j-th intrinsic mode function of the i-th white noise series, r_i(t) is the residual, and ε scales the added noise. Averaging the results over the I white noise series yields stable IMFs:

IMF_j(t) = (1/I) Σ_i IMF_j^i(t)
These IMFs represent distinct frequency components, partitioning the signal’s energy across the time and frequency domains.
Next, the decomposed IMFs undergo thresholding to eliminate noise. Techniques such as soft and hard thresholding are applied. For each IMF component c(t), soft thresholding is expressed as:

ĉ(t) = sign(c(t))·max(|c(t)| − λ, 0)
Hard thresholding is represented as:

ĉ(t) = c(t) if |c(t)| > λ, and 0 otherwise
The threshold is chosen based on the signal’s characteristics and noise level, often determined through experimentation and tuning. Adaptive thresholding can further optimize noise reduction.
Finally, the processed IMFs and the residual are summed to reconstruct the denoised signal:

x̂(t) = Σ_j ĉ_j(t) + r(t),

where ĉ_j(t) denotes the thresholded j-th IMF and r(t) the residual.
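Assuming the IMFs have already been obtained (e.g. from a CEEMDAN implementation such as the `PyEMD` package), the threshold-and-reconstruct step reduces to a few array operations; here, as in Section 4.2.3, the leading noise-dominated IMFs are simply discarded (the toy decomposition below is hypothetical):

```python
import numpy as np

def reconstruct_from_imfs(imfs, residual, drop=3, lam=None):
    """Rebuild a denoised signal from IMF components.

    imfs: array of shape (n_imfs, n_samples); residual: shape (n_samples,).
    The first `drop` IMFs (high-frequency, noise-dominated) are removed;
    an optional soft threshold lam is applied to the remaining IMFs.
    """
    kept = imfs[drop:]
    if lam is not None:
        kept = np.sign(kept) * np.maximum(np.abs(kept) - lam, 0.0)
    return kept.sum(axis=0) + residual

# Hypothetical decomposition of a toy signal into 5 "IMFs" plus a residual.
rng = np.random.default_rng(1)
n = 256
imfs = np.vstack([
    rng.standard_normal(n) * 0.3,                 # IMF1: noise-like
    rng.standard_normal(n) * 0.2,                 # IMF2: noise-like
    rng.standard_normal(n) * 0.1,                 # IMF3: noise-like
    np.sin(2 * np.pi * 5 * np.arange(n) / n),     # IMF4: signal content
    np.sin(2 * np.pi * 2 * np.arange(n) / n),     # IMF5: signal content
])
residual = np.linspace(0.0, 0.1, n)
denoised = reconstruct_from_imfs(imfs, residual, drop=3)
```
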
2.2. Unsupervised Skip Autoencoder Model (USAM)
In this paper, after thoroughly analyzing and comparing various denoising techniques, we introduce an unsupervised skip autoencoder network model (USAM) designed for signal denoising. The USAM belongs to a class of neural network architectures that do not rely on labeled training data; instead, they employ self-learning mechanisms to compress and reconstruct data by capturing the high-dimensional features of the input. This paradigm is particularly adept at processing signals affected by substantial noise. Employing internal encoder and decoder mechanisms, the model learns to reconstruct a clear signal from noisy inputs, while its multi-layered design captures the subtle nuances present in complex signals.
During the training phase, we employ deep image prior (DIP) and the U-Net structure [
33] to enhance the spatial correlation of proper signals within the network. In the encoding and decoding processes, skip connection blocks are integrated based on the characteristics of the AE data to prevent the leakage of valuable signals and thus maintain an optimal performance in preserving them. Additionally, these skip connections enhance further feature extraction. The feature extraction blocks, composed of fully connected layers, normalization layers, activation function layers, and dropout layers, are principally engaged in extracting data characteristics, compressing noisy AE signal data, and reconstructing denoised AE data.
The encoding process in the fully connected layer can be described as follows:

a_ei = W_ei P_i + b_ei,

where W_ei and b_ei, respectively, represent the weight matrix and bias of the i-th layer's fully connected layer during the encoding process, a_ei denotes the output of the i-th fully connected layer, and P_i signifies the input data.
Following the fully connected layer, a normalization layer is placed as the second layer in each feature extraction block to prevent the internal covariate shift. Throughout the iterative processes within the network, updating the parameters in lower layers can alter the distribution of the data in higher layers. Such disparities between the input and output data distributions can lead to cumulative errors, reducing the network’s fitting accuracy. By incorporating a normalization layer, training speeds can be enhanced, and the phenomenon of gradient vanishing can be mitigated.
The output from the normalization layer is then conveyed to a ReLU non-linear activation function layer, which can be expressed as follows:

f(x) = max(0, x)
Following the activation layer, a dropout layer is introduced. The dropout layer is a widely utilized regularization technique aimed at preventing or mitigating the issue of model overfitting by randomly turning off a subset of neurons during the training phase. This method helps to enhance the generalization of the model to new data.
After each encoding block abstracts features from the AE data, multiple decoding blocks are employed to reconstruct the AE data. The architecture of the decoder mirrors that of the encoder, and the final output of each decoder block can be expressed as follows:

a_di = W_di Q_i + b_di,

where W_di and b_di denote the weight matrix and bias for the i-th layer in the decoding process, and Q_i represents the input data for the i-th decoder block.
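A single feature-extraction block (fully connected layer → normalization → ReLU, with dropout active only during training) can be sketched at inference time in NumPy; the layer sizes and the layer-style normalization are illustrative assumptions, not the exact implementation:

```python
import numpy as np

def encoder_block(p, W, b, eps=1e-5):
    """One feature-extraction block at inference time.

    p: input vector; W, b: weights and bias of the fully connected layer.
    Dropout is omitted here because it is disabled at inference.
    """
    a = W @ p + b                                 # fully connected: a = W p + b
    a = (a - a.mean()) / np.sqrt(a.var() + eps)   # normalization layer
    return np.maximum(a, 0.0)                     # ReLU activation

rng = np.random.default_rng(0)
p = rng.standard_normal(64)              # e.g. a 64-sample input window
W = rng.standard_normal((32, 64)) * 0.1  # 64 -> 32, matching the first encoder stage
b = np.zeros(32)
h = encoder_block(p, W, b)
```

Stacking such blocks with decreasing widths gives the encoder; the decoder mirrors them with increasing widths, and skip connections copy each encoder block's output across to the decoder block of matching width.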
Subsequently, the feature extraction blocks are coupled with skip connections to form a symmetric end-to-end network model, which concludes with a fully connected layer and a linear activation function layer, outputting a reconstructed waveform signal that matches the dimensional attributes of the original AE waveform signal.
The network architecture is depicted in
Figure 1. The model, principally comprising encoder, decoder, and skip connection (SK) blocks, is designed symmetrically. Each encoder and decoder block incorporates eight hidden layers, which include two fully connected layers, two normalization layers, two non-linear activation function layers, and two dropout layers. During the encoding phase, the model employs five encoder blocks to extract waveform features from the noise data, with the layer neurons in the fully connected layers decrementing sequentially from 64 to 32 to 16 to 8, and finally stabilizing at 8. In the decoding phase, four decoder blocks are utilized to reconstruct the denoised data, with the neurons in the fully connected layers of the decoders increasing in reverse order from 8 to 16 to 32 to 64. Additionally, the number of neurons in each skip connection block corresponds to the neuron count in the encoder blocks at the same level.
To preserve optimal training outcomes and prevent overfitting, an optimization strategy employing an early stopping mechanism is adopted. Training ceases automatically, and the best model parameters are saved when the loss on the validation set does not improve for ten consecutive epochs. The optimizer used is the Adam optimizer, enhancing the training efficiency and effectiveness of the model.
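The early-stopping rule described above can be sketched as a simple patience loop (a schematic, framework-agnostic version; the per-epoch training and checkpointing calls of the actual model are abstracted into a list of validation losses):

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the epoch whose checkpoint would be kept: the best validation
    loss seen before `patience` consecutive epochs pass with no improvement."""
    best_epoch, best_loss, wait = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:          # improvement: save this checkpoint
            best_epoch, best_loss, wait = epoch, loss, 0
        else:
            wait += 1
            if wait >= patience:      # ten stagnant epochs: stop training
                break
    return best_epoch

# Validation loss improves until epoch 4, then plateaus; training halts
# after ten stagnant epochs and the epoch-4 checkpoint is restored.
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 20
best = early_stopping_epoch(losses, patience=10)
```
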
4. Analysis of Denoising Outcomes Based on Simulated Signals
4.1. Simulated Signals
The noise characteristics of the simulated signals differ from those of the actual AE signals. Noise in the real AE signals is more diverse and closely related to factors such as material properties and environmental conditions. However, broadband noise testing in simulated signals can provide an initial assessment of the adaptability of denoising algorithms in various noise environments.
The spectral characteristics of both the pure and noisy simulated AE signals were analyzed using FFT to compare their frequency components.
Figure 2 presents the time-domain waveforms and spectral diagrams of the simulated AE signal and the noisy AE signal. As shown in
Figure 2c, both the noise-free and noise-containing AE signals exhibit a “double-peak” phenomenon in their spectral graphs. However, the principal frequency of the pure AE signal is significantly lower than that of the noise-inclusive signal, with notable amplitude “oscillations” occurring at higher frequency ranges in the noisy signal. Additionally, the waveforms of the noisy AE signals are considerably more chaotic compared with their pure counterparts, demonstrating the adverse impact of noise on extracting critical parameters, such as ring-down counts, rise time, and the energy from AE signals. This interference complicates the accurate analysis of AE signals, making it more difficult to assess the expansion and characteristics of microcracks in concrete, and posing significant challenges for subsequent evaluations.
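The spectral comparison can be reproduced with NumPy's real FFT (a minimal sketch; the sampling rate, tone frequency, and noise level are illustrative stand-ins for the simulated AE signal):

```python
import numpy as np

fs = 1000.0                               # sampling rate, Hz (assumed)
t = np.arange(1000) / fs
pure = np.sin(2 * np.pi * 50 * t)         # simulated tone at 50 Hz
rng = np.random.default_rng(0)
noisy = pure + 0.5 * rng.standard_normal(t.size)

freqs = np.fft.rfftfreq(t.size, d=1 / fs)       # frequency axis, Hz
amp_pure = np.abs(np.fft.rfft(pure)) / t.size   # amplitude spectra
amp_noisy = np.abs(np.fft.rfft(noisy)) / t.size

peak = freqs[np.argmax(amp_pure)]         # dominant frequency of the pure signal
```

Plotting `amp_pure` and `amp_noisy` against `freqs` reproduces the qualitative picture in Figure 2: the noisy spectrum shares the pure signal's main peak but carries a raised, oscillating floor at higher frequencies.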
4.2. Analysis of Denoising
4.2.1. Analysis of Wavelet Packet Denoising
Through the empirical selection of wavelet packet base functions for the AE signals, the Daubechies3 wavelet function (‘db3’) was chosen as the primary wavelet packet function due to its compact support, orthogonality, regularity, and near symmetry. Using the ‘db3’ wavelet packet, the AE signals are decomposed into four levels. Noise reduction is then achieved through an adaptive method that optimally decomposes the wavelet packet levels and applies an adaptive hard thresholding function, effectively filtering out noise while preserving the essential signal characteristics.
Figure 3a presents a comparison between the pure signal and its denoised counterpart, while
Figure 3b contrasts the noisy signal with the denoised version.
Figure 3c shows the spectral graphs of the pure signal after applying FFT, and
Figure 3d compares the spectral graphs of the pure and denoised signals after FFT. The figures reveal that the wavelet packet decomposition method has not effectively eliminated noise. As shown in
Figure 3a, the difference between the original pure signal and the denoised signal is evident. Ideally, the denoised signal should closely match the pure signal, preserving its essential characteristics and details. However, the results indicate that the denoised signal still contains significant noise components, as shown by the pronounced fluctuations in the waveform. This suggests that the wavelet packet decomposition method was ineffective in fully removing noise, and the denoised signal does not fully restore the purity and clarity of the original signal.
Additionally, as shown in
Figure 3c,d, the amplitude of the denoised signal at specific frequencies has not significantly decreased; in some instances, the peaks are comparable to or even exceed those of the original signal. This indicates that the denoising process was ineffective in sufficiently removing noise and may have inadvertently weakened valuable signal components or introduced new interferences. These findings highlight the limitations of the wavelet packet decomposition method in this case, as it struggles to restore the pure signal’s structure and details effectively.
As shown in
Figure 3b, the comparison between the noisy signal and its denoised counterpart further emphasizes the persistence of high-frequency noise and clutter, which hampers accurate signal recognition and analysis. Ideally, wavelet packet denoising should significantly reduce noise interference, resulting in a more uniform and smoother waveform. However, the visualization reveals that the denoised signal still contains substantial noise, falling short of the desired clarity. This confirms the suboptimal performance of the current denoising approach, demonstrating that it does not sufficiently enhance the signal quality.
4.2.2. Analysis of Hilbert Transform Denoising
The subsequent diagram illustrates the impact of the Hilbert transform method (HTM) denoising treatment, featuring two key comparative visualizations. As shown in
Figure 4a, the comparison between the pure signal and the denoised signal is presented, while
Figure 4b contrasts the noisy signal with its denoised counterpart. After applying the Hilbert transform for denoising, it is clear that low-frequency noise components have been effectively reduced, leading to significant noise reduction in specific frequency bands and a notable enhancement in the signal’s overall smoothness. However, it is important to note that the Hilbert transform did not completely eliminate the high-frequency noise components. This residual high-frequency noise introduces minor fluctuations and distortions in the finer details of the signal. While the improvements in noise reduction are evident, the remaining noise still slightly impacts the signal’s overall integrity.
While the HTM offers clear advantages in noise reduction, particularly in eliminating low-frequency noise, its limitations in filtering high-frequency noise leave residual noise components in the high-frequency regions. This residual noise can negatively impact the overall analysis and application of the signal, reducing the effectiveness of the denoising process.
The expected smoothness and continuity are absent in the spectral graphs, as shown in
Figure 4d. Instead, they reveal discontinuities and abnormal protrusions, indicating that the Hilbert transform did not sufficiently separate the signal from the noise. This inadequacy results in unnatural amplitude variations at specific frequency points, underscoring a critical limitation of the Hilbert transform as a denoising tool, particularly in applications where precise boundary recognition and a comprehensive smoothing of the frequency spectrum are crucial.
4.2.3. Analysis of CEEMDAN Denoising
As shown in
Figure 5, after undergoing CEEMDAN denoising, a simulated AE signal is decomposed into several intrinsic mode functions (IMFs), consisting of nine significant IMF components. A detailed analysis of these components reveals that each IMF exhibits distinct frequency characteristics and temporal variations, highlighting the complexity of the original signal. The characteristics of each IMF can be observed through both time-domain representations and spectral graphs, providing deeper insights into the signal’s intricate structure.
According to the graphical results, the first three IMFs are dominated by high-frequency noise. This noise is marked by irregular frequency patterns and large amplitude fluctuations, often caused by external environmental factors or equipment vibrations. These initial IMFs contribute significantly to noise interference, complicating the signal processing and analysis that follows. Therefore, they are identified as the primary sources of noise in the denoising process.
To remove these disturbances while retaining the essential signal content, the CEEMDAN method filters out the first three IMFs. In contrast, IMFs 4 through 9 exhibit lower frequencies and more stable fluctuations, suggesting that they capture the meaningful information from the original signal, rather than noise. The examination of these lower-frequency IMFs shows that they accurately represent the true physical phenomena and dynamics of the signal. As a result, IMFs 4 to 9 are selected as the foundation for signal reconstruction.
By combining IMFs 4 through 9, a denoised AE signal is reconstituted. This method preserves essential information from the original signal while greatly reducing the noise interference, providing a clearer and more accurate basis for further analysis and processing. This demonstrates the effectiveness of the CEEMDAN approach in improving the clarity and reliability of both signal decomposition and reconstruction.
Figure 6a provides a visual comparison between the original signal and the signal processed using CEEMDAN denoising, while
Figure 6b contrasts the noisy signal with its denoised version. The figures demonstrate that, although the CEEMDAN method effectively reduces most of the high-frequency noise, it shows limitations in addressing low-frequency noise. In
Figure 6a, the method successfully eliminates much of the high-frequency interference, resulting in a smoother and clearer signal. However, some low-frequency fluctuations persist, leaving the waveform less refined than desired, preventing the signal from reaching an ideal level of purity.
In
Figure 6b, the CEEMDAN denoising method effectively suppresses high-frequency noise, significantly improving the waveform in those regions. However, some fluctuations remain in the low-frequency areas, indicating that the noise removal is not entirely complete. This highlights the method’s strength in reducing high-frequency noise but also points to the need for further refinement in addressing low-frequency noise.
In the spectral graphs of
Figure 6c,d, the post-denoising reduction in amplitude across certain frequency areas indicates that CEEMDAN has effectively removed some of the noise components. However, the insufficient reduction at specific frequency points suggests that not all the noise has been fully eliminated. Additionally, discontinuities or abnormal peaks remain, reflecting possible over-suppression that could distort the signal or cause the loss of important information. This highlights a key area for improvement in the CEEMDAN method, specifically in balancing noise reduction with signal integrity across the full frequency spectrum.
4.2.4. Analysis of USAM
With the network trained using the mean squared error (MSE) loss function,
Figure 7 demonstrates the effectiveness of noise reduction achieved through the USAM method, with two key comparative diagrams.
Figure 7a compares the pure signal with the denoised signal after processing, and
Figure 7b contrasts the noisy signal with its denoised version. A detailed analysis of these diagrams clearly highlights the USAM’s excellent performance in effectively eliminating noise from the signals.
In
Figure 7a, the pure signal, representing an ideal state free from noise interference, is shown alongside the denoised signal. The pure signal’s waveform is smooth with well-defined details. After denoising through the USAM method, the restored signal closely resembles this purity. The waveform of the denoised signal appears smoother, with the noise disturbances effectively reduced. Notably, the process maintains the signal’s key features and details, indicating that the USAM method has strong noise reduction capabilities while preserving the essential information within the signal.
Figure 7b contrasts a noisy signal, filled with high- and low-frequency noise components that complicate the waveform and obscure valuable information, with the signal after undergoing denoising treatment. Through the application of USAM, most noise components in the noisy signal are effectively identified and filtered out. The denoised waveform appears significantly clearer and smoother, with reduced interference, greatly improving the signal’s readability and accuracy.
In the spectral graphs of
Figure 7c,d, the amplitude at several frequency points in the denoised signal is significantly lower compared with the pure signal, suggesting that the denoising algorithm successfully identified and reduced the noise components. While the amplitude near the signal’s primary frequency components is also reduced, the useful information remains intact, demonstrating that the noise reduction process effectively removes noise without compromising the core content of the signal. Additionally, the denoised signal appears smoother, with no abnormal spikes or discontinuities, further indicating that the algorithm preserves the signal’s continuity and integrity during processing.
After the denoising process, a significant reduction in the signal’s amplitude is observed, with the denoised signal reaching 73.64% of the amplitude of the pure signal. This reduction highlights the considerable effect that noise removal can have on the amplitude characteristics of the signal, reflecting the efficiency of the denoising process in filtering out noise while still preserving the essential signal features.
During the denoising process, the conventional mean squared error (MSE) loss function may be inadequate due to the presence of outliers in the real-world signal data. MSE tends to disproportionately amplify the effect of outliers by squaring large errors, allowing them to exert an outsized influence on the overall training process. This can undermine the robustness of the model, as these outliers can skew the optimization process and negatively affect the model’s final performance, leading to less reliable denoising results.
To overcome this limitation, we propose using the Welsch loss function, a robust alternative designed to improve the model’s handling of outliers. The Welsch loss, widely employed in statistical applications to manage anomalies, offers superior suppression of outliers compared with the MSE loss function. This feature minimizes the influence of outliers on the training process, thereby enhancing the model’s resistance to noise and improving the overall performance in denoising tasks.
The Welsch loss function measures the difference between the predicted and true values using a non-linear function, with a hyperparameter that controls the sensitivity of the loss function to errors. The formula for the Welsch loss function is as follows:

$$L_{\mathrm{Welsch}} = \frac{1}{N}\sum_{i=1}^{N} \frac{c^{2}}{2}\left[1 - \exp\!\left(-\left(\frac{y_{\mathrm{true},i} - y_{\mathrm{pred},i}}{c}\right)^{2}\right)\right]$$

where $N$ represents the number of samples, $y_{\mathrm{true},i}$ and $y_{\mathrm{pred},i}$ correspond to the true value and the predicted value of the $i$-th sample, and the hyperparameter $c$, which controls the sensitivity of the loss to errors, is set to 1.3.
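The bounded behaviour of the Welsch loss, compared with MSE, can be sketched as follows (assuming the standard Welsch form with the $c^{2}/2$ prefactor and the hyperparameter value of 1.3 given above):

```python
import numpy as np

C = 1.3  # sensitivity hyperparameter, as set in the text

def welsch_loss(y_true: np.ndarray, y_pred: np.ndarray, c: float = C) -> float:
    """Welsch loss averaged over N samples.

    Each residual contributes at most c**2 / 2, so a single large
    outlier cannot dominate the total loss.
    """
    r = y_true - y_pred
    return float(np.mean((c**2 / 2.0) * (1.0 - np.exp(-((r / c) ** 2)))))

def mse_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error, for comparison: outliers enter squared."""
    return float(np.mean((y_true - y_pred) ** 2))
```

For residuals such as [0.1, -0.1, 0.1, 10.0], the MSE is dominated by the single outlier, whereas each term of the Welsch loss is capped at $c^{2}/2 \approx 0.845$, which is the outlier-suppression property the text relies on.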
As illustrated in
Figure 8, the frequency amplitude graph highlights the results of the denoising process using the Welsch loss function during training. Compared with earlier methods, the amplitude of the denoised signal has increased to approximately 79.43% of the original signal’s amplitude. In the frequency domain, the denoised signal (shown in orange) retains the key amplitude characteristics in the low-frequency regions while effectively suppressing noise in the high-frequency segments. This improvement significantly enhances the signal-to-noise ratio, resulting in a clearer and more accurate signal representation.
This result highlights the Welsch loss function’s robust handling of outliers during the signal denoising process, as well as its effectiveness in suppressing high-frequency noise. Its ability to preserve essential signal characteristics while reducing unwanted noise demonstrates the Welsch loss function’s exceptional utility in advanced signal processing applications. The improved clarity and integrity of the denoised signal emphasize the Welsch loss function’s superior performance in achieving a higher fidelity, making it a powerful tool for enhancing the quality of signal representation.
4.3. Analysis of the SNR Value
Before applying the noise reduction methods, the original signal had a signal-to-noise ratio (SNR) of 0.9873502 dB, indicating significant noise content and reduced signal clarity. The wavelet packet denoising model was initially employed, leading to a slight increase in the SNR to 1.22879 dB. Although this improvement suggests some level of noise reduction, the modest SNR increase highlights the wavelet packet technique’s limited ability to substantially enhance the overall signal quality.
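The SNR values quoted in this subsection follow the usual decibel definition. A minimal sketch (the authors' exact noise-power estimate is not stated, so treating the denoising residual as noise is our assumption):

```python
import numpy as np

def snr_db(clean: np.ndarray, processed: np.ndarray) -> float:
    """SNR in dB: clean-signal power over residual-noise power.

    The residual processed - clean is taken as the noise estimate,
    a common convention when a reference clean signal is available.
    """
    noise = processed - clean
    return float(10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2)))
```

Under this definition, halving the residual noise power raises the SNR by about 3 dB, which gives a sense of scale for the gains reported below.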
Subsequently, we assessed the Hilbert transform noise reduction model. After processing, the SNR achieved a significant gain, reaching 8.33811 dB, a marked improvement from the baseline. Compared with the wavelet packet denoising method, the Hilbert transform exhibited a superior performance in filtering noise and preserving signal fidelity, leading to a substantial enhancement in the overall signal quality.
Further analysis includes the CEEMDAN denoising model, which raised the SNR to 10.26474 dB. This improvement underscores CEEMDAN’s advanced capability in noise reduction, significantly enhancing the signal quality. Compared with the Hilbert transform, CEEMDAN achieved a more substantial SNR gain, demonstrating its superiority in managing complex noise environments and providing a more effective solution for improving signal clarity.
Lastly, the analysis of the USAM denoising model revealed an SNR of 10.743391 dB. Although this improvement over CEEMDAN appears minor, SNR is only one of many metrics used to assess noise reduction performance. To provide a more comprehensive evaluation, we also analyzed the waveforms and frequency spectra after FFT transformation, as shown in the figures. Specifically,
Figure 6a displays the waveform before and after CEEMDAN noise reduction, while
Figure 6d shows the corresponding frequency spectrum. In contrast,
Figure 8a presents the waveform after applying the USAM noise reduction model, and
Figure 8b shows the corresponding frequency spectrum after FFT transformation.
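The spectral comparisons in these figures rest on the one-sided FFT amplitude spectrum of each signal. A minimal sketch of that computation (the sampling rate `fs` is a placeholder; the study's actual rate is not given here):

```python
import numpy as np

def amplitude_spectrum(signal: np.ndarray, fs: float):
    """One-sided FFT amplitude spectrum of a real-valued signal.

    Returns (frequencies in Hz, amplitudes normalised by signal length),
    the representation typically plotted to compare a signal before and
    after denoising.
    """
    n = len(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = np.abs(np.fft.rfft(signal)) / n
    return freqs, amps
```

Plotting the two spectra on shared axes makes the high-frequency noise-band suppression and the preserved low-frequency peaks directly visible, as described for Figures 6d and 8b.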
From these figures, it is evident that, although the USAM model produced only a moderate SNR improvement, its performance in both the time and frequency domains is superior. The waveform in
Figure 8a is smoother, with significantly less residual noise, and the frequency spectrum in
Figure 8b shows effective noise band reduction while retaining essential signal components. This indicates that the USAM model excels at distinguishing signals from noise, resulting in enhanced signal quality and interpretability.
Therefore, while the SNR gain may seem modest, the improvements in both the waveform and spectrum achieved by the USAM model are more significant and intuitive. This demonstrates the great potential of the USAM approach in noise suppression, as it is capable of finely analyzing and eliminating noise components, leading to a substantial improvement in the overall signal quality.
In summary, this comparative analysis highlights the performance of various denoising methods in improving the SNR from an initial 0.9873502 dB to varying degrees. The wavelet packet approach had a limited effect, the Hilbert transform performed well, CEEMDAN proved notably effective, and USAM demonstrated superior noise reduction capabilities. Thus, selecting the appropriate noise reduction model is essential for enhancing the signal quality, especially in complex noise environments, where deep learning technology, such as USAM, shows the greatest potential and promise.