Open Access Article

The Intelligent Diagnosis of a Hydraulic Plunger Pump Based on the MIGLCC-DLSTM Method Using Sound Signals

Liqiang Ma ^1,2, Anqi Jiang ^3,* and Wanlu Jiang ^1,2

^1 Hebei Provincial Key Laboratory of Heavy Machinery Fluid Power Transmission and Control, Yanshan University, Qinhuangdao 066004, China

^2 Key Laboratory of Advanced Forging & Stamping Technology and Science Ministry of Education of China, Yanshan University, Qinhuangdao 066004, China

^3 School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China

^* Author to whom correspondence should be addressed.

Machines 2024, 12(12), 869; https://doi.org/10.3390/machines12120869 (registering DOI)

Submission received: 13 October 2024 / Revised: 23 November 2024 / Accepted: 27 November 2024 / Published: 29 November 2024

(This article belongs to the Section Machines Testing and Maintenance)


Figure 1. Cepstral feature extraction flowchart.
Figure 2. Distribution of Mel filter bank.
Figure 3. Distribution of inverse Mel filter bank.
Figure 4. Distribution of Gammatone filter bank.
Figure 5. LSTM network architecture.
Figure 6. DLSTM network schematic.
Figure 7. Flow chart of intelligent diagnosis method of hydraulic plunger pump based on sound signals.
Figure 8. Hydraulic plunger pump fault simulation test bench.
Figure 9. Hydraulic plunger pump experimental setup diagram. 1: oil tank; 2, 24: filter; 3: vane pump; 4, 25: gate valve; 5, 13: flow meter; 6, 15: pressure gauge switch; 7, 16: pressure gauge; 8, 18: relief valve; 9: hydraulic plunger pump; 10: accelerometer; 11: sound level meter; 12: check valve; 14: pressure sensor; 17, 22: accumulator; 19: solenoid valve; 20: electro-hydraulic servo valve; 21: hydraulic cylinder; 23: check throttle valve.
Figure 10. Physical images of faulty components in hydraulic plunger pump: (a) swash plate wear; (b) slipper wear; and (c) loose slipper.
Figure 11. Time-domain waveform and power spectrum of hydraulic plunger pump sound signals: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.
Figure 12. Four types of cepstral features in different states: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.
Figure 13. Average classification accuracy of ten trials.
Figure 14. Confusion matrices for different features: (a) MFCC; (b) IMFCC; (c) MICC; (d) MIGCC; (e) MILCC; and (f) MIGLCC.
Figure 15. Performance comparison of different diagnostic methods.
Figure 16. Principles or network structures of various methods: (a) SVM; (b) 1D-CNN; and (c) RNN.
Figure 17. Performance comparison of LSTM networks with different layer numbers.
Figure 18. t-SNE feature visualization: (a) original data; (b) MIGLCC features; (c) LSTM1 layer; (d) LSTM2 layer; (e) FC layer.
Figure 19. CWRU bearing fault test bench.
Figure 20. Time-domain waveform and power spectrum of CWRU bearing vibration signals: (a) normal; (b) inner race fault; (c) outer race fault; and (d) rolling element fault.
Figure 21. Confusion matrix of CWRU bearing data diagnosis results.
Figure 22. t-SNE feature visualization before and after CWRU bearing diagnosis: (a) original data; (b) MIGLCC-DLSTM classifies data.
Figure 23. Servo motor fault test bench.
Figure 24. Servo motor test system schematic.
Figure 25. Time-domain waveform and power spectrum of servo motor pressure signals: (a) normal; (b) servo valve internal leakage; (c) spring breakage; (d) quick-closing solenoid valve throttling orifice blockage; (e) internal oil leakage; and (f) external oil leakage.
Figure 26. Confusion matrix of servo motor data diagnosis results.
Figure 27. t-SNE feature visualization before and after servo motor diagnosis: (a) original data; (b) MIGLCC-DLSTM classifies data.


Abstract

To fully exploit the rich state and fault information embedded in the acoustic signals of a hydraulic plunger pump, this paper proposes an intelligent diagnostic method based on sound signal analysis. First, acoustic signals were collected under normal and various fault conditions. Then, four distinct acoustic features—Mel Frequency Cepstral Coefficients (MFCCs), Inverse Mel Frequency Cepstral Coefficients (IMFCCs), Gammatone Frequency Cepstral Coefficients (GFCCs), and Linear Prediction Cepstral Coefficients (LPCCs)—were extracted and integrated into a novel hybrid cepstral feature called MIGLCCs. This fusion enhances the model’s ability to distinguish both high- and low-frequency characteristics, resist noise interference, and capture resonance peaks, achieving a complementary advantage. Finally, the MIGLCC feature set was input into a double layer long short-term memory (DLSTM) network to enable intelligent recognition of the hydraulic plunger pump’s operational states. The results indicate that the MIGLCC-DLSTM method achieved a diagnostic accuracy of 99.41% under test conditions. Validation on the CWRU bearing dataset and operational data from a high-pressure servo motor in a turbine system yielded overall recognition accuracies of 99.64% and 98.07%, respectively, demonstrating the robustness and broad application potential of the MIGLCC-DLSTM method.

Keywords: hydraulic plunger pump; sound signals; hybrid cepstral features; double layer long short-term memory network; intelligent diagnosis

1. Introduction

As a pivotal component of hydraulic systems, hydraulic plunger pumps are extensively utilized in fields such as aerospace, mechanical manufacturing, and construction machinery [1,2,3]. However, due to the harsh working conditions they endure, such as high pressure, high speed, and heavy loads, the likelihood of faults significantly increases [4,5]. When faults occur, they can lead to equipment damage and increased maintenance costs at best and jeopardize the safety of personnel at worst [6,7,8]. Therefore, ensuring the normal operation of hydraulic plunger pumps and enhancing equipment reliability necessitates timely and accurate fault diagnosis, which is of paramount importance.

Fault diagnosis for hydraulic plunger pumps has attracted extensive attention and in-depth research from scholars worldwide, with most methods focusing on vibration signal analysis [9,10,11,12,13]. However, acquiring vibration signals requires contact measurement, and sensor placement is often restricted, making it inconvenient in certain situations. In contrast, sound signals are more convenient to acquire and can be measured without contact. When a hydraulic plunger pump malfunctions, its sound pressure level inevitably changes [14]. By analyzing and processing sound signals, sensitive features related to specific faults can be effectively extracted, enabling fault diagnosis of the hydraulic plunger pump. Zhu et al. proposed a fault detection method for hydraulic plunger pump sound signals based on particle swarm optimization-enhanced convolutional neural networks (PSO-CNN) [15]. Ugli et al. introduced a diagnostic method based on the automatic optimization of a one-dimensional convolutional neural network (1D-CNN) structure using a genetic algorithm, achieving high recognition accuracy for axial hydraulic plunger pump sound signals [16]. Zhang et al. developed a fault diagnosis method for hydraulic pumps based on the transfer of a ResNet-50 model using average spectrogram histograms of voiceprints [17]. Tang et al. proposed an adaptive CNN-based fault diagnosis method for hydraulic plunger pumps using acoustic images, which demonstrated high accuracy and robustness [18].

As a common feature in audio signal analysis, cepstral coefficients [19,20] effectively capture spectral information, making them suitable for classification and recognition tasks. Among them, Mel Frequency Cepstral Coefficients (MFCCs) [21,22], one of the most widely utilized acoustic features, excel at extracting low-frequency information and are thus highly effective in capturing spectral details of sound signals. Sun et al. proposed a fault diagnosis method based on MFCCs and a transfer-learning CNN, which accurately identified the operational states of a water supply pump [23]. However, the limited resolution of MFCCs in high-frequency regions often results in the loss of critical details of high-frequency fault patterns, thereby impacting diagnostic accuracy. The Inverse Mel Frequency Cepstral Coefficients (IMFCCs) improve high-frequency feature extraction, enhancing the recognition of high-frequency fault modes. Zhang et al. introduced an equipment sound recognition method that integrates MFCC and IMFCC features, achieving high average recognition rates and accuracy, though its susceptibility to environmental noise weakens its robustness [24]. Gammatone Frequency Cepstral Coefficients (GFCCs) [25], which align with human auditory characteristics, demonstrate strong noise resistance under noisy conditions. Hu et al. developed a hybrid feature extraction method combining MFCCs and GFCCs with wavelet decomposition, classified via a CNN, which enabled accurate recognition of helicopter audio signals [26]. However, this approach has limited sensitivity to low-frequency signals, and CNNs offer constrained temporal modeling capabilities [27,28,29,30]. Linear Prediction Cepstral Coefficients (LPCCs) emphasize the resonant properties of audio signals. Ding et al. conducted multi-dimensional feature extraction on audio data by combining MFCCs and LPCCs, then reduced and normalized the features with PCA and applied a Support Vector Machine (SVM) for fault classification, which successfully diagnosed CNC machine tool faults [31]. Despite its effectiveness in identifying certain mechanical faults, this method is somewhat limited under multi-fault scenarios and complex noise conditions, and the computational complexity of SVMs poses challenges for large-scale data processing [32,33].

To more comprehensively characterize the features of hydraulic plunger pump sound signals and fully leverage the advantages of various cepstral coefficients, this study employs a fusion of MFCCs, IMFCCs, GFCCs, and LPCCs to create a hybrid cepstral feature known as MIGLCCs. An MFCC primarily focuses on the spectral characteristics of sound signals with high resolution in the low-frequency range [34,35,36,37], an IMFCC emphasizes high-frequency details [38,39,40], a GFCC exhibits robust noise resistance [41,42,43,44], and an LPCC reflects the resonant peak characteristics of sound signals [45,46,47]. This fusion aims to enhance the representational capability of hydraulic plunger pump sound signals by complementing the strengths of each cepstral coefficient. To date, no one has applied a method combining this hybrid cepstral feature and a double layer long short-term memory (DLSTM) network to the diagnosis of hydraulic plunger pump sound signals. Therefore, this paper proposes an intelligent diagnosis method for hydraulic plunger pumps based on MIGLCC-DLSTM using sound signals. The primary contributions are as follows:

(1) A novel fused feature, an MIGLCC, based on four classical cepstral features (MFCC, IMFCC, GFCC, and LPCC), is proposed for the first time. An MIGLCC significantly enhances the representation of high- and low-frequency features while improving noise resistance and formant capture capability. By fully leveraging the complementary strengths of multiple cepstral features, it provides a comprehensive and precise feature description framework for the intelligent diagnosis of sound signals.

(2) Deep learning techniques are introduced through the design of a double layer long short-term memory (DLSTM) network, optimizing the classification model’s training process. Incorporating a Dropout layer optimization strategy effectively reduces overfitting risk and enhances model generalization. By integrating MIGLCC features with the DLSTM network, the MIGLCC-DLSTM intelligent diagnosis method achieves efficient modeling of time-series information in sound signals while maintaining low model complexity. This approach demonstrates exceptional diagnostic accuracy, making it particularly suitable for real-time fault diagnosis in industrial scenarios.

(3) The method consistently achieves high diagnostic accuracy in evaluating the operating states of hydraulic plunger pumps under various working conditions, underscoring its practicality in complex industrial environments. Additionally, validation using the open-source CWRU bearing dataset and actual steam turbine high-pressure servo motor state monitoring data confirms the outstanding generalization capability of the MIGLCC-DLSTM method, showcasing its broad application potential across diverse industries.

The remainder of this paper is organized as follows: Section 2 introduces the cepstral coefficient feature extraction methods and reviews the fundamental principles of LSTM networks. Section 3 describes the experiments on hydraulic plunger pumps and analyzes the effectiveness of the proposed method based on the experimental results. Section 4 presents extended application experiments of the method. Section 5 concludes with the final findings.

2. Introduction to Basic Knowledge

2.1. Extraction of Different Cepstral Features

2.1.1. MFCC

Mel Frequency Cepstral Coefficients (MFCCs) are cepstral parameters designed based on the auditory perception characteristics of the human ear, originally applied in speech recognition [35]. By simulating the ear’s sensitivity to different sound frequencies, MFCCs perform a nonlinear transformation of the spectrum, efficiently capturing the features of sound signals using a triangular filter bank. The relationship between Mel frequency and physical frequency is expressed in Equation (1):

$$f_{Mel} = 2595 \times \lg\left(1 + \frac{f}{700}\right) \tag{1}$$

where $f$ represents the physical frequency in Hz. The flowchart of the MFCC feature extraction process is depicted in Figure 1, and the extraction procedure is as follows.

The pre-emphasis enhances signal clarity and recognizability by balancing spectral energy, amplifying high-frequency components, and reducing spectral envelope slope.

The framing adapts to the non-stationary nature of the signal by dividing the long-duration signal into short-time windows to exploit its short-time stationary characteristics.

The windowing improves the stability and accuracy of the spectral analysis by applying window functions (such as Hamming or Hanning windows) to reduce spectral leakage.

The Fast Fourier Transform (FFT) is applied to each windowed frame to obtain the energy distribution over the spectrum, as shown in Equation (2):

$$X(k) = \sum_{n=0}^{N-1} x(n) e^{-\frac{j 2 \pi n k}{N}}, \quad 0 \leq k \leq N - 1 \tag{2}$$

where $N$ is the number of data points per frame.

The Mel Filter Bank is composed of multiple triangular filters, as depicted in Figure 2. Each triangular filter represents a frequency range on the Mel frequency axis, with the center frequencies equally spaced on the Mel scale. The frequency response of the triangular filter is given by Equation (3):

$$H_{m}(k) = \begin{cases} 0, & k < f(m-1) \\ \frac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \leq k \leq f(m) \\ 1, & k = f(m) \\ \frac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \leq f(m+1) \\ 0, & k > f(m+1) \end{cases} \tag{3}$$

where $m$ is the filter index, $f(m)$ is the center frequency of the $m$-th filter, and $k$ is the frequency bin index of the FFT spectrum.

The logarithmic energy output of the Mel filter bank is given by Equation (4):

$$s(m) = \ln\left(\sum_{k=0}^{N-1} |X(k)|^{2} H_{m}(k)\right), \quad 0 \leq m \leq M \tag{4}$$

where $M$ is the number of triangular filters.

The Discrete Cosine Transform (DCT) converts high-dimensional Mel filter bank energy data into low-dimensional MFCCs, effectively extracting key features of the signal while reducing data dimensionality and enhancing feature independence. The calculation is shown in Equation (5):

$$M_{FCC}(l) = \sum_{m=0}^{M-1} s(m) \cos\left[\frac{\pi l (m - 0.5)}{M}\right], \quad l = 1, 2, \dots, L \tag{5}$$

where $l$ represents the MFCC order, ranging from 1 to 13. In this study, $L$ is set to 12, consistent with the order of IMFCC, GFCC, and LPCC features [48].
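The extraction steps above can be sketched in dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the number of filters (24), the bin-mapping rule, and the small constant added before the logarithm are all hypothetical choices, and the DCT sum runs over m = 1..M so that the (m − 0.5) term of Equation (5) is centered on each filter.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """Equation (1): physical frequency in Hz to Mel frequency."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(f_mel):
    """Inverse of Equation (1)."""
    return 700.0 * (10.0 ** (f_mel / 2595.0) - 1.0)


def pre_emphasis(x, coeff=0.97):
    """y[n] = x[n] - coeff * x[n-1], which boosts high-frequency content."""
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def hamming(n_points):
    """Hamming window of length n_points."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def power_spectrum(frame):
    """|X(k)|^2 from Equation (2) for k = 0..N/2, using a naive DFT."""
    n_points = len(frame)
    spec = []
    for k in range(n_points // 2 + 1):
        x_k = sum(frame[n] * cmath.exp(-2j * math.pi * n * k / n_points)
                  for n in range(n_points))
        spec.append(abs(x_k) ** 2)
    return spec


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters (Equation (3)) with centers equally spaced in Mel."""
    mel_max = hz_to_mel(fs / 2.0)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bins = [int(n_fft * f / fs) for f in edges]  # assumed bin mapping
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, hi + 1):
            if k < mid:
                row[k] = (k - lo) / (mid - lo)
            elif k == mid:
                row[k] = 1.0
            else:
                row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, fs, n_filters=24, order=12):
    """MFCCs of one frame: window -> FFT -> Mel log energies -> DCT."""
    window = hamming(len(frame))
    spec = power_spectrum([x * w for x, w in zip(frame, window)])
    bank = mel_filterbank(n_filters, len(frame), fs)
    # Equation (4); 1e-12 (an assumption) avoids log(0) for empty bands.
    log_e = [math.log(sum(p * h for p, h in zip(spec, row)) + 1e-12)
             for row in bank]
    # Equation (5), with m running over 1..M (assumed index range).
    return [sum(log_e[m - 1] * math.cos(math.pi * l * (m - 0.5) / n_filters)
                for m in range(1, n_filters + 1))
            for l in range(1, order + 1)]


fs = 10_000  # sampling frequency used in the experiment (Hz)
signal = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(256)]
coeffs = mfcc(pre_emphasis(signal), fs)
print(len(coeffs))
```

The naive DFT keeps the sketch free of third-party packages; in practice an FFT routine would replace `power_spectrum` without changing the rest of the chain.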

2.1.2. IMFCC

The computation process of Inverse Mel Frequency Cepstral Coefficients (IMFCCs) is the inverse transformation of MFCCs [39]. Due to the dense distribution of the Mel filter bank in the low-frequency region and sparse distribution in the high-frequency region, the MFCC primarily focuses on the low-frequency spectrum of the signal, resulting in lower accuracy for mid-to-high-frequency spectrum calculations. In contrast, an IMFCC uses inverse Mel filters arranged oppositely, as depicted in Figure 3. An IMFCC effectively captures the spectral information of the mid-to-high-frequency parts of the signal, enhancing the resolution and representational capability of feature parameters in these frequency ranges. The relationship between inverse Mel frequency and physical frequency is expressed in Equation (6),

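$$f_{IMel} = 2146.1 - 1127 \times \ln\left(1 + \frac{4000 - f}{700}\right) \tag{6}$$

where $f$ represents the physical frequency in Hz and $\ln$ is the natural logarithm. With the natural logarithm, $f_{IMel} \approx 0$ at $f = 0$ Hz, and the factor 1127 matches the scale of Equation (1), since $1127 \ln(x) \approx 2595 \lg(x)$.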

The IMFCC feature extraction process is largely similar to that of MFCCs, as illustrated in the flowchart in Figure 1. The computation method for IMFCCs is given by Equation (7):

$$I_{MFCC}(l) = \sum_{m'=0}^{M'-1} s(m') \cos\left[\frac{\pi l (m' - 0.5)}{M'}\right], \quad l = 1, 2, \dots, L \tag{7}$$

where $l$ denotes the IMFCC order, consistent with the MFCC feature order, with $L$ set to 12; $m'$ represents the index of the inverse Mel filter; $M'$ is the number of inverse Mel filters; and $s(m')$ is the logarithmic energy value output by the filter bank.
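To make the "arranged oppositely" idea concrete, the sketch below mirrors a toy Mel filter bank along the frequency axis and evaluates the inverse Mel mapping. Two assumptions are made here and are not taken from the paper: the logarithm in Equation (6) is read as the natural logarithm (which is what maps 0 Hz to approximately 0 on the inverse Mel scale), and the inverse filter bank is built by flipping both the filter order and the bin order. The function names and the toy bank are hypothetical.

```python
import math


def hz_to_imel(f_hz):
    """Equation (6), reading the logarithm as the natural logarithm."""
    return 2146.1 - 1127.0 * math.log(1.0 + (4000.0 - f_hz) / 700.0)


def invert_filterbank(bank):
    """Mirror a Mel filter bank along the frequency axis.

    Assumed construction: H'_m(k) = H_{M+1-m}(K - k), where K is the last
    bin index, so the narrow filters end up at high frequencies.
    """
    return [list(reversed(row)) for row in reversed(bank)]


# Toy 3-filter bank over 7 bins: narrow at low frequency, wide at high.
mel_bank = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
]
imel_bank = invert_filterbank(mel_bank)
for row in imel_bank:
    print(row)
```

After the flip, the widest filter covers the lowest bins and the narrowest one sits near the top of the band, which is the property that gives IMFCCs their finer high-frequency resolution.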

2.1.3. GFCC

The extraction process for Gammatone Frequency Cepstral Coefficients (GFCCs) is similar to that of MFCCs, with the primary difference being that an MFCC uses a triangular filter bank based on the Mel scale, whereas a GFCC employs a Gammatone filter bank based on the equivalent rectangular bandwidth (ERB) scale [42], as shown in Figure 4. Compared to the Mel filter bank, the Gammatone filter bank, which is based on the characteristics of the cochlear basilar membrane, offers superior noise resistance and robustness. The impulse response is given by Equation (8):

$$g_{i}(t) = A t^{m'' - 1} e^{-2 \pi b_{i} t} \cos(2 \pi f_{i} t + \varphi_{i}), \quad t \geq 0, \; 1 \leq i \leq M'' \tag{8}$$

where $A$ is the filter gain, $f_{i}$ is the filter center frequency, $\varphi_{i}$ is the phase, $m''$ is the Gammatone filter index, $M''$ is the number of Gammatone filters, and $b_{i}$ is the Gammatone filter’s attenuation factor. The attenuation factor $b_{i}$ determines how quickly the impulse response of the filter decays and is related to the center frequency, as shown in Equations (9) and (10):

$$b_{i} = 1.019 \, \mathrm{ERB}(f_{i}) \tag{9}$$

$$\mathrm{ERB}(f_{i}) = 24.7 \times \left(\frac{4.37 \times f_{i}}{1000} + 1\right) \tag{10}$$

where $\mathrm{ERB}(f_{i})$ is the equivalent rectangular bandwidth at the center frequency $f_{i}$.
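As a quick numerical check of Equations (8) to (10), the following sketch samples one Gammatone impulse response. It uses the standard form in which the cosine argument is $2 \pi f_{i} t + \varphi_{i}$; the filter order of 4, the unit gain, the zero phase, and the 1 kHz center frequency are assumptions chosen for illustration only.

```python
import math


def erb(fc_hz):
    """Equation (10): equivalent rectangular bandwidth in Hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def gammatone_ir(t, fc_hz, order=4, gain=1.0, phase=0.0):
    """Equation (8) sampled at time t (seconds); order=4 is an assumption."""
    if t < 0.0:
        return 0.0
    b = 1.019 * erb(fc_hz)  # Equation (9)
    return (gain * t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
            * math.cos(2.0 * math.pi * fc_hz * t + phase))


fs = 10_000  # sampling frequency used in the experiment (Hz)
ir = [gammatone_ir(n / fs, 1000.0) for n in range(256)]
peak = max(range(len(ir)), key=lambda n: abs(ir[n]))
print(erb(1000.0), peak)
```

The response starts at zero, rises to a peak within a few milliseconds, and then decays at a rate set by $b_{i}$, so filters with higher center frequencies (larger ERB) have shorter impulse responses.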

The GFCC feature extraction flowchart is illustrated in Figure 1. The computation method for a GFCC is given by Equation (11):

$$G_{FCC}(l) = \sum_{m''=0}^{M''-1} g_{i}(m'') \cos\left(\frac{\pi l (2 m'' - 1)}{2 M''}\right), \quad l = 1, 2, \dots, L \tag{11}$$

where $l$ denotes the GFCC order, consistent with the orders of MFCC and IMFCC features, with $L$ set to 12; $M''$ represents the number of filters; and $g_{i}(m'')$ is the logarithmic energy value output by the filter bank.

2.1.4. LPCC

Linear Prediction Cepstral Coefficients (LPCCs) utilize linear prediction techniques to analyze and model the vocal tract response characteristics of sound signals [46]. This method effectively captures the resonant peaks and harmonic relationships within the signal, exhibiting high sensitivity to signal variations. LPCC feature parameters are generated by recursively calculating Linear Prediction Coefficients (LPCs) using an all-pole model, as described in Equation (12):

$$L_{PCC}(l) = \begin{cases} a_{1}, & l = 1 \\ a_{l} + \sum_{k=1}^{l-1} \frac{k}{l} c_{k} a_{l-k}, & 1 < l \leq p \\ \sum_{k=1}^{l-1} \frac{k}{l} c_{k} a_{l-k}, & l > p \end{cases} \tag{12}$$

where $a_{1}, a_{2}, \dots, a_{p}$ represents the $p$-order LPC feature vector (with $a_{j} = 0$ for $j > p$), $c_{k} = L_{PCC}(k)$ is the $k$-th LPCC already obtained by the recursion, and $l$ denotes the LPCC order, which is consistent with the orders of the previously mentioned MFCC, IMFCC, and GFCC features, resulting in 12-order LPCC feature parameters.
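The recursion in Equation (12) is short enough to write out directly. The sketch below is illustrative only: the function name is hypothetical, and it assumes the sign convention in which the all-pole model is $1 / (1 - \sum_{k} a_{k} z^{-k})$, with LPC coefficients beyond order $p$ treated as zero.

```python
def lpc_to_lpcc(a, n_ceps=12):
    """Equation (12): LPC coefficients a_1..a_p to LPCCs c_1..c_n."""
    p = len(a)
    c = []
    for l in range(1, n_ceps + 1):
        acc = a[l - 1] if l <= p else 0.0
        for k in range(1, l):
            if l - k <= p:  # a_j is taken as zero for j > p
                acc += (k / l) * c[k - 1] * a[l - k - 1]
        c.append(acc)
    return c


lpcc = lpc_to_lpcc([0.5], n_ceps=4)
print(lpcc)
```

For a single-pole model with $a_{1} = 0.5$, the cepstrum has the closed form $c_{n} = 0.5^{n} / n$, which gives a convenient sanity check on the recursion.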

2.1.5. Hybrid Cepstral Feature MIGLCC

Each type of cepstral coefficient offers distinct advantages: an MFCC effectively captures the key spectral features of a signal, with heightened sensitivity to low-frequency components; an IMFCC provides higher resolution for the mid-to-high-frequency ranges; a GFCC demonstrates robust performance in noisy environments; and an LPCC excels at capturing the resonant peaks and harmonic characteristics of a signal. Therefore, this study employs MFCCs, IMFCCs, GFCCs, and LPCCs as the foundational cepstral features. These features are fused through dimensional unification [48] and vector concatenation, followed by normalization, to create a hybrid cepstral feature termed MIGLCC (MFCC and IMFCC and GFCC and LPCC), as shown in Equation (13):

$$\mathrm{MIGLCC} = [(M_{1}, M_{2}, \dots, M_{12}), (I_{1}, I_{2}, \dots, I_{12}), (G_{1}, G_{2}, \dots, G_{12}), (L_{1}, L_{2}, \dots, L_{12})] \tag{13}$$

The MIGLCC feature extraction process is illustrated in Figure 1.
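The fusion in Equation (13) amounts to concatenating four 12-dimensional vectors per frame and then normalizing. The sketch below shows one way this could look; the paper does not specify the normalization scheme in this section, so the per-dimension min-max scaling, the function names, and the placeholder feature values are all assumptions.

```python
def fuse_miglcc(mfcc, imfcc, gfcc, lpcc):
    """Concatenate four 12-dimensional vectors into one 48-dimensional frame."""
    return list(mfcc) + list(imfcc) + list(gfcc) + list(lpcc)


def min_max_normalize(frames):
    """Scale every feature dimension to [0, 1] across frames (assumed scheme)."""
    n_dims = len(frames[0])
    lows = [min(f[d] for f in frames) for d in range(n_dims)]
    highs = [max(f[d] for f in frames) for d in range(n_dims)]
    return [[(f[d] - lows[d]) / (highs[d] - lows[d])
             if highs[d] > lows[d] else 0.0
             for d in range(n_dims)]
            for f in frames]


# Placeholder values standing in for the features of 7 frames.
frames = [fuse_miglcc([t + i for i in range(12)],
                      [2 * t - i for i in range(12)],
                      [0.1 * t * i for i in range(12)],
                      [(-1) ** i * t for i in range(12)])
          for t in range(7)]
features = min_max_normalize(frames)
print(len(features), len(features[0]))
```

Normalizing after concatenation puts the four feature families on a common scale, so that no single family dominates the network input because of its numeric range.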

2.2. LSTM Network

Long short-term memory (LSTM) networks, introduced by Hochreiter and Schmidhuber and subsequently improved by Graves [49], are a variant of recurrent neural networks (RNNs) that address issues such as memory limitations, vanishing gradients, and exploding gradients encountered in traditional RNNs when processing long sequences [50]. By incorporating gating units, LSTM networks are adept at retaining information over extended time intervals, making them highly suitable for handling long-span time series data. The LSTM network design includes three independent gating units, which enhance its ability to learn from long sequence data. The LSTM network architecture, as shown in Figure 5, comprises four components: the forget gate, input gate, memory cell, and output gate. Here, $x_{t}$ represents the input value at the current time step, $h_{t}$ represents the hidden state at the current time step, and $c_{t}$ represents the cell state information at the current time step [51].

The forget gate outputs a value between 0 and 1 based on the current input $x_{t}$ and the hidden state $h_{t-1}$ from the previous time step, indicating the extent to which information from the memory cell should be retained. A forget gate output of 0 means complete forgetting of the previous information, while an output of 1 means complete retention. The specific calculation is given by Equation (14) [52]:

$$f_{t} = \sigma(w_{f} [h_{t-1}, x_{t}] + b_{f}) \tag{14}$$

where $f_{t}$ is the forget gate output value, $\sigma$ is the Sigmoid function, $w_{f}$ is the forget gate weight matrix, and $b_{f}$ is the forget gate bias term.

The input gate filters and weights the input data to determine which information should be stored in the memory cell. The calculation is given by Equation (15):

$$i_{t} = \sigma(w_{i} [h_{t-1}, x_{t}] + b_{i}) \tag{15}$$

where $i_{t}$ is the input gate output value, $w_{i}$ is the input gate weight matrix, and $b_{i}$ is the input gate bias term.

The memory cell, as the core component of the network, stores and transmits information, enhancing the network’s ability to capture long-term dependencies within the sequence. The state information from the previous time step, $c_{t-1}$, is multiplied by the forget gate output $f_{t}$ to retain important information from the past. Simultaneously, the new candidate state information $\tilde{c}_{t}$ at the current time step is multiplied by the input gate output $i_{t}$ to generate new memory information for the current time step. The two terms are summed to form the updated state information $c_{t}$. The calculations are given by Equations (16) and (17) [53]:

$$\tilde{c}_{t} = \tanh(w_{c} [h_{t-1}, x_{t}] + b_{c}) \tag{16}$$

$$c_{t} = f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t} \tag{17}$$

where $w_{c}$ is the weight matrix, $b_{c}$ is the bias term, $\tanh$ is the hyperbolic tangent activation function, and $\odot$ denotes element-wise multiplication.

The output gate uses the Sigmoid function to calculate the output value $o_{t}$. The cell state $c_{t}$ is passed through the $\tanh$ function and multiplied element-wise by $o_{t}$, which yields the final output value $h_{t}$. The specific calculations are given by Equations (18) and (19):

$$o_{t} = \sigma(w_{o} [h_{t-1}, x_{t}] + b_{o}) \tag{18}$$

$$h_{t} = o_{t} \odot \tanh(c_{t}) \tag{19}$$

where $o_{t}$ is the output gate output value, $w_{o}$ is the output gate weight matrix, and $b_{o}$ is the output gate bias term.
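Equations (14) to (19) together define one LSTM time step, which the following dependency-free sketch writes out explicitly. The function names, the parameter dictionary layout, and the all-zero weights used in the example are hypothetical; the zero weights are chosen only because they make the result easy to verify by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, b, v):
    """Return w @ v + b for a list-of-rows matrix w."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equations (14)-(19)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(params["w_f"], params["b_f"], z)]
    i_t = [sigmoid(v) for v in affine(params["w_i"], params["b_i"], z)]
    c_hat = [math.tanh(v) for v in affine(params["w_c"], params["b_c"], z)]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = [sigmoid(v) for v in affine(params["w_o"], params["b_o"], z)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


hidden, inputs = 2, 3
zeros_w = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zeros_w for name in ("w_f", "w_i", "w_c", "w_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([1.0, -2.0, 0.5], [0.0, 0.0], [1.0, -1.0], params)
print(h_t, c_t)
```

With all weights and biases at zero, every gate outputs 0.5 and the candidate state is 0, so the cell state is simply halved and the hidden state is 0.5 times the hyperbolic tangent of the new cell state.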

3. Hydraulic Plunger Pump Simulation Experiment

This paper presents an intelligent diagnosis method for hydraulic plunger pumps based on sound signals. The hybrid cepstral feature MIGLCC extracted from the pump’s sound signals is combined with the DLSTM network to construct the MIGLCC-DLSTM diagnostic model. The DLSTM network structure is illustrated in Figure 6, and the process flow of the method is depicted in Figure 7. The specific implementation steps are as follows:

Step 1: Collect monitoring signals from the hydraulic plunger pump in various states and save the collected data into the computer.

Step 2: Divide the collected sound data into training and testing sets, and label the sound data corresponding to different states of the hydraulic plunger pump.

Step 3: Pre-process the training and testing sets by applying pre-emphasis (with a coefficient of 0.97) and framing with a Hamming window.

Step 4: Extract different cepstral features—MFCC, IMFCC, GFCC, and LPCC—where l represents the order of each cepstral feature, which was set to 12 in this study.

Step 5: Fuse the features through vector concatenation, normalize them, and generate the hybrid cepstral feature MIGLCC (MFCC and IMFCC and GFCC and LPCC).

Step 6: Establish the DLSTM network model, initializing the DLSTM network weight parameters using a Gaussian distribution.

Step 7: Define the model learning rate, number of iterations, and batch size. Input the feature data into the designed DLSTM network model.

Step 8: Train the MIGLCC-DLSTM diagnostic model using the hybrid cepstral feature MIGLCC data extracted from the training set.

Step 9: Test and evaluate the trained model using the MIGLCC feature data extracted from the testing set.

Step 10: Calculate evaluation metrics such as the accuracy of the model on the testing set and comprehensively assess the model’s performance.

3.1. Construction of Experimental Setup and Signal Acquisition

The experiment was conducted on a hydraulic plunger pump fault simulation test bench, as shown in Figure 8, with the system’s operating principles illustrated in Figure 9. A domestically produced AWA5661 precision pulse sound level meter was chosen to capture acoustic signals. This device, equipped with a condenser microphone with a sensitivity of 40 mV/Pa and a frequency response range of 10~16,000 Hz, is well suited for capturing the acoustic characteristics of the hydraulic plunger pump in operation. The sound level meter is designed to convert sound pressure levels into AC voltage signals, which are then input into a computer via a data acquisition card for accurate monitoring and analysis.

During the experiment, the sound signal collection process may be influenced by background noise from sources such as dynamic and transmission noise. This is especially true when the sensor is positioned farther from the sound source, where equipment friction noise may be masked by reverberation, resulting in decreased signal stability. To address this, the experiment employs a near-field measurement approach, positioning the sensor within 0.5 m of the plunger pump and approximately 0.75 m above the ground to capture the primary acoustic signals of the hydraulic plunger pump’s operation. Based on these measurements, the point with the highest sound pressure level around the plunger pump was selected as the main measurement location, where the sound level meter was suspended. A windscreen was also installed to effectively reduce ambient wind noise interference, ensuring a high signal-to-noise ratio in the collected acoustic data.

In addition to using a sound level meter to capture acoustic signals during the hydraulic plunger pump’s operation, pressure and acceleration sensors were also employed to monitor the pressure at the pump’s outlet and the vibrations of the pump casing. These signals were synchronously collected and recorded on a computer, though this study will not delve into these aspects in detail. Once collected via a data acquisition card, the sensor signals were transferred to the computer for monitoring, recording, and further processing. The primary components used in the hydraulic plunger pump fault simulation test, along with their specific models and parameters, are detailed in Table 1.

To ensure the simulated faults accurately reflect actual fault conditions encountered in hydraulic plunger pumps, common faults were replicated through controlled fault injection, as outlined in Table 2. This simulation included typical fault states, such as swash plate wear, slipper wear, and loose slipper. During this process, operators introduced faults by physically intervening with key components of the hydraulic plunger pump under preset fault conditions, recording the real-time state of the faulted components. Photographs of the corresponding faulted components are shown in Figure 10.

During the experiments, the system pressure was set to 5 MPa, the sampling frequency was set to 10 kHz, and each sampling duration was 10 s. The collected data for normal and three fault conditions were stored in the computer for subsequent validation of the intelligent fault diagnosis algorithm.

3.2. Experimental Data Partitioning

To validate the effectiveness of the proposed intelligent diagnosis method, MIGLCC-DLSTM, the collected sound data from the hydraulic plunger pump experiment were analyzed. The time–domain waveforms and power spectra of the sound signals in different states are depicted in Figure 11. Although there are observable differences in the fluctuations and power spectrum distributions of sound signals across various states, these differences are not readily distinguishable merely by observing the time–domain waveforms and power spectra, making it challenging to accurately identify different fault types of the hydraulic plunger pump.

To accurately analyze the local features and dynamic changes within the signals while enhancing the efficiency of the analysis process and the model’s applicability, it is necessary to appropriately segment the collected sound data. When dividing continuous time-series data into individual segments, each segment should fully cover one rotational period of the hydraulic plunger pump. Hence, each data segment should contain at least $N$ sampling points, where $N = 60 f_{s} / n_{p}$, with $f_{s}$ being the sampling frequency in Hz and $n_{p}$ being the pump speed in r/min. In this experiment, 1024 sampling points constituted one data segment. Each 10 s recording sampled at 10 kHz contains 100,000 points, which yields 97 complete data segments for each of the normal and three fault states, totaling 388 data segments across the four states. The dataset was divided into training and testing sets based on different partitioning ratios to comprehensively evaluate the model’s performance. The specific partitioning method is shown in Table 3.
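The segment arithmetic can be checked with a few lines of Python. The sampling frequency, recording duration, and segment length come from the text; the pump speed of 1500 r/min is a hypothetical value used only to illustrate the minimum-length rule, since the actual speed is not stated in this section, and the function names are likewise illustrative.

```python
def min_points_per_revolution(fs_hz, speed_rpm):
    """N = 60 * f_s / n_p: samples recorded during one shaft revolution."""
    return 60.0 * fs_hz / speed_rpm


def split_segments(samples, seg_len=1024):
    """Cut a recording into non-overlapping segments, dropping the remainder."""
    n_seg = len(samples) // seg_len
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_seg)]


fs = 10_000              # sampling frequency from the experiment (Hz)
duration_s = 10          # length of each recording (s)
speed_rpm = 1500.0       # hypothetical pump speed, not stated in this section
recording = [0.0] * (fs * duration_s)   # placeholder for one recorded signal

segments = split_segments(recording)
print(min_points_per_revolution(fs, speed_rpm), len(segments), 4 * len(segments))
```

Under the assumed speed, one revolution spans 400 samples, so a 1024-point segment covers more than two full revolutions, and the four states together give the 388 segments reported above.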

3.3. MIGLCC Feature Extraction

In this experiment, the frame length was set to 256 sampling points (25.6 ms, typically 10–30 ms) [54], with an overlap region of 50% (12.8 ms, typically around 10 ms) [54] and a pre-emphasis coefficient of 0.97, and the Hamming window was applied. During the extraction of individual cepstral coefficient features, each data segment was divided into 7 frames, and the order L for different cepstral coefficients was set to 12, meaning that each frame’s feature vector had 12 dimensions. Consequently, a set of 7 feature samples, each with 12 dimensions, was extracted from each data segment. The different cepstral coefficient features under normal and three fault conditions are shown in Figure 12.
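The frame count and durations quoted above follow directly from the frame length, the overlap, and the sampling frequency. The short sketch below reproduces them; the function name and the placeholder segment are hypothetical.

```python
def frame_segment(segment, frame_len=256, overlap=0.5):
    """Split one data segment into overlapping frames."""
    hop = int(frame_len * (1.0 - overlap))
    n_frames = (len(segment) - frame_len) // hop + 1
    return [segment[i * hop:i * hop + frame_len] for i in range(n_frames)]


fs = 10_000                          # sampling frequency (Hz)
segment = list(range(1024))          # placeholder for one 1024-point segment
frames = frame_segment(segment)
frame_ms = 1000.0 * 256 / fs         # frame duration in milliseconds
hop_ms = 1000.0 * 128 / fs           # frame shift in milliseconds
print(len(frames), frame_ms, hop_ms)
```

With a 128-point shift, the last frame ends exactly at sample 1023, so the seven frames cover the whole segment without padding.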

As observed in Figure 12, for data of the same state type of the hydraulic plunger pump, the four different single cepstral coefficient features reflect various cepstral coefficient changes under the same frame and the same feature dimension. This indicates that each cepstral coefficient feature captures different aspects of the same sound signal of the hydraulic plunger pump, highlighting the diversity and complementarity of these cepstral coefficient features. This indirectly suggests the potential and necessity of fusing these features to enhance the performance of hydraulic plunger pump sound signal analysis. Moreover, although subtle variations in the fluctuations of identical individual cepstral coefficients were observed across the data from the four distinct operational states of the hydraulic plunger pump—some of which may be attributed to noise or temporal instability—these features still exhibited a notable sensitivity to changes in the sound signals. Therefore, to enhance the stability and robustness of the analysis results, it was necessary to fuse the individual cepstral coefficient features for comprehensive analysis.

The hybrid cepstral feature MIGLCC used in this experiment comprised MFCC, IMFCC, GFCC, and LPCC, unified in dimension [48] and concatenated into vectors, followed by normalization. Each frame’s feature vector dimension was thus 48, resulting in a set of 7 feature samples, each with 48 dimensions, extracted from each data segment.
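The fusion step can be pictured as a frame-wise concatenation followed by normalization. The sketch below is illustrative only: the text does not state which normalization was used, so per-dimension min-max scaling is an assumption, the input matrices are placeholder numbers rather than real cepstral coefficients, and all function names are hypothetical.

```python
def concatenate_features(mfcc, imfcc, gfcc, lpcc):
    """Join four (frames x 12) matrices frame by frame into (frames x 48)."""
    return [m + i + g + l for m, i, g, l in zip(mfcc, imfcc, gfcc, lpcc)]


def min_max_normalize(frames):
    """Scale every feature dimension to [0, 1] across frames (assumed scheme)."""
    n_dims = len(frames[0])
    lows = [min(f[d] for f in frames) for d in range(n_dims)]
    highs = [max(f[d] for f in frames) for d in range(n_dims)]
    return [[(f[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
             for d in range(n_dims)]
            for f in frames]


def placeholder(offset, n_frames=7, order=12):
    """Stand-in for a real cepstral extractor; returns n_frames x order values."""
    return [[offset + 0.1 * frame + k for k in range(order)]
            for frame in range(n_frames)]


miglcc = min_max_normalize(concatenate_features(
    placeholder(0.0), placeholder(1.0), placeholder(2.0), placeholder(3.0)))

print(len(miglcc), len(miglcc[0]))
```

The resulting 7 × 48 matrix has the shape of one input sequence for the network: seven time steps, each with a 48-dimensional feature vector.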

3.4. Network Parameter Configuration

In this experiment, we utilized the deep learning framework based on PyTorch (version 1.12.0+cu113) with Python 3.9 as the programming language. The runtime environment was configured using Anaconda, and PyCharm was employed as the integrated development environment. The operating system used was Windows 11, with 16.0 GB of memory, a 12th Gen Intel(R) Core(TM) i5-12500H CPU, and an NVIDIA GeForce RTX 3050 Laptop GPU. The experiment implemented a DLSTM network (subsequent sections include a sensitivity analysis of the parameters, where the impact of different network layers on model performance was examined, ultimately confirming the use of a two-layer LSTM network) [22]. In addition to the number of network layers, other critical parameters included the learning rate, the number of hidden units, dropout rate, batch size, and number of epochs. The specific configurations are detailed in Table 4.

3.5. Analysis of Experimental Results

3.5.1. Performance Comparison of Diagnostic Models with Different Data Partition Ratios

To compare the three different partitioning methods of the aforementioned dataset, the proposed MIGLCC-DLSTM method was employed for diagnosis. During the experiment, the input to the DLSTM network consisted of the extracted MIGLCC feature data from the hydraulic plunger pump sound signals. The dimensions of the input and output data for each layer of the network are detailed in Table 5. To ensure the accuracy of the test results, each test was repeated 10 times under the same parameter conditions. The average overall accuracy of the 10 tests, along with the precision, recall, and F1 score under the Macro criterion, were used as evaluation metrics. The results are presented in Table 6.
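For reference, the evaluation metrics can be computed from a confusion matrix as sketched below. This is a minimal illustration, assuming that the Macro F1 score is the unweighted mean of the per-class F1 scores (one common convention); the confusion matrix is invented for demonstration and does not reproduce any result from the experiments.

```python
def macro_metrics(confusion):
    """Return (accuracy, macro precision, macro recall, macro F1).

    confusion[i][j] is the number of samples of true class i predicted as j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total

    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = true_positive / predicted_k if predicted_k else 0.0
        r = true_positive / actual_k if actual_k else 0.0
        precisions.append(p)
        recalls.append(r)
        f1_scores.append(2.0 * p * r / (p + r) if (p + r) else 0.0)

    return (accuracy,
            sum(precisions) / n,
            sum(recalls) / n,
            sum(f1_scores) / n)


# Hypothetical 4-class confusion matrix, for illustration only.
example = [
    [10, 0, 0, 0],
    [0, 9, 1, 0],
    [0, 0, 10, 0],
    [0, 1, 0, 9],
]
accuracy, precision, recall, f1 = macro_metrics(example)
print(accuracy, precision, recall, f1)
```

Averaging these four values over the 10 repeated runs gives the kind of summary reported in the tables.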

As observed, when the proportion of the training set in the dataset decreased from 70% to 20%, the overall accuracy of the MIGLCC-DLSTM model in diagnosing the normal state and the three fault states of the sound signal decreased from 99.41% to 98.88%, and the F1 score dropped from 0.9940 to 0.9887. This indicates that the MIGLCC-DLSTM model’s ability to capture overall data characteristics diminishes when the training data are reduced. To enhance the model’s capacity to capture long-term dependencies within the data and improve its overall performance and generalization ability, subsequent experiments were conducted using Dataset 1 (with a 7:3 split ratio). Nevertheless, despite the reduction in training data, the model’s overall accuracy and F1 score remained at a high level, demonstrating strong generalization capabilities, which is of significant importance for practical industrial applications.

3.5.2. Performance Comparison of Diagnostic Models with Different Cepstral Features

To validate the diagnostic superiority of the proposed hybrid cepstral feature MIGLCC within the DLSTM network, we selected individual cepstral features MFCC, IMFCC, GFCC, and LPCC, along with dual-feature combinations MICC (MFCC and IMFCC), MGCC (MFCC and GFCC), MLCC (MFCC and LPCC), IGCC (IMFCC and GFCC), ILCC (IMFCC and LPCC), and GLCC (GFCC and LPCC), as well as triple-feature combinations MIGCC (MFCC and IMFCC and GFCC), MILCC (MFCC and IMFCC and LPCC), MGLCC (MFCC and GFCC and LPCC), and IGLCC (IMFCC and GFCC and LPCC) for the comparative analysis. Each of these features was extracted from Dataset 1 and input into the DLSTM network for diagnosis. Evaluation was based on the average overall accuracy, precision, recall, and F1 score under the Macro criteria across 10 trials, with the experimental results presented in Table 7.

The results indicate that the proposed hybrid cepstral feature MIGLCC achieved an overall accuracy of 99.41%, a precision of 99.39%, a recall of 99.43%, and an F1 score of 0.9940. This demonstrates that the MIGLCC feature effectively combines the advantages of MFCC, which provides good resolution for the low-frequency parts of sound data; IMFCC, which offers higher resolution in the mid-to-high-frequency range; GFCC, which exhibits robust noise interference resistance; and LPCC, which reflects the resonant peak characteristics of the signal. Because these features complement each other’s strengths, MIGLCC provides a more comprehensive description of the hydraulic plunger pump sound data. Compared to the average diagnostic results of a single cepstral feature, the overall accuracy improved by 10.09%, precision by 10.65%, recall by 9.72%, and F1 score by 0.1120, indicating that a single feature cannot fully capture the complexity and diversity of hydraulic plunger pump sound data. Compared to the dual-feature and triple-feature fused cepstral features, the hybrid cepstral feature MIGLCC demonstrated superior performance in overall accuracy, precision, recall, and F1 score. This indicates that MIGLCC possesses richer information representation capabilities, capturing critical information in the hydraulic plunger pump sound data more comprehensively and accurately. This enhanced capability aids in distinguishing between different state categories, providing stronger robustness when dealing with hydraulic plunger pump sound data under various conditions.

The average classification accuracy of the MIGLCC-DLSTM model across multiple experiments is illustrated in Figure 13. For the four distinct operating conditions of the hydraulic plunger pump, the model demonstrates remarkable consistency, with a minimum recognition accuracy of 98.30% and a maximum of 100%. Over repeated experiments, the overall average accuracy reaches 99.41%, with a standard deviation of 0.664. These results highlight the MIGLCC-DLSTM method’s minimal susceptibility to random factors across trials, showcasing exceptional stability and robustness, thereby establishing a solid foundation for reliable application in complex industrial environments.

To further investigate the optimal intelligent diagnosis performance of the hybrid cepstral feature MIGLCC in the DLSTM network, confusion matrices for single trials under MFCC, IMFCC, MICC, MIGCC, MILCC, and MIGLCC features were presented, along with a detailed analysis of the DLSTM network performance with different feature inputs from the hydraulic plunger pump sound data. The confusion matrices are shown in Figure 14.

In these matrices, 0 represents normal, 1 represents swash plate wear, 2 represents slipper wear, and 3 represents loose slipper. Figure 14a,b illustrate the diagnostic results in the DLSTM network using single cepstral features MFCC and IMFCC, respectively. Both features achieve a 100% recall for the normal and slipper wear conditions, indicating their strong discriminative capability for these states in the acoustic data of the hydraulic plunger pump.

Figure 14c shows the confusion matrix for the diagnostic results of the dual-feature fused cepstral feature MICC in the DLSTM network. Compared with the single cepstral features MFCC and IMFCC, MICC exhibits a slight reduction in recognition capability for swash plate wear but improves the recall for loose slipper from 70.59% and 61.76%, respectively, to 79.41%, highlighting the enhanced recognition rate for loose slipper achieved through feature fusion.

Figure 14d,e display the confusion matrices for the diagnostic results of the triple-feature fused cepstral features MIGCC and MILCC in the DLSTM network. Compared to the single cepstral feature MFCC, these fused features demonstrate a marked improvement in identifying loose slipper, with recall rising from 70.59% to 91.18% and 82.35%, respectively. Compared to the dual-feature fused cepstral feature MICC, although there is a slight reduction in recognition accuracy for the normal condition, MIGCC and MILCC both achieve a 95.83% recall for swash plate wear and recalls of 91.18% and 82.35%, respectively, for loose slipper, all higher than those of MICC.

Figure 14f depicts the confusion matrix for the diagnosis results using the proposed four-feature fused cepstral feature MIGLCC in the DLSTM network. The recall for normal state, swash plate wear, and slipper wear reached 100%, and the recall for loose slipper is 97.06%. This demonstrates that MIGLCC features can effectively distinguish the sound data of the hydraulic plunger pump under different state types. Compared to single cepstral features and the dual-feature and triple-feature fusion methods, the MIGLCC exhibits clear advantages. This further illustrates that MIGLCC features fully integrate the strengths of various cepstral coefficients, showcasing the model’s robustness and generalization capability under multi-type fault experimental conditions.
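The loose slipper recall values quoted above for Figure 14 can be cross-checked with simple arithmetic. All of them are consistent with a test subset of 34 loose slipper samples. That sample count and the per-feature counts of correct predictions below are inferences from the percentages, not figures stated in the text, so the sketch should be read as a plausibility check only.

```python
def recall_percent(correct, total):
    """Recall for one class, as a percentage rounded to two decimals."""
    return round(100.0 * correct / total, 2)


LOOSE_SLIPPER_TEST_SAMPLES = 34  # assumption inferred from the percentages

# Recall for the loose slipper class as quoted in the text (percent).
reported = {"MFCC": 70.59, "IMFCC": 61.76, "MICC": 79.41,
            "MILCC": 82.35, "MIGCC": 91.18, "MIGLCC": 97.06}

# Correct predictions that would reproduce those values (inferred).
inferred_correct = {"MFCC": 24, "IMFCC": 21, "MICC": 27,
                    "MILCC": 28, "MIGCC": 31, "MIGLCC": 33}

for name, correct in inferred_correct.items():
    value = recall_percent(correct, LOOSE_SLIPPER_TEST_SAMPLES)
    print(name, correct, value, reported[name])
```

Read this way, moving from MFCC alone to the four-feature MIGLCC corresponds to reducing the missed loose slipper samples in that single trial from 10 to 1.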

3.5.3. Performance Comparison of Different Diagnostic Methods

To further validate the superiority of the proposed intelligent diagnosis method MIGLCC-DLSTM, it was compared with popular current diagnosis methods, including SVM, 1D-CNN, and RNN. The experiment used the four-feature fused MIGLCC data from the sound signals of the hydraulic plunger pump as input, and the average overall accuracy of 10 trials as the evaluation metric. The results are presented in Figure 15. The parameter settings for each method are as follows:

(1): SVM: The penalty factor C was set to 0.1, the kernel function type was the radial basis function, and the width parameter σ was 12. The principle of SVM is illustrated in Figure 16a.
(2): 1D-CNN: The network included an input layer, three convolutional layers (with 64, 128, and 256 filters; kernel size of three; and stride of one), three max-pooling layers (pooling window size of two), two fully connected layers, and an output layer. During training, the ReLU activation function was used, with an Adam optimizer, a learning rate of 0.001, a dropout rate of 0.5, a batch size of 16, and 20 epochs. The network structure is shown in Figure 16b.
(3): RNN: The network comprises an input layer, hidden layers, and an output layer. During training, the Adam optimizer was used, with 32 hidden units, a learning rate of 0.001, a dropout rate of 0.2, a batch size of 16, and 20 epochs. The network structure is depicted in Figure 16c.
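To make the SVM setting concrete, the sketch below evaluates a radial basis function kernel with width σ = 12. It assumes the common parameterization K(x, y) = exp(−‖x − y‖² / (2σ²)); the text does not state which form was used. Libraries such as scikit-learn express the same kernel through gamma, where gamma = 1/(2σ²). The function name and sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=12.0):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


SIGMA = 12.0
GAMMA = 1.0 / (2.0 * SIGMA ** 2)   # equivalent gamma, here 1/288

same = rbf_kernel([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
apart = rbf_kernel([0.0], [12.0])   # points exactly one sigma apart

print(GAMMA, same, apart)
```

Two identical feature vectors give a kernel value of 1, and the value decays to exp(−0.5) when the vectors are one σ apart, so a larger σ makes the decision boundary smoother.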

The SVM, a conventional machine learning method, reached an overall diagnostic accuracy of 92.31%. The 1D-CNN raised the accuracy to 95.56%, with 913,668 parameters and a runtime of 47.18 s. The traditional RNN achieved an overall diagnostic accuracy of 95.04%, with 23,300 parameters and a runtime of 25.66 s. In contrast, the MIGLCC-DLSTM method proposed in this paper achieved a diagnostic accuracy of 99.41% with 223,748 parameters and a runtime of 22.53 s, the shortest of the three networks. Although the SVM and RNN methods feature lower model complexity, their accuracy and ability to capture complex patterns in the fault diagnosis of hydraulic plunger pump sound signals lag behind the proposed method. While the 1D-CNN improved on the accuracy of the SVM, its parameter count, roughly four times that of the DLSTM, substantially increased computational complexity and runtime. In comparison, the MIGLCC-DLSTM intelligent diagnostic method more effectively captures key information from the data, improving diagnostic accuracy by 3.85 to 7.10 percentage points over the compared methods while requiring about a quarter of the parameters of the 1D-CNN and the shortest runtime of the three networks. This provides a more feasible solution for diagnosing hydraulic plunger pump sound signals, validating the method’s superior performance.
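The reported DLSTM parameter count can be reproduced by counting weights directly. The sketch below assumes the PyTorch `nn.LSTM` layout (four gates, each with input weights, recurrent weights, and two bias vectors), 128 hidden units per layer, and a single fully connected layer mapping the last hidden state to the four classes. The hidden size and the classifier head are assumptions, since Table 4 is not reproduced here, and the function names are hypothetical.

```python
def lstm_layer_params(input_size, hidden_size):
    """Parameters of one LSTM layer in the PyTorch nn.LSTM layout.

    Four gates, each with an input weight matrix, a recurrent weight
    matrix, and two bias vectors (bias_ih and bias_hh).
    """
    return 4 * (hidden_size * input_size
                + hidden_size * hidden_size
                + 2 * hidden_size)


def stacked_lstm_classifier_params(input_size, hidden_size, num_layers, num_classes):
    """Stacked LSTM followed by one fully connected output layer."""
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        total += lstm_layer_params(layer_input, hidden_size)
        layer_input = hidden_size          # deeper layers read the hidden state
    total += hidden_size * num_classes + num_classes   # weights + biases of FC
    return total


# 48-dimensional MIGLCC input, assumed 128 hidden units, 2 layers, 4 classes.
dlstm_params = stacked_lstm_classifier_params(48, 128, 2, 4)
print(dlstm_params)   # 223748
```

Under these assumptions the count equals the 223,748 parameters reported for MIGLCC-DLSTM. Calling the same function with other values of `num_layers` shows how quickly the model grows with depth, which is the trade-off examined in the parameter sensitivity analysis.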

3.5.4. Performance Analysis of the Diagnostic Model Under Multiple Operating Conditions

To further validate the applicability of the proposed MIGLCC-DLSTM intelligent diagnostic method under various operating conditions of the hydraulic plunger pump, different working conditions were simulated by altering the pressure of the test pump. Pressure is one of the most critical parameters in hydraulic systems, effectively reflecting the system’s state under varying loads. Sound data were collected under pressures of 2 MPa, 8 MPa, 10 MPa, and 15 MPa for the hydraulic plunger pump in its normal state as well as under conditions of swash plate wear, slipper wear, and loose slipper. All other experimental conditions remained unchanged to ensure the model’s capability of accurately diagnosing faults across a wide range of load conditions. The sound data were processed using the proposed MIGLCC feature extraction method and then input into the DLSTM network for diagnosis. The model’s performance was evaluated using the average overall accuracy, precision, recall, and F1 score under the Macro criterion across 10 trials. Table 8 presents the test results of the MIGLCC-DLSTM intelligent diagnostic model under different operating conditions.

The results show that under the operating conditions of 2 MPa, 8 MPa, 10 MPa, and 15 MPa, the overall diagnostic accuracy of the MIGLCC-DLSTM model reached 98.89%, 99.40%, 98.63%, and 98.97%, respectively, with corresponding F1 scores of 0.9885, 0.9939, 0.9841, and 0.9895. These findings demonstrate that the MIGLCC-DLSTM model maintains excellent diagnostic performance across different operating conditions of the hydraulic plunger pump. This highlights not only the model’s strong fault recognition capability under single operating conditions but also its ability to sustain high diagnostic accuracy and generalization across varying conditions. Future research could extend to more practical scenarios, exploring the model’s diagnostic performance under varying environmental temperatures, rotational speeds, and other factors to further enhance its robustness and practical application value.

3.6. Parameter Sensitivity

Compared to single-layer neural networks, multi-layer neural networks exhibit enhanced capabilities in feature extraction, particularly in handling complex signals. By increasing the number of layers in an LSTM network, the representational capacity of the network can be improved, enabling it to effectively learn abstract features from high-dimensional time-series data and thereby enhance the model’s recognition accuracy. However, as the number of layers increases, the complexity of the model also rises, leading to longer training times and a higher risk of overfitting. Therefore, this study evaluated the performance of LSTM networks with different numbers of layers using the extracted MIGLCC feature data. The experimental results are shown in Figure 17.

It can be observed that in diagnosing the sound data of the hydraulic plunger pump, the model achieved an overall diagnostic accuracy of 99.41% with a training time of 22.53 s when the LSTM network had two layers. This configuration demonstrated a significant advantage in overall accuracy compared to network structures with one or four layers. When compared to a three-layer network structure, the two-layer network achieved the same overall diagnostic accuracy but required less training time, indicating higher efficiency. Considering a balance between overall diagnostic accuracy and training time, this study ultimately employed a two-layer LSTM network structure for the intelligent diagnosis task to achieve optimal model performance.

3.7. Visualization of Feature Representations

Through repeated experimental validation, the proposed intelligent diagnosis method, MIGLCC-DLSTM, demonstrated outstanding recognition capabilities across four different state types of hydraulic plunger pump sound signals. To intuitively illustrate the feature learning process of the MIGLCC-DLSTM diagnostic model, we employed t-distributed Stochastic Neighbor Embedding (t-SNE) to analyze the hydraulic plunger pump sound data. The feature clustering results are presented in Figure 18. Here, 0 represents normal, 1 represents swash plate wear, 2 represents slipper wear, and 3 represents loose slipper, while component1 and component2 represent the two dimensions after t-SNE visualization and dimensionality reduction. As shown in Figure 18a, the original sound data appear as scattered, disorganized points. After extracting MIGLCC features, the data for different state types start to cluster, although not very tightly. The normal state data features are somewhat separated from the other three state categories, while the features for swash plate wear, slipper wear, and loose slipper are closer together, with some overlap between slipper wear and loose slipper features. Following processing through the LSTM1 and LSTM2 layers of the DLSTM network, data of the same state type become more densely clustered, and the distances between clusters of different state types increase further. Finally, after the data passes through the FC layer of the DLSTM network, four distinct clusters can be observed, significantly enhancing the classification effect. The results indicate that the intelligent diagnosis method, MIGLCC-DLSTM, possesses robust recognition and classification capabilities for hydraulic plunger pump sound signals.

4. Extended Application of the MIGLCC-DLSTM Method

To further validate the feasibility and versatility of the proposed MIGLCC-DLSTM intelligent diagnosis method, we applied it to different signal categories and diagnostic targets for algorithm extension and verification.

4.1. CWRU Bearing Dataset

In this experiment, we selected rolling bearing test data provided by the Case Western Reserve University Bearing Data Center for algorithm validation [55]. The test bench is shown in Figure 19. The experiment utilized SKF bearings on the motor drive end, which were subjected to electric spark machining to generate three fault types: inner race fault, outer race fault, and rolling element fault. The experimental data encompass four different operating conditions, specifically no load with a bearing speed of 1797 r/min, load 1 with a bearing speed of 1772 r/min, load 2 with a bearing speed of 1750 r/min, and load 3 with a bearing speed of 1730 r/min. Under each condition, vibration data were collected for both the normal state and three fault states. All data were sampled at a frequency of 12 kHz, with a sampling duration of 10 s. Figure 20 illustrates the time–domain waveforms and power spectra of the vibration signals for the four states under the no-load condition, with a bearing speed of 1797 r/min.

We conducted the first extension application experiment of the MIGLCC-DLSTM intelligent diagnosis method using the collected bearing vibration data, following the same experimental process as the hydraulic plunger pump. First, the experimental data were partitioned. Each data segment consisted of 1024 sampling points. For each state type (normal state and three fault states) under the four operating conditions, 117 data segments were obtained, resulting in a total of 1872 data segments across the four conditions. The data were split into a training set and a test set at a ratio of 7:3, with 1310 data segments in the training set and 562 data segments in the test set. Next, MIGLCC features were extracted from the partitioned data. The frame length was set to 256 sampling points with a 50% overlap, a pre-emphasis coefficient of 0.97, and a Hamming window was applied. For single cepstral coefficient feature extraction, each data segment was divided into seven frames, with each cepstral coefficient order set to 12, resulting in seven feature samples, each with 12 dimensions, for each segment. MIGLCC features were extracted by concatenating and normalizing MFCC, IMFCC, GFCC, and LPCC features from the vibration data for the four state types, producing seven feature samples, each with 48 dimensions, for each segment. Finally, the diagnosis was performed. The MIGLCC feature data from the training set were input into the DLSTM network to train the MIGLCC-DLSTM diagnostic model, which was then evaluated using the MIGLCC feature data from the testing set. The average overall accuracy of 10 trials, along with the precision, recall, and F1 score under the Macro criterion, were used as evaluation metrics. The results are presented in Table 9, with the confusion matrix for one of the tests shown in Figure 21.
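The dataset sizes quoted for the CWRU experiment follow from the sampling settings. The sketch below reproduces them, assuming that each record is cut into non-overlapping 1024-point segments with the incomplete tail discarded and that the training count is rounded down. Both are assumptions about details the text does not spell out, and the function names are hypothetical.

```python
def segments_per_record(fs_hz, seconds, segment_length=1024):
    """Number of complete non-overlapping segments in one record."""
    return (fs_hz * seconds) // segment_length


def train_test_counts(total, train_parts=7, test_parts=3):
    """Split a segment count by ratio, rounding the training share down."""
    n_train = total * train_parts // (train_parts + test_parts)
    return n_train, total - n_train


per_record = segments_per_record(12_000, 10)   # 12 kHz for 10 s
total_segments = per_record * 4 * 4            # 4 states x 4 operating conditions
n_train, n_test = train_test_counts(total_segments)

print(per_record, total_segments, n_train, n_test)
```

This gives 117 segments per record, 1872 segments in total, and a 1310/562 split, in agreement with the figures stated above.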

In the confusion matrix, 0 represents normal, 1 represents inner race fault, 2 represents outer race fault, and 3 represents rolling element fault. The results demonstrate that the proposed method accurately classified normal, inner race faults, and rolling element faults, with only a few outer race fault samples misclassified as inner race faults. However, as shown in Table 9, the model’s overall classification accuracy still reached 99.64%. This confirms the feasibility and applicability of the MIGLCC-DLSTM intelligent diagnosis method for bearing vibration signal diagnosis.

To visually observe the feasibility of the MIGLCC-DLSTM intelligent diagnosis method, we performed t-SNE feature visualization on the CWRU bearing diagnosis data before and after diagnosis, as shown in Figure 22. The original data exhibited a relatively uniform distribution across different state types. After applying the MIGLCC-DLSTM intelligent diagnosis method, features of the same type clustered together distinctly, with a clear separation between features of different types. This demonstrates the method’s effectiveness in distinguishing the four different state types of the bearings.

4.2. Servo Motor Dataset

In this experiment, we selected the high-pressure servo motor system test data of a steam turbine provided by a manufacturer for algorithm validation [56]. The test bench is shown in Figure 23, and the system schematic is illustrated in Figure 24. The experiment induced five fault types by artificially damaging normal components or replacing them with faulty ones: servo valve internal leakage, spring breakage, quick-closing solenoid valve throttling orifice blockage, internal oil leakage, and external oil leakage. Oil pressure data were collected from the M4 measurement point in the high-pressure chamber under normal conditions and the five fault conditions while the servo motor was operating, with a sampling frequency of 12.8 kHz and a duration of 20 s. The time–domain waveforms and low-frequency power spectra of the pressure signals for the six conditions are depicted in Figure 25.

We conducted the second extension application experiment of the MIGLCC-DLSTM intelligent diagnosis method using the collected pressure data from the servo motor. The diagnostic procedure followed the same process as described for the hydraulic plunger pump. First, the experimental data were partitioned. Each data segment consisted of 1024 sampling points, resulting in 250 data segments for each state type and a total of 1500 data segments across the six states. These were divided into training and testing sets in a 7:3 ratio, with 1050 segments in the training set and 450 segments in the testing set. Next, MIGLCC features were extracted from the partitioned data. The frame length was set to 256 sampling points with a 50% overlap and a pre-emphasis coefficient of 0.97, and a Hamming window was applied. For single cepstral coefficient feature extraction, each data segment was divided into seven frames, with each cepstral coefficient order set to 12, resulting in seven feature samples, each with 12 dimensions, for each segment. MIGLCC features were extracted by concatenating and normalizing MFCC, IMFCC, GFCC, and LPCC features from the pressure data for the six state types, producing seven feature samples, each with 48 dimensions, for each segment. Finally, the diagnosis was performed. The MIGLCC feature data from the training set were input into the DLSTM network to train the MIGLCC-DLSTM diagnostic model, which was then evaluated using the MIGLCC feature data from the testing set. The average overall accuracy of 10 trials, along with the precision, recall, and F1 score under the Macro criterion, were used as evaluation metrics. The results are presented in Table 10, with the confusion matrix for one of the tests shown in Figure 26.

In the confusion matrix, 0 represents normal, 1 represents servo valve internal leakage, 2 represents spring breakage, 3 represents quick-closing solenoid valve throttling orifice blockage, 4 represents internal oil leakage, and 5 represents external oil leakage. The results indicate that the proposed method accurately classified spring breakage, quick-closing solenoid valve throttling orifice blockage, and external oil leakage. However, there were some misclassifications: a few normal samples were incorrectly classified as quick-closing solenoid valve throttling orifice blockage and internal oil leakage, some servo valve internal leakage samples were misclassified as internal oil leakage, and some internal oil leakage samples were misclassified as normal. These misclassifications suggest that the data distributions in the feature space are relatively close, with subtle differences making accurate distinction challenging. Nevertheless, as shown in Table 10, the model achieved an overall classification accuracy of 98.07%, demonstrating the feasibility and applicability of the MIGLCC-DLSTM intelligent diagnosis method for diagnosing servo motor pressure signals.

To visually reveal the classification performance of the MIGLCC-DLSTM intelligent diagnosis method, t-SNE feature visualization was performed on the servo motor diagnosis data before and after diagnosis, as shown in Figure 27. It can be observed that after processing with the MIGLCC-DLSTM intelligent diagnosis method, feature data of the same type formed clusters, and there was a certain distance between features of different types. The feature distances between servo valve internal leakage, spring breakage, and external oil leakage were relatively large. This demonstrates the feasibility of the method.

5. Conclusions

This paper proposes an intelligent diagnostic method for hydraulic plunger pumps based on sound signals, utilizing the MIGLCC-DLSTM model for analyzing the collected sound data. Through cepstral analysis of the sound signals, four distinct features (MFCC, IMFCC, GFCC, and LPCC) were extracted and fused into a novel mixed cepstral feature, MIGLCC, which was then fed into the DLSTM network for diagnosis. The results demonstrate the following:

The MIGLCC feature effectively integrates the strengths of the individual cepstral features, combining low- and high-frequency resolution, noise resilience, and resonance peak characteristics. The method achieved an overall diagnostic accuracy of 99.41%, surpassing the single features as well as the dual- and triple-feature fusion methods, which demonstrates its ability to represent the complexities and nuances of hydraulic plunger pump sound signals. In comparative analyses with SVM, 1D-CNN, and RNN, MIGLCC-DLSTM achieved the highest accuracy, with a total of 223,748 parameters and a running time of 22.53 s, showing a favorable balance between model complexity and computational efficiency.

Furthermore, under various operational conditions, ranging from 2 MPa to 15 MPa, the method maintained a high diagnostic accuracy of 98.63% to 99.41%, underscoring its robust fault detection capabilities and remarkable generalization across different scenarios. By virtue of its outstanding feature extraction capabilities, high accuracy, and exceptional operational efficiency, the MIGLCC-DLSTM intelligent diagnostic method presents a highly effective and reliable solution for hydraulic plunger pump fault diagnosis, offering vast potential for industrial applications. Additionally, when applied to other monitored systems, such as bearings and servo motors, this method continued to deliver excellent diagnostic performance, further affirming its versatility and broad applicability.

However, to enhance the accuracy and practicality of our diagnostic model, future research will focus on integrating physical characteristics of hydraulic systems with data-driven deep learning approaches, aiming to develop hybrid models that reflect the system’s internal mechanisms. This would improve both the interpretability and robustness of fault diagnosis, ensuring reliability in detecting novel fault types. Additionally, more sophisticated feature fusion techniques—such as attention mechanisms and multi-level fusion—will be explored to better combine acoustic features, further boosting diagnostic performance. Finally, a deeper investigation into the acoustic feature variations of hydraulic pumps under varying operational conditions and their underlying physical mechanisms will refine the feature extraction process, improving both the accuracy and timeliness of fault detection. These efforts will contribute to more precise, generalizable, and efficient fault diagnosis systems in industrial applications.

Author Contributions

L.M.: Conceptualization, Data curation, Methodology, Investigation, Formal analysis, Software, Visualization, Writing—original draft. A.J.: Resources, Validation, Writing—review and editing. W.J.: Funding acquisition, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (Grant No. 52275067) and the Province Natural Science Foundation of Hebei, China (Grant No. E2023203030).

Data Availability Statement

Data are available on request from the corresponding author.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Nomenclature

The following abbreviations are used in this manuscript:

MFCC	Mel Frequency Cepstral Coefficient
IMFCC	Inverse Mel Frequency Cepstral Coefficient
GFCC	Gammatone Frequency Cepstral Coefficient
LPCC	Linear Prediction Cepstral Coefficient
MICC	MFCC and IMFCC
MIGCC	MFCC and IMFCC and GFCC
MILCC	MFCC and IMFCC and LPCC
MIGLCC	MFCC and IMFCC and GFCC and LPCC
FFT	Fast Fourier Transform
SVM	Support Vector Machine
1D-CNN	1D Convolutional Neural Network
RNN	Recurrent Neural Network
LSTM	Long Short-Term Memory
DLSTM	Double-Layer Long Short-Term Memory
t-SNE	t-distributed Stochastic Neighbor Embedding
FC	Fully Connected Layer

Figure 1. Cepstral feature extraction flowchart.

Figure 2. Distribution of Mel filter bank.

Figure 3. Distribution of inverse Mel filter bank.

Figure 4. Distribution of Gammatone filter bank.

Figure 5. LSTM network architecture.

Figure 6. DLSTM network schematic.

Figure 7. Flow chart of intelligent diagnosis method of hydraulic plunger pump based on sound signals.

Figure 8. Hydraulic plunger pump fault simulation test bench.

Figure 9. Hydraulic plunger pump experimental setup diagram. 1—Oil tank; 2, 24—filter; 3—vane pump; 4, 25—gate valve; 5, 13—flow meter; 6, 15—pressure gauge switch; 7, 16—pressure gauge; 8, 18—relief valve; 9—hydraulic plunger pump; 10—accelerometer; 11—sound level meter; 12—check valve; 14—pressure sensor; 17, 22—accumulator; 19—solenoid valve; 20—electro-hydraulic servo valve; 21—hydraulic cylinder; 23—check throttle valve.

Figure 10. Physical images of faulty components in hydraulic plunger pump: (a) swash plate wear; (b) slipper wear; and (c) loose slipper.

Figure 11. Time–domain waveform and power spectrum of hydraulic plunger pump sound signals: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.

Figure 12. Four types of cepstral features in different states: (a) normal; (b) swash plate wear; (c) slipper wear; and (d) loose slipper.

Figure 13. Average classification accuracy of ten trials.

Figure 14. Confusion matrices for different features: (a) MFCC; (b) IMFCC; (c) MICC; (d) MIGCC; (e) MILCC; and (f) MIGLCC.

Figure 15. Performance comparison of different diagnostic methods.

Figure 16. Principles or network structures of various methods: (a) SVM; (b) 1D-CNN; and (c) RNN.

Figure 17. Performance comparison of LSTM networks with different layer numbers.

Figure 18. t-SNE feature visualization: (a) original data; (b) MIGLCC features; (c) LSTM1 layer; (d) LSTM2 layer; (e) FC layer.

Figure 19. CWRU bearing fault test bench.

Figure 20. Time–domain waveform and power spectrum of CWRU bearing vibration signals: (a) normal; (b) inner race fault; (c) outer race fault; and (d) rolling element fault.

Figure 21. Confusion matrix of CWRU bearing data diagnosis results.

Figure 22. t-SNE feature visualization before and after CWRU bearing diagnosis: (a) original data; (b) data after MIGLCC-DLSTM classification.

Figure 23. Servo motor fault test bench.

Figure 24. Servo motor test system schematic.

Figure 25. Time–domain waveform and power spectrum of servo motor pressure signals: (a) normal; (b) servo valve internal leakage; (c) spring breakage; (d) quick-closing solenoid valve throttling orifice blockage; (e) internal oil leakage; and (f) external oil leakage.

Figure 26. Confusion matrix of servo motor data diagnosis results.

Figure 27. t-SNE feature visualization before and after servo motor diagnosis: (a) original data; (b) data after MIGLCC-DLSTM classification.

Table 1. Key components and parameters of the hydraulic plunger pump.

Name	Model	Parameters
Hydraulic Plunger Pump	MCY14-1B	Number of plungers: 7, Theoretical displacement: 10 mL/r, Rated working pressure: 31.5 MPa
Drive Motor	Y132M4	Rated power: 7.5 kW, Rated speed: 1480 rpm
Data Acquisition Card	NI-USB-6221	Maximum sampling rate: 250 kS/s
Accelerometer	YD72D	Frequency range: 1 Hz–18 kHz
Pressure Sensor	SYB-351	Measurement range: 0–25 MPa, Power supply voltage: DC 24 V, Output range: 0–5 V
Sound Level Meter	AWA5661	Sensitivity: 40 mV/Pa, Frequency range: 10 Hz–16 kHz

Table 2. Fault types and injection methods of the hydraulic plunger pump.

Label	State Type	Fault Injection Method
0	Normal	-
1	Swash Plate Wear	Manually induce wear on the swash plate
2	Slipper Wear	Round the corners of the slippers
3	Loose Slipper	Utilize faulty loose slipper components

Table 3. Different proportions of data partitioning in the same dataset.

Dataset	Proportion of Division (Training:Testing)	Training Set	Testing Set
Dataset 1	7:3	271	117
Dataset 2	5:5	194	194
Dataset 3	2:8	77	311
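
The sample counts in Table 3 can be reproduced from the total of 388 samples (271 + 117) and the three division ratios. The sketch below assumes the training share is rounded down and the remainder goes to the testing set; the source does not state the rounding rule, and the helper name `split_counts` is hypothetical.

```python
def split_counts(total, train_parts, test_parts):
    """Return (train, test) sizes for a train:test ratio, rounding the training share down."""
    train = total * train_parts // (train_parts + test_parts)
    return train, total - train


TOTAL_SAMPLES = 388  # 271 + 117, taken from the first row of Table 3

for name, ratio in [("Dataset 1", (7, 3)), ("Dataset 2", (5, 5)), ("Dataset 3", (2, 8))]:
    print(name, split_counts(TOTAL_SAMPLES, *ratio))
```

Under this rounding assumption the three calls give the training and testing sizes listed in the table.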

Table 4. The parameter configuration of the LSTM network.

Number	Network Parameter	Value
1	number of network layers	2
2	number of hidden units	128
3	learning rate	0.001
4	dropout rate	0.2
5	batch size	32
6	number of epochs	30

Table 5. Input and output dimensions of each layer in the DLSTM network.

Number	Network Layer	Input Dimension	Output Dimension
1	Input	16 × 7 × 48	16 × 7 × 48
2	LSTM1	16 × 7 × 48	16 × 7 × 64
3	Dropout	16 × 7 × 64	16 × 7 × 64
4	LSTM2	16 × 7 × 64	16 × 7 × 64
5	FC	16 × 64	16 × 4
6	Output	16 × 4	16 × 4
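
The dimension flow in Table 5 can be traced with a few lines of shape arithmetic. The sketch below follows the values in Table 5 (batch of 16, 7 time steps, 48 input features, 64 hidden units, 4 classes); note that Table 4 lists 128 hidden units and a batch size of 32, so these defaults are parameters rather than fixed facts. Reading the FC input of 16 × 64 as the LSTM2 output at the final time step is an assumption, and the function name `dlstm_shapes` is hypothetical.

```python
def dlstm_shapes(batch=16, steps=7, features=48, hidden=64, classes=4):
    """Return (layer, input_shape, output_shape) rows for a two-layer LSTM classifier."""
    seq_in = (batch, steps, features)
    seq_hidden = (batch, steps, hidden)
    last_step = (batch, hidden)  # assumed: only the final time step feeds the FC layer
    logits = (batch, classes)
    return [
        ("Input", seq_in, seq_in),
        ("LSTM1", seq_in, seq_hidden),
        ("Dropout", seq_hidden, seq_hidden),
        ("LSTM2", seq_hidden, seq_hidden),
        ("FC", last_step, logits),
        ("Output", logits, logits),
    ]


for layer, shape_in, shape_out in dlstm_shapes():
    print(f"{layer:8s} {shape_in} -> {shape_out}")
```

Calling `dlstm_shapes(batch=32, hidden=128)` would trace the same layers with the values from Table 4 instead.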

Table 6. Performance comparison of diagnostic models with different data partition ratios.

Dataset	Overall Accuracy	Precision	Recall	F1 Score
Dataset 1	99.41%	99.39%	99.43%	0.9940
Dataset 2	99.15%	99.00%	99.26%	0.9912
Dataset 3	98.88%	98.98%	99.05%	0.9887
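
The four metrics in Table 6 can all be derived from a confusion matrix. The sketch below assumes macro averaging, that is, precision, recall, and F1 are computed per class and then averaged; the source does not state the averaging rule. The confusion matrix is a made-up four-class example, not the paper's data, and `macro_metrics` is a hypothetical helper.

```python
from statistics import mean


def macro_metrics(cm):
    """Return accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] counts samples of true class i that were predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return accuracy, mean(precisions), mean(recalls), mean(f1s)


# Hypothetical confusion matrix for labels 0-3 (illustration only).
example_cm = [
    [30, 0, 0, 0],
    [0, 28, 1, 0],
    [0, 1, 29, 0],
    [0, 0, 0, 28],
]
acc, prec, rec, f1 = macro_metrics(example_cm)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

With macro averaging, the F1 score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.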

Table 7. Performance comparison of diagnostic models with different features.

Type	Features	Overall Accuracy	Precision	Recall	F1 Score
Single feature	MFCC	90.26%	90.17%	90.92%	0.8938
Single feature	IMFCC	89.49%	88.03%	89.62%	0.8824
Single feature	GFCC	88.37%	87.72%	88.94%	0.8699
Single feature	LPCC	89.14%	89.03%	89.36%	0.8817
Single feature	Average	89.32%	88.74%	89.71%	0.8820
Dual-feature fusion	MICC	92.56%	91.56%	92.96%	0.9169
Dual-feature fusion	MGCC	92.31%	92.90%	92.75%	0.9143
Dual-feature fusion	MLCC	92.39%	92.66%	92.95%	0.9213
Dual-feature fusion	IGCC	91.45%	90.73%	91.69%	0.9040
Dual-feature fusion	ILCC	90.35%	90.43%	90.51%	0.8973
Dual-feature fusion	GLCC	91.45%	91.35%	92.04%	0.9073
Dual-feature fusion	Average	91.75%	91.60%	92.15%	0.9102
Triple-feature fusion	MIGCC	94.62%	93.18%	94.98%	0.9371
Triple-feature fusion	MILCC	93.85%	94.61%	94.00%	0.9353
Triple-feature fusion	MGLCC	93.59%	93.69%	93.80%	0.9338
Triple-feature fusion	IGLCC	92.39%	92.64%	92.27%	0.9180
Triple-feature fusion	Average	93.61%	93.53%	93.76%	0.9311
Four-feature fusion	MIGLCC	99.41%	99.39%	99.43%	0.9940
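
The "average" rows in Table 7 can be checked directly from the individual accuracy values. The sketch below recomputes the group averages of overall accuracy and the gap to the four-feature MIGLCC result. It uses decimal arithmetic with half-up rounding to two places, which is an assumption about how the table values were rounded; the names `ACCURACY` and `group_average` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Overall accuracy values (%) copied from Table 7, grouped by number of fused features.
ACCURACY = {
    "single": ["90.26", "89.49", "88.37", "89.14"],
    "dual": ["92.56", "92.31", "92.39", "91.45", "90.35", "91.45"],
    "triple": ["94.62", "93.85", "93.59", "92.39"],
}
MIGLCC_ACCURACY = Decimal("99.41")


def group_average(values):
    """Average percentage strings exactly and round half-up to two decimals."""
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for group, values in ACCURACY.items():
    avg = group_average(values)
    print(f"{group}: average {avg}%, MIGLCC gain {MIGLCC_ACCURACY - avg} points")
```

Decimal arithmetic is used here because a binary floating-point average such as 89.315 may round to either neighbor.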

Table 8. The performance analysis of the diagnostic model under multiple operating conditions.

Pressure	Overall Accuracy	Precision	Recall	F1 Score
2 MPa	98.89%	98.78%	98.95%	0.9885
5 MPa	99.41%	99.39%	99.43%	0.9940
8 MPa	99.40%	99.38%	99.42%	0.9939
10 MPa	98.63%	99.02%	98.39%	0.9841
15 MPa	98.97%	99.06%	99.12%	0.9895

Table 9. The results of the CWRU bearing experiment.

Method	Overall Accuracy	Precision	Recall	F1 Score
MIGLCC-DLSTM	99.64%	99.65%	99.64%	0.9964

Table 10. The results of the servo motor experiment.

Method	Overall Accuracy	Precision	Recall	F1 Score
MIGLCC-DLSTM	98.07%	98.43%	98.09%	0.9805

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ma, L.; Jiang, A.; Jiang, W. The Intelligent Diagnosis of a Hydraulic Plunger Pump Based on the MIGLCC-DLSTM Method Using Sound Signals. Machines 2024, 12, 869. https://doi.org/10.3390/machines12120869
