Essential Steps in Prognostic Health Management


Sreerupa Das, Richard Hall, Stefan Herzog, Gregory Harrison, Michael Bodkin
Lockheed Martin, Global Training and Logistics
100 Global Innovation Circle, Orlando, FL 32825

978-1-4244-9827-7/11/$26.00 ©2011 IEEE
Authorized licensed use limited to: PAKISTAN INST OF ENGINEERING AND APPLIED SCIENCES. Downloaded on March 14,2022 at 20:52:50 UTC from IEEE Xplore. Restrictions apply.

Abstract: Prognostic health management (PHM) systems are designed to predict
impending faults and to determine the remaining useful life of machinery. An
efficient prognostic system can speed up fault diagnosis by indicating which
parts of the machinery or vehicle are most likely to fail and will need
maintenance in the near future. In this paper, we discuss the essential steps
involved in building an effective PHM system. We describe time and frequency
domain features that can be extracted from raw sensor data. These features, or
condition indicators, help summarize the information in raw data and extract
critical clues that reflect the health of the machinery. Analytical models can
then be used to learn the essential health indicators and how they relate to
fault conditions. In addition, we describe a case study of implementing a PHM
system for a high speed face milling CNC cutter. We describe features that
were analyzed from sensor data. For the analytical engine, we used a neural
network model to learn the association between the extracted features and the
magnitude of wear in the cutter. The neural network was able to determine the
remaining useful life of cutters, in terms of the number of remaining cuts for
a given wear limit, based on extracted features.

Keywords: Prognostic Health Management; Condition Indicators; Neural Network;
CNC Milling cutters

I. INTRODUCTION

Prognostic Health Management (PHM) can improve equipment availability, reduce
maintenance costs and support a better ability to plan for maintenance events.
By knowing what components are in need of maintenance, and the criticality of
the impending failure, maintenance actions can be planned in advance, while
limiting the amount of maintenance to what is required, based upon the
condition of the equipment. PHM replaces time-duration-based maintenance, in
which components undergo maintenance activities based on a predefined
schedule. If a maintenance action is performed prematurely, it raises ongoing
operating costs. By knowing what components are in need of maintenance and how
soon that is required, maintenance actions can be planned in advance. The goal
of PHM systems is to be able to detect impending failure early enough to take
remedial actions in a timely manner.

PHM relies on sensor and analysis capabilities for monitoring the condition of
the machinery or vehicle components to gauge component health, to detect
deteriorating conditions and to help plan for the maintenance activities.
Essential steps for implementing a PHM system are discussed in this paper.
Further, as a case study, we describe the implementation of a PHM system for
determining the remaining useful life of cutters in a high speed milling
machine.

II. GENERAL APPROACH

PHM analysis involves a variety of steps including collection of raw data from
sensors, data characterization, digital signal processing, extraction of
condition indicators, and finally the intelligent processing engine for
performing diagnosis and prognosis. Figure 1 delineates the essential steps.
These steps are described in detail below.

Figure 1. Essential Steps for a PHM System

A. Data collection from Sensors
Sensors are an essential part of the PHM system. They record the state of the
machinery being monitored in the form of raw data. Sensors are mounted
strategically on the machinery. They are conditioned to provide digital values
corresponding to the analog states of the signals they are monitoring. For
example, accelerometers are commonly used to monitor vibration. Dynamometers
are used to measure force, moment of force or power. The issues here are the
characteristics of the sensors such as the sensitivity of the detected signal,
their location in
the system, and whether these sensors should be wired or be wireless. Sensors
are to be selected depending on the range of the parameter to be measured.

B. Denoising & Data Characterization
Denoising and data characterization are important steps in the PHM analysis.
Raw data collected from sensors are rarely noise-free. Denoising techniques
must be used to reduce the noise content in the signal. Analog and digital
filters can be designed that help attenuate the effect of high frequency noise
in the measurements. Basic sanity checks (e.g., a check on the magnitude of
values) and data validation methods must be used to ensure the data collected
is not grossly faulty. These include checking whether the measured data and
the rate at which the data is changing are within predefined operational
limits. Also, if there are sensors that are sampling data at different
frequencies, they might need to be correlated by up-sampling or down-sampling
them appropriately. To summarize, functionalities provided by this step are as
follows:

• Noise reduction: eliminate noise from the signal using a wide range of
  statistical signal processing algorithms. Band pass filters are also used
  where appropriate.
• Data validation: basic sanity check, handle missing data, handle abnormal
  data values.
• Data normalization: scale data to the range [0, 1].
• Data correlation: align data from sensors sampled at different rates.

C. Feature Extraction
Sensor data must be processed to extract features or condition indicators that
reflect the health of the machinery being monitored. Condition indicators or
features embody key information that is obtained by processing the raw data.
Tracking relevant condition indicators over time gives us a good indication of
fault progression in machinery. This helps us to prepare for an impending
fault. Some of the useful time domain features or condition indicators are
listed below:

1) Mean – Average value of a time varying signal.
2) Standard Deviation – Measures how much the data points are dispersed from
the 'average'. Standard deviation σ (sigma) is the square root of the average
value of (X − μ)².
3) Root Mean Square – The Root Mean Square (RMS) value for a vibration signal
reflects the energy content of the signal. It can be expressed as:

$$s_{rms} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} s_i^{2}}$$

where
s_rms is the root mean square value of dataset s,
s_i is the i-th member of dataset s,
N is the number of points in dataset s.

4) Delta RMS – This parameter is the difference between two consecutive RMS
values. This parameter focuses on the trend of the vibration and is sensitive
to vibration signal changes.
5) Peak Value – This is the maximum value of the signal in a selected time
frame.
6) Crest Factor – This parameter indicates damage at an early stage. It is
defined as the peak value of the signal divided by the RMS value of the
signal.
7) Kurtosis – Kurtosis describes how peaked or flat the distribution is. It is
given by:

$$Kurt = \frac{\frac{1}{N} \sum_{i=1}^{N} (s_i - \bar{s})^{4}}{\left[\frac{1}{N} \sum_{i=1}^{N} (s_i - \bar{s})^{2}\right]^{2}}$$

where
Kurt is kurtosis,
N is the number of points in the time history of signal s,
s_i is the i-th point in the time history of signal s,
\bar{s} is the mean of signal s.

Thus kurtosis is the fourth centralized moment of the signal, normalized by
the square of the variance.

Apart from the time domain features described above, features can be extracted
from the frequency domain, order domain or joint time-frequency domain.
Several advanced signal processing techniques have been explored in the
literature. Some of them are listed below:

1) Band Stop/Pass filter – This technique is used to attenuate/accentuate a
known frequency range in a signal.
2) Spectral Density, Power Spectral Density – The spectral density captures
the frequency content of a signal and helps identify periodicities in the
signal. The fast Fourier transform (FFT) forms the basis for analyzing a time
domain signal in the frequency domain.
3) Cepstrum Analysis – This technique is useful in detecting changes in
sideband patterns and detecting periodic structure in the spectrum. It is
defined as the inverse Fourier transform of the logarithmic spectrum of the
regular Fourier transform of the time signal.
4) Time-Frequency Analysis – Time-frequency analysis for non-stationary
signals is gaining popularity. The Short Time Fourier Transform with a sliding
window, the Wigner-Ville Distribution (the Fourier transform, with respect to
the delay, of the signal's instantaneous autocorrelation) and the Gabor
Transform are some of the commonly used techniques for time-frequency
analysis.
5) Time Synchronous Averaging & Order Tracking – Time Synchronous Averaging
(TSA) is performed by averaging together a series of signal segments each
corresponding to one period of a synchronising signal. Before TSA can be
performed, the signal must be order tracked to give an integer number of
samples per revolution and a defined start point
(with help of tachometer readings). Order tracking minimizes errors introduced
by fluctuations in the sampling frequency.
6) Hilbert Transform – A real function f(t) and its Hilbert transform h(t)
together form an analytic signal. An analytic signal is one that gives us a
one-sided spectrum in the frequency domain [1]. One application of the Hilbert
transform is that the magnitude of the analytic signal is the envelope of the
original signal f(t). In addition, the magnitude of the FFT of the analytic
signal is doubled at positive frequencies and can be shown on a logarithmic
scale, enabling a large display range.
7) Wavelet Analysis – Wavelets are effectively the impulse response functions
of a series of filters applied to the signal to extract features of the same
order (scale) as the specific wavelet. By choosing wavelets similar to sought
features in the signal, wavelets can be used for compression and extraction of
salient features.

D. Building a model
Once key condition indicators are extracted, the next step is to build a model
to interpret the information in the features and correlate them to the
behavior of the machinery. Any such analytical model will have to make
simplifying assumptions about reality. Nevertheless, such models are important
tools to summarize patterns from underlying data and are used to make the best
possible predictions for new situations.

Machine Learning and Statistics provide numerous algorithms that allow
computers to evolve behaviors based on empirical sensor data. These algorithms
take advantage of examples (training data) to capture the unknown underlying
probability distribution. A major focus of machine learning is to
automatically learn to recognize complex patterns and make intelligent
decisions based on data. However, the challenge lies in the fact that the set
of all possible observations (sensor values) and corresponding behaviors is
too large to be covered by the set of observed training data. Hence the model
must generalize from the given examples, so as to be able to generate the best
guess on new cases. Some of the commonly used machine/statistical learning
approaches include:

• Decision Tree learning
• Association rule learning
• Neural Networks
• Genetic Programming
• Logic programming
• Support Vector Machines
• Clustering
• Bayesian networks
• Reinforcement learning

Other statistical models such as the Regression Model, Gaussian Mixture Model
and Hidden Markov Model also have the same underlying goal: that of generating
the most likely outcome for a given observation (sensor values).

E. Perform PHM Analysis
One or more models could be generated to train on the task and provide
prognostics. Since most learning systems start with an unbiased and usually
random configuration using Monte Carlo methods (e.g., starting with random
weights in a neural network), the solutions they develop are unique, even
though they are trained on the same training data. Also, since training data
is usually limited, the way each model learns to generalize from the limited
data is distinct and could be a valuable piece of information. Hence it is
advantageous to train a group of models on a specific task and evaluate their
response on new data (real situation) to generate the final outcome of the PHM
system. Various techniques have been researched to pick the best model or make
the best prediction given the output of a group of models, trained on a
limited set of data. Some of these techniques include:

1) Ensemble Learning – Such methods use multiple models to obtain better
predictive performance than could be obtained from any of the constituent
models.
2) Random Forest – An ensemble classifier that consists of many decision trees
and outputs the class that is the mode of the classes output by the individual
trees [2].
3) Cross Validation – Cross-validation is a way to predict the performance of
a model on a validation set using computation in place of mathematical
analysis. This technique is often used to determine the best performing model
in a group of models.
4) Voting – Given a class of learned models, voting or majority response could
be used to determine the response of the overall PHM system.

III. CASE STUDY: PHM FOR MILLING MACHINE
In this case study we present the techniques used in implementing a PHM system
for a high speed face CNC (Computer Numerical Control) milling cutter. The
cutting process involves discontinuous and varying loads on the flutes of the
cutter as they engage and disengage with the cutting surface, and results in
wear over time. The flute wear phenomenon is complex and is a function of
setup, type of cutter used and workpiece materials being processed. With flute
wear progression, more force or power is required to achieve the same amount
of cut, i.e., material removed from a workpiece. In addition, as flutes wear,
changes in sound emitted from a cutting operation become distinct. The most
undesirable effect of flute wear is that it results in growing imperfections
in the cutting surface finish which are often unacceptable, especially while
milling fine instruments. Degradation of the milled surface from worn cutters
leads to rework or scrapping the workpiece. Usually cutters have a distinctive
wear pattern. There is a break-in period with a steep wear rate in new
cutters. Following the break-in period, wear slows down significantly to a
small uniform rate, which is also called the steady-state wear region. Finally
there is acceleration in the wear rate as the cutter approaches its end of
life. Although the general trend of the wear of a cutter may be known, each
cutter behaves differently, possibly due to imperfections in the composition
and geometry. Hence it is important to be able to
estimate the remaining useful life of a cutter during its operation based on
its current conditions. In order to determine a cutter's health condition,
sensors can be placed on the cutter to measure the vibration and force exerted
by the cutter on the workpiece along the three dimensions. Also, acoustic
emission data can be utilized to reflect on the cutter's health.

In the rest of the paper, we discuss steps taken to predict the life of the
cutter. A variation of a back propagation neural network model was used for
learning the association of the features with fault conditions. The neural
network was used to determine the remaining useful life of a cutter in terms
of the number of remaining cuts for a given wear limit based on extracted
features.

IV. CASE STUDY: CNC MILLING CUTTER

A. Problem Definition

1) Background

Figure 2. Schematic diagram of a cutter

Figure 2 shows a schematic diagram of a high speed face milling cutter. In a
face milling task, it is very important that the cutter edge remains sharp; a
dull edge could otherwise result in a deteriorated or unusable milled surface.
The task here is to determine the remaining useful life of the cutter based on
externally measured conditions (e.g., vibrations). Although this problem
domain has been explored in the past [3-5], the research presented here is yet
another approach to the problem. More importantly, the analysis described here
is based on very limited data (provided as part of a contest), with no control
over the apparatus, sensors or the data that was collected. Such is often the
case in real life situations where available data is very restricted.

There were six individual 3-flute cutters (C1, C2, C3, C4, C5 and C6). Each
cutter made 315 cuts over an identical workpiece for a face milling job. The
spindle speed of the cutter was 10400 RPM; feed rate was 1555 mm/min; Y depth
of cut (radial) was 0.125 mm; Z depth of cut (axial) was 0.2 mm. For each of
the 315 cuts made by a cutter, dynamometer, accelerometer and acoustic
emission data was collected. The data was collected at 50,000 Hz/channel. The
data acquisition files (total of 6 sets of 315 files) contained seven columns,
corresponding to:

Column 1: Force (N) in X dimension
Column 2: Force (N) in Y dimension
Column 3: Force (N) in Z dimension
Column 4: Vibration (g) in X dimension
Column 5: Vibration (g) in Y dimension
Column 6: Vibration (g) in Z dimension
Column 7: AE-RMS (V)

Each cut data file contains more than 200,000 records, corresponding to a
duration of more than 4 seconds required to make one cut.

In addition, the wear patterns of cutters C1, C4 and C6 are provided. The wear
data consisted of the wear on each of the three flutes for the three cutters
after each cut (in 10^-3 mm) for about 300 cuts.

2) Task and Evaluation Methodology
The tool wear phenomenon is complex and varies with the setup and the
materials processed. Workpiece surface finish is degraded by worn cutters,
leading to rework or scrapping the workpiece. The task was to estimate the
maximum number of cuts one could "safely" make for an unspecified wear limit.
This implied that the maximum wear of any flute should not exceed the wear
limit (not the average wear across the flutes). E.g., if the wear pattern of
three flutes is as in Figure 3, then Figure 4 shows the maximum wear of all
the three flutes.

Figure 3. A sample wear pattern on three flutes of a cutter

Figure 4. Maximum wear on all the three flutes
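
The relationship between per-flute wear, maximum wear and "safe" cuts can be illustrated with a minimal Python sketch. The function names and the wear readings below are hypothetical and are not taken from the challenge data; the sketch only assumes, as stated above, that a cut is safe while the worst flute has not exceeded the wear limit.

```python
from typing import List, Sequence


def max_flute_wear(flute_wear: Sequence[Sequence[float]]) -> List[float]:
    """Per-cut maximum wear across flutes.

    flute_wear[k][c] is the wear of flute k after cut c + 1 (in 10^-3 mm).
    """
    return [max(per_cut) for per_cut in zip(*flute_wear)]


def max_safe_cuts(max_wear: Sequence[float], wear_limit: float) -> int:
    """Number of cuts made before the worst flute first exceeds wear_limit."""
    safe = 0
    for wear in max_wear:
        if wear > wear_limit:
            break
        safe += 1
    return safe


# Hypothetical wear readings for three flutes over five cuts.
flutes = [
    [30.0, 45.0, 60.0, 72.0, 90.0],
    [28.0, 50.0, 66.0, 70.0, 85.0],
    [35.0, 40.0, 58.0, 75.0, 88.0],
]
worst = max_flute_wear(flutes)
print(worst)                     # [35.0, 50.0, 66.0, 75.0, 90.0]
print(max_safe_cuts(worst, 66))  # 3
```

Using the maximum rather than the average is the conservative choice: a single badly worn flute is enough to degrade the surface finish.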

The task here is to make an estimate of the maximum safe cuts at integer
values of wear over the range 66 to 165 (10^-3 mm) as shown in Figure 5.

Figure 5. Maximum safe cuts for integer values of wear

The scoring function used for evaluating a solution was defined as follows:

$$d_i = \text{estimated\_cuts}_i - \text{actual\_cuts}_i \quad (\text{for a wear limit } i)$$

$$\text{Score}_i = \begin{cases} e^{-d_i/10} - 1, & d_i < 0 \\ e^{d_i/4.5} - 1, & d_i \ge 0 \end{cases}$$

$$\text{Total Score} = \sum_{i=66}^{165} \text{Score}_i$$

The score was a measure of error in the solution. As is apparent from the
scoring function, deviation from the actual solution impacted the score
exponentially. Predicting end of life beyond the actual corresponded to
overestimation and predicting end of life before the actual was
underestimation. The scoring function penalized overestimation much more
heavily than underestimation.

V. OVERVIEW OF THE ANALYSIS
The task of estimating the maximum safe cuts at integer values over the range
66 to 165 was essentially accomplished by determining the wear curves of the
three flutes for the three test cutters (C2, C3 and C5) based on the cutters
with known wear patterns (C1, C4 and C6). What makes this task interesting is
that this is a classical PHM or even a classical machine learning task where
we are presented with a set of training data, and we are faced with the
challenge of building a model of the underlying system and have to determine
the behavior of the system on new data. Once the wear patterns of the three
flutes of the three cutters were determined, the rest of the task was
straightforward. The maximum wears for the three cutters were computed by
taking the maximum of the three flute wears for each cutter as in Figure 3.
Finally, in order to determine wears at integer values over the range 66 to
165 (in 10^-3 mm), the maximum wear curves were interpolated for the integer
values between 66 and 165. We followed the steps described above and also
shown in Figure 1 to implement a PHM model for the task. The subsequent
sections provide further details on each step.

A. Noise Elimination
As the cutter engages with and disengages from the workpiece at the start and
end of every cut, we noticed a certain amount of noise or disparity compared
to the rest of the records while cutting was in progress. This noise was
apparent in the time domain data. In order to eliminate variations in the end
conditions, the first few and last few records in each cut file were
eliminated. Also, since this analysis was performed on face milling cutters
and approximately 315 cuts were required to mill a face, the first few and the
last few cut files were disregarded to avoid disparity on the edges.

B. Feature Extraction for CNC Cutter
To provide the appropriate input for training the model, features were
extracted from the raw data on vibration, force and acoustic emission provided
in the challenge. We explored some of the features used by other researchers
in the literature, such as X. Li et al. [3].

1) Acoustic Emission – Milling processes (multiple-toothed rotating cutters)
involve discontinuous and varying chip loads with varying chip, cutter and
workpiece interactions [6]. Chip formation is a transient phenomenon with
changing behavior over the cutter's useful life. Cutting tool edge wear occurs
through chip fracturing, which involves the cutting tool stress in the
deformation zone exceeding the strength of the work material, as illustrated
in part in Figure 6. Changing tool geometry due to edge wear increases the
contact area in the deformation zone and in the shear plane. Acoustic emission
(AE) is transient elastic energy released in materials undergoing deformation,
fracture, or both. Additional sources of AE include:

• built up edge breaking off and releasing a burst of AE
• chip changing direction after shearing from workpiece
• chips breaking apart produce AE
• tool flank surface contact with workpiece

AE signals are attenuated by:

• tool deflection absorbing energy
• interface crossings between tool cutting edge and AE sensor (e.g. spindle
  bearings, workpiece holding fixture)
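
For reference, the challenge scoring function defined under Task and Evaluation Methodology can be sketched in Python as follows. The function names and the dictionary-based interface are assumptions made for illustration; only the two exponential branches and the wear-limit range 66 to 165 come from the text.

```python
import math
from typing import Mapping


def score_for_limit(estimated_cuts: float, actual_cuts: float) -> float:
    """Penalty for one wear limit; 0 means a perfect estimate."""
    d = estimated_cuts - actual_cuts
    if d < 0:
        # Underestimation: tool is retired early, milder penalty.
        return math.exp(-d / 10.0) - 1.0
    # Overestimation: tool is used past its limit, steeper penalty.
    return math.exp(d / 4.5) - 1.0


def total_score(estimated: Mapping[int, float],
                actual: Mapping[int, float]) -> float:
    """Sum of per-limit penalties over integer wear limits 66..165."""
    return sum(score_for_limit(estimated[i], actual[i])
               for i in range(66, 166))


# Hypothetical check: being 9 cuts late costs more than being 9 cuts early.
over = score_for_limit(109, 100)
under = score_for_limit(91, 100)
print(over > under)  # True
```

The asymmetry reflects the cost structure of the task: running a worn cutter risks scrapping the workpiece, whereas retiring it early only wastes some tool life.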

Figure 6. Chip deformation in cutting

2) Time Domain Features – The milling process is perturbed by spindle speed
variations, tool conditions, and the effect of chips. As the tool makes each
additional cut, it develops wear on its flutes. With increasing flute wear,
variations in the following time domain features were explored for the force
components along the x, y, and z dimensions:

• Root Mean Square
• Mean
• Standard Deviation
• Kurtosis
• Crest Factor

3) Frequency Domain Features – Frequency domain analysis of vibration data
often reveals interesting features not obvious in the time domain. The fast
Fourier transform was used to generate power spectra for the raw vibration
data along the x, y, and z dimensions. Frequency components for the tooth pass
frequency (3 flutes × 10400 rpm / 60 = 520 Hz) and its harmonics were visible
in the power spectra and fault progression was apparent at these frequency
bands (Figure 11). We further computed the total power around the first and
second harmonics of the tooth pass frequency (Figure 12).

4) Time-Frequency Analysis (Wavelet Analysis) – Using wavelet analysis, a
given signal's finite energy is projected on a family of frequency bands. The
wavelet decomposition algorithm extracts information from the original signal
by breaking it into a series of approximations and details distributed over
different frequency bands. For a level l wavelet and sampling frequency f_s,
the decomposed frequency bandwidths of the approximation a_l and the detail
d_l are as follows:

$$a_l: \left[0,\ \frac{f_s}{2^{l+1}}\right], \qquad d_l: \left[\frac{f_s}{2^{l+1}},\ \frac{f_s}{2^{l}}\right]$$

We used five decomposition levels with the db3 wavelet and a sampling
frequency of 50 kHz; hence the wavelet transform decomposed the acquired
signal into the following frequency bands:

a5: [0 Hz, 781.25 Hz]
d5: [781.25 Hz, 1562.5 Hz]
d4: [1562.5 Hz, 3125 Hz]
d3: [3125 Hz, 6250 Hz]
d2: [6250 Hz, 12500 Hz]
d1: [12500 Hz, 25000 Hz]

The signals d1 through d4 were weak compared to a5 and d5. Also, a5 turned out
to be quite noisy, possibly picking up low frequency noise. Hence we used d5
to extract our features and were able to remove periodic noise components
whose origin was not well understood but was possibly due to cutter
peculiarities.

Figure 7. Fault Progression in Cutter 4 is evident from the increase in power
of 1st and 2nd harmonics of the Tooth Pass Frequency (520 Hz) in the vibration
data at 100th, 200th, 250th and 300th cuts.

C. Modeling wear using selected features
We used supervised learning on a three layered neural network. The network was
trained on the task of predicting the wear patterns from derived features. A
representative neural network is shown in Figure 8.
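
Before turning to the network itself, the candidate features from Section B can be made concrete with a minimal Python sketch using only the standard library. The function names are hypothetical, taking the peak as the largest absolute sample is an assumption, and the wavelet helper only computes the nominal band edges (it does not perform a db3 decomposition).

```python
import math
from typing import List, Sequence, Tuple


def rms(s: Sequence[float]) -> float:
    """Root mean square of a signal segment."""
    return math.sqrt(sum(x * x for x in s) / len(s))


def kurtosis(s: Sequence[float]) -> float:
    """Fourth central moment divided by the squared variance."""
    n = len(s)
    mean = sum(s) / n
    m2 = sum((x - mean) ** 2 for x in s) / n
    m4 = sum((x - mean) ** 4 for x in s) / n
    return m4 / (m2 * m2)


def crest_factor(s: Sequence[float]) -> float:
    """Peak (assumed here to be the largest absolute value) over RMS."""
    return max(abs(x) for x in s) / rms(s)


def tooth_pass_frequency(spindle_rpm: float, flutes: int) -> float:
    """Tooth pass frequency in Hz for a cutter with the given flute count."""
    return flutes * spindle_rpm / 60.0


def dwt_bands(fs: float, levels: int) -> List[Tuple[str, float, float]]:
    """Nominal band edges (Hz) of a `levels`-level wavelet decomposition."""
    bands = [(f"a{levels}", 0.0, fs / 2 ** (levels + 1))]
    for level in range(levels, 0, -1):
        bands.append((f"d{level}", fs / 2 ** (level + 1), fs / 2 ** level))
    return bands


print(tooth_pass_frequency(10400, 3))  # 520.0
print(dwt_bands(50000, 5)[:2])
# [('a5', 0.0, 781.25), ('d5', 781.25, 1562.5)]
```

In practice each feature would be computed per cut file and per channel, giving one value per cut that can be tracked against wear.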

Figure 8. Multi Layered Neural Network

We experimented with several variations of the traditional backpropagation
learning algorithm [7]. Here is the basic algorithm:

    Initialize the weights in the network randomly
    Repeat for iteration = 1 to N
        For each training pattern
            o = neural network output (forward pass)
            t = target (supervised learning)
            Compute error = sum squared difference of t and o
            Compute delta weights from hidden layer to output layer
                (backward pass)
            Compute delta weights from input layer to hidden layer
                (backward pass continued)
            Update the weights in the network
    Until stopping criterion satisfied

The mechanism of weight change was as follows. For a three layered network, we
have an input layer (p), a hidden layer (h) and an output layer (o). An input
pattern was presented to the input neurons. In the forward pass, the input
layer propagated activations to the hidden layer neurons and the hidden layer
activations were propagated to the output layer. The activation y_i of a
neuron i was computed as:

$$y_i = g\left(\sum_j w_{ij}\, x_j + \theta_i\right)$$

where w_ij was the weight of the connection between neuron i and neuron j, x_j
was the value of activation of neuron j, θ_i was a bias for neuron i and g was
a transfer function. A sigmoidal transfer function was used for the hidden
layer and a linear transfer function was used for the output layer.

In order to learn the input-to-output relation, we desire that the output
activations produce the target pattern t_k whenever we present as input the
pattern p_i. For one pattern, p, it is possible to compute the error made by
the network's output layer as follows:

$$E_p = \frac{1}{2} \sum_k (t_k - o_k)^2$$

In order to lower this error, it is necessary to change the connections
between the output and hidden layer by the amount:

$$\Delta w_{kj} = -\eta\, \frac{\partial E_p}{\partial w_{kj}}$$

Hence,

$$\Delta w_{kj} = \eta\, (t_k - o_k)\, g'(a_k)\, h_j$$

where h_j is the activation of hidden neuron j and a_k is the weighted input
(the argument of g) of output neuron k.

To modify the connections between input and hidden layer we use the following
equation (back propagation of error):

$$\Delta w_{ji} = -\eta\, \frac{\partial E_p}{\partial w_{ji}}$$

Hence,

$$\Delta w_{ji} = \eta \left[\sum_k (t_k - o_k)\, g'(a_k)\, w_{kj}\right] g'(a_j)\, p_i$$

The summation in the equation above represents the actual back propagation of
the error signal from the layer above, from which this algorithm takes its
name. The variable η is called the "learning rate" of the system.

One downside of the back propagation learning algorithm is that the choice of
learning rate, which scales the derivatives, impacts the time needed to train
the model. If it is too large, the error may not get below a certain value. If
the learning rate is too small, it might take too long to train. Also,
multilayer neural networks use sigmoid transfer functions in the intermediate
(hidden) layers to compress an infinite input range into a finite output
range. Since the slope of the sigmoid function approaches zero as the input
gets large, the effect of gradient descent on changes in weights and biases
becomes very small, although they may not be very close to their optimal
values.

The learning algorithm we used for this modeling task was Resilient
Backpropagation, also known as Rprop [8]. The Rprop algorithm performs direct
adaptation of the weight step based on local gradient information. If a
partial derivative (and hence the weight change) alters its sign from the last
update, it indicates that the last weight update was too big. In that case the
magnitude of the weight change is reduced. If the weight change retains its
sign from the last update, the update value is slightly increased to help
accelerate convergence.

Hence, as defined in [8], the algorithm for weight change was made as follows:

$$\Delta_{ij}^{(t)} = \begin{cases} \eta^{+}\, \Delta_{ij}^{(t-1)}, & \text{if } \dfrac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \dfrac{\partial E}{\partial w_{ij}}^{(t)} > 0 \\ \eta^{-}\, \Delta_{ij}^{(t-1)}, & \text{if } \dfrac{\partial E}{\partial w_{ij}}^{(t-1)} \cdot \dfrac{\partial E}{\partial w_{ij}}^{(t)} < 0 \\ \Delta_{ij}^{(t-1)}, & \text{otherwise} \end{cases} \qquad 0 < \eta^{-} < 1 < \eta^{+}$$

$$\Delta w_{ij}^{(t)} = -\operatorname{sign}\left(\frac{\partial E}{\partial w_{ij}}^{(t)}\right) \Delta_{ij}^{(t)}$$
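
The sign-based update can be illustrated with a minimal Python sketch. This is a simplified variant without the weight backtracking described in [8], not the implementation used in the case study; the function name, the step bounds `step_min` and `step_max`, and the toy quadratic objective are assumptions. The defaults for `eta_minus` and `eta_plus` follow the values 0.5 and 1.5 reported for this work.

```python
from typing import List, Sequence, Tuple


def rprop_step(weights: Sequence[float], grads: Sequence[float],
               prev_grads: Sequence[float], steps: Sequence[float],
               eta_minus: float = 0.5, eta_plus: float = 1.5,
               step_min: float = 1e-6, step_max: float = 50.0
               ) -> Tuple[List[float], List[float]]:
    """One simplified Rprop update; returns (new_weights, new_steps)."""
    new_weights, new_steps = [], []
    for w, g, pg, step in zip(weights, grads, prev_grads, steps):
        if g * pg > 0:
            # Same sign as last time: accelerate.
            step = min(step * eta_plus, step_max)
        elif g * pg < 0:
            # Sign flipped: the last step overshot, so shrink it.
            step = max(step * eta_minus, step_min)
        # Only the sign of the gradient sets the direction of the move.
        if g > 0:
            w -= step
        elif g < 0:
            w += step
        new_weights.append(w)
        new_steps.append(step)
    return new_weights, new_steps


# Toy usage: minimize E(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, steps, prev = [0.0], [0.1], [0.0]
for _ in range(200):
    grad = [2.0 * (w[0] - 3.0)]
    w, steps = rprop_step(w, grad, prev, steps)
    prev = grad
print(abs(w[0] - 3.0) < 1e-3)  # True
```

Because the gradient magnitude never enters the update, the vanishing slope of a saturated sigmoid does not slow the weight change, which is the motivation given above for choosing Rprop.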

D. Training Methodology
All inputs presented to the network were normalized to values between -1 and
+1. Each input set consisted of features extracted at a particular cut from
one cutter, where each feature was represented by one input unit. There were
about 300 input patterns for one cutter. For output, the wear values of the
three flutes were provided (in 10^-3 mm). Only the training cutters (C1, C4
and C6) were used, as their wear patterns were provided. The values of η- and
η+ were set to 0.5 and 1.5. From experimentation, we concluded that about 2000
iterations were sufficient to train the network.

One important aspect that we found helped learning was to have some concept of
time and notion of history during any given cut. We chose not to make the
model any more complex by using recurrent neural networks and hence used the
input set from the current and the previous cut (i.e., at cut c and c-1) to
determine the wear at cut c. This helped the learning process by providing
some sense of history. In addition, in order to instill a dependence on time
(i.e., which cut it is right now), we also used the cut number as part of the
input pattern.

A batch of 100 neural networks was trained on all the training data. Due to
the scarcity of training data, we were not able to get good results using
cross validation. The evaluation function (from the challenge) was instead
used to select the best model. The selected model was used to predict the wear
pattern of the other three cutters (C2, C3 and C5). Finally, in order to
determine wears at integer values over the range 66 to 165 (in 10^-3 mm), the
maximum wear curves were interpolated for the integer values between 66 and
165.

Since the training data was limited, we had to depend on the daily
leaderboard's evaluation to help us refine our solution. We varied the
combinations of selected feature sets and concluded that the standard
deviation and total power at harmonics of the tooth pass frequency were the
best indicators. Other features that helped derive the final solution included
the RMS of the wavelet decomposed frequency component d5 and kurtosis. The
final solution is shown in Figure 9.

Figure 9. The final solution

CONCLUSION
We described the essential steps needed to implement a PHM system in general.
We delineated the steps required and enumerated possible approaches that can
be taken at each step. Furthermore, we discussed the application of the steps
to a specific task, that of predicting the remaining useful life of CNC
milling cutters. Using the above mentioned methodical approach we were able to
generate the best result for the given task (generated the best solution as
part of the 2010 PHM Data Challenge).

ACKNOWLEDGMENT
We would like to thank the Prognostic Health Management Society (www.phm.org)
for providing the data for this analysis. The data used in this paper was part
of their data released for the 2010 PHM Challenge. We would also like to thank
Lockheed Martin for supporting this research.

REFERENCES
[1] S. L. Hahn (1996). Hilbert Transforms in Signal Processing. Artech House,
    Inc., Boston.
[2] L. Breiman (2001). Random Forests. Machine Learning 45 (1), pp. 5–32.
[3] X. Li, B. S. Lim, J. H. Zhou, S. Huang, S. J. Phua, K. C. Shaw and M. J.
    Er (2009). Fuzzy Neural Network Modelling for Tool Wear Estimation in Dry
    Milling Operation. Annual Conference of the Prognostics and Health
    Management Society, San Diego, CA.
[4] I. Kandilli, M. Sönmez, H. M. Ertunc and B. Çakır (2007). Online
    Monitoring of Tool Wear in Drilling and Milling by Multi-Sensor Neural
    Network Fusion. IEEE International Conference on Mechatronics and
    Automation.
[5] V. P. Astakhov and S. Shvets (2004). The assessment of plastic deformation
    in metal cutting. Journal of Materials Processing Technology 146.
[6] P. Scanlon, A. Lyons and A. O'Loughlin (2007). Acoustic signal processing
    for degradation analysis of rotating machinery to determine the remaining
    useful life. IEEE Workshop on Applications of Signal Processing to Audio
    and Acoustics.
[7] D. E. Rumelhart, G. E. Hinton and R. J. Williams (1986). Learning
    representations by back-propagating errors. Nature 323, pp. 533–536.
[8] M. Riedmiller and H. Braun (1993). A direct adaptive method for faster
    backpropagation learning: The RPROP algorithm. Proc. IEEE International
    Conference on Neural Networks, pp. 586–591.
