Data-Driven Inter-Turn Short Circuit Fault Detection in Induction Machines
Received September 2, 2017, accepted October 9, 2017, date of publication October 24, 2017,
date of current version November 28, 2017.
Digital Object Identifier 10.1109/ACCESS.2017.2764474
ABSTRACT Inter-turn short circuit (ITSC) fault is one of the critical electrical faults in induction motors that affects the reliability of many industrial applications. Although the use of data-driven fault detection techniques has gained much interest, the main deterrent to using these approaches for detecting ITSC faults lies in the generalization and robustness of the diagnosis. In this paper, a data-driven on-line fault detection framework, incorporating multi-feature extraction/selection and a multi-classifier ensemble, is proposed, capable of detecting ITSC faults in induction motors (IMs) that are subjected to variable operating conditions. Using the synchronous time series signals collected from the machines, multiple feature extraction/selection methods are explored to find sensitive fault features, and different types of classification strategies are used to increase the diversity of the single base models. With the increased diversity of the base learners, the fault detection accuracy is expected to be enhanced and the robustness can be guaranteed. The framework was implemented and tested using real data collected from a designed test bed, with the experimental results showing the effectiveness of the framework in detecting ITSC faults in IMs.
INDEX TERMS Data-driven, fault diagnosis, induction motor, inter-turn short circuit.
2169-3536 © 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 5, 2017.

Z. Xu et al.: Data-Driven ITSC Fault Detection in Induction Machines
For example, as an indication of short circuit failure in the windings, it is common to detect changes in the spectrum of the negative sequence voltage. However, the use of such conventional methods is often only for detecting specific known types of abnormalities, and is therefore unable to detect any new abnormal behaviour present in the system. In addition, if the spectrum of a healthy IM is close to or overlaps with that of a faulty IM, it is difficult to distinguish the faulty from the healthy operating conditions. Furthermore, even in safe conditions, the frequency components depend on the speed and power supply, which these strategies are not well adapted to, and they are often only applicable to machines under steady state working conditions, i.e., at constant speed and load. In overcoming these limitations, advanced signal processing techniques and high-resolution spectral analysis such as negative- and zero-sequence currents [8] and wavelet based analysis [9] were suggested, with the robustness of these methods still being questioned [10].

The second category uses model-based approaches [11]–[14], which require physical and mathematical knowledge of the process a priori. The fault diagnosis is realized by generating features such as specific residuals, parameter estimates, state estimates, etc. However, the downside to this category of methods is that, in many situations, the complexity of the systems under observation makes it almost impossible to derive robust and accurate models for online applications. Moreover, these methods assume accurate knowledge of the model parameters, which is not the case in practice, as uncertainties often exist, leading to high false alarm rates [10].

A good alternative to the above-mentioned two categories is the third category, which prescribes the use of data-driven approaches to detect ITSC faults. This is undertaken by evaluating the large quantity of available data, collected from non-intrusive and inexpensive sensors that are already implemented in current IM control systems without disturbing the normal operation of the machines. In the area of on-line monitoring of IMs, several data-driven methods have been applied, with multivariate statistical process monitoring methods and machine learning approaches [15]–[20] to name a few. However, in using these methods, the assumption was often made that all faults are known a priori. With the exception of some common faults, not all faults can be identified before the design of the diagnostic system, thus risking the misclassification of unknown faults. Although fault detection is a reasonably mature field of research, there are very few techniques developed with real-time operation in mind and with the ability to predict incipient faults early enough. Furthermore, an IM can be subjected to various operating states that can be considered normal, arising from the different speed and/or loading conditions throughout the lifetime of the machine. In detecting the specific modes of failure experienced throughout the lifetime of the machine, the failure detection method must be robust and generalizable over a range of operating conditions.

Some papers have addressed motor fault diagnosis based on ensemble approaches. [21] investigated fault feature extraction of mechanical anomalies in induction motor bearings using an ensemble super-wavelet transform. [22] employed vibration signals from normal bearings and bearings with three different fault locations; bearing fault detection was based on a hybrid ensemble detector and empirical mode decomposition. In [23], the stator current signals of induction motors are obtained using the MCSA method and the signals are then processed to produce a set of harmonic-based features for classification using the FMM-CRFE model. The above literature often uses current or vibration signals as the critical fault indicator; in our experiments, however, no obvious difference was observed in the early stages of ITSC, e.g., 2% and 10%. This implies that simply analysing the current or vibration signals will not guarantee the performance of the fault detection. Furthermore, most existing data-driven approaches cannot generalize to unseen data, meaning that the testing data must have the same characteristics as the training data. However, as the operating conditions of IMs vary, it is not plausible to exhaust all of the working conditions during training. In addition, a framework trained on previous historical data should correctly predict the data generated after training, which may be affected by noise resulting from the re-start of the machine, variations of the system dynamics, etc.

In mitigating the above identified issues in monitoring ITSC faults, a diagnostic condition monitoring framework, based on an ensemble of data-driven techniques using electrical signals from the IM, is proposed as shown in Fig. 1 and explained in detail in Fig. 9 of Section IV. The framework in Fig. 1 begins with synchronous time series signals collected from the machines, acting as inputs. As mining unknown knowledge from data is one of the most important characteristics of data-driven methods, all of the signals were put together as inputs to the proposed framework so that useful signals could be automatically found during training. In the data preparation stage, a sliding window mechanism is used, breaking down the time series data into short segments before any preprocessing is done. Feature extraction/selection methods were then used to extract features and select the most informative ones for the following tasks. This is then followed by the classification step, where models were built based on the selected features for fault detection. The purpose of multiple feature extraction/selection is to explore the sensitive fault features, and that of the different types of classification strategies is to increase the diversity of the single base models. With the increased diversity of the base learners, the ensemble performance is expected to be enhanced and the robustness of the fault detection can be guaranteed. From ensemble learning theory [24], it was argued that the use of a multi-learner ensemble improves the performance as compared to single base learners, provided the base learners are accurate and diverse. Thus, the aim of the proposed framework is to develop an on-line monitoring system, capable of detecting ITSC faults in IMs subjected to variable operating conditions. It is envisaged that the incorporation of these methods in an on-line monitoring setup will make the results sensitive to low severity faults, robust in handling data noise and the operating conditions of the IMs, and fast in detecting ITSC faults.

Experimental studies were conducted on the proposed framework, using healthy and faulty ITSC data generated from a three-phase powered IM. The results from the studies showed that faulty conditions can be distinguished from healthy ones even at low severity, and that unseen faults can be detected under different working conditions.

The rest of this paper is organized as follows. Section II presents the experimental setup and the description of the data collected from IMs. Section III presents the data preprocessing and feature extraction and selection methods. Section IV presents the classification techniques and the proposed ensemble framework for the condition monitoring system. Finally, the results in Section V show the effectiveness of the data-driven fault detection in IMs.

• Generator: controls the load applied to the IM;
• ITSC controller: constructs various types of ITSC faults;
• Signal collector: collects various signals from the sensors.

FIGURE 3. Configuration of the three-phase stator windings with ITSC fault.

Fig. 3 illustrates how an ITSC fault is simulated in the controller. The quantity xa = NA/N represents the relative fraction of the fault in phase A, defined as the ratio between the shorted turns NA and the total turns N in each stator winding. Defined as the short percentage (Short%), it represents the percentage of the stator windings that are short circuited in the test run. The higher the percentage, the more severe the fault is. In simulating the degradation of the fault from incipient to severe, the resistor R is attached to the phase-A stator coil. Different values of Short% and R represent the different levels of fault severity that the IM is subjected to. Based on domain knowledge, the fault severity will increase with an increase in Short% and/or a decrease in R. This phenomenon is illustrated in Fig. 4, where it can be clearly seen that decreasing fault severity implies increasing difficulty in detecting the faults.
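As a minimal illustrative sketch (not code from the paper), the fault-severity ordering described above, where severity grows with Short% = NA/N and with a smaller shunt resistance R, can be expressed as a simple sort key; the (Short%, R) pairs below are example values drawn from the text, not the full experimental matrix.

```python
# Order ITSC test conditions by severity. Severity increases with the
# short percentage (Short% = NA/N) and decreases with the shunt
# resistance R attached to the phase-A coil.
conditions = [
    (0.0, float("inf")),   # healthy: 0% shorted turns, R -> infinity
    (0.02, 50.0),          # incipient: 2% ITSC through 50 ohm
    (0.02, 0.8),           # 2% ITSC through 0.8 ohm (more severe)
    (0.10, 0.8),           # 10% ITSC through 0.8 ohm (most severe here)
]

def severity_key(cond):
    short_pct, r = cond
    # higher Short% and lower R => more severe, so negate R
    return (short_pct, -r)

ranked = sorted(conditions, key=severity_key)  # least to most severe
```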
by a 2-tuple of (speed, load). The 'speed' values represent the nominal speed and correspond to the actual fundamental frequency of the current/voltage of the IM, controlled by the power supplier. The value of speed is varied from 600 rpm to 1400 rpm. The 'load' represents how much work the motor is outputting, with values ranging from 0 N·m to 5 N·m of the IM loading conditions. The sensors pick up data under the different working conditions of the IM, including: phase current from all three phases (IA, IB, IC) and phase voltage from all three phases (VA, VB, VC).

TABLE 1. Data description.

Table 1 shows the description of the three phase currents and voltages collected from the experiments, with the Short% set to four different values. When the IM is healthy, ITSC is 0% and R is initially set to positive infinity to collect the healthy data. After approximately 300 seconds, ITSC is manually set to a certain value of Short%, with the value of R gradually lowered, representing the degradation process. Other faulty data were also collected in the same manner, i.e., from (0%, infinity) to (2%, 50 ohm) and then (2%, 0.8 ohm) and so on. The sensors collect various types of data at the sampling frequency of 5 kHz.

FIGURE 5. Illustration of raw signal segmentation for feature extraction.

III. FAULT INDICATOR
A. DATA PREPARATION
As discussed in Section I, in order to capture and identify specific trends or patterns in the data, a sliding window mechanism is used. As illustrated in Fig. 5, which shows how a raw signal is segmented for the purpose of feature extraction, the mechanism breaks down the long time series data into short segments before any preprocessing is done, thus providing more insight into the machine behavior. Since the sensors acquire data at the frequency of 5 kHz, this implies that there are 5000 data points per second. In our experiments, the window size is set at 2500 data points, or 0.5 second. This value was chosen so as to achieve a good balance between efficiency and effectiveness. If the window size is too big, the time information will be lost as all the data within the window will be considered as a single instance, i.e. the time resolution of the data processing suffers. On the other hand, if the window size is too small, the extracted information from the signal may not be accurate as the window contains too few cycles, compromising the accuracy of the frequency spectrum analysis.

B. FEATURE EXTRACTION
To further extract useful information from the currents (IA, IB, IC) and voltages (VA, VB, VC) in diagnosing the ITSC fault, various feature extraction techniques were employed. Each technique will be used to process all the data from a single window into a set of feature values. By building new features from the original time series, feature extraction will reduce the massive time series data points into a manageable synoptic data structure whilst preserving most of the characteristics of the time series. In addition, the use of feature extraction provides an opportunity to incorporate domain knowledge into the data.

1) FFT
The Fourier Transform (FT) is one of the most popular techniques used in analyzing time series, and the FFT (Fast Fourier Transform) is its fast implementation. Given a vector (x_1, ..., x_N), the time series is represented by its spectrum as

    x(k) = Σ_{j=1}^{N} x_j ω_N^{(j−1)(k−1)}    (1)

where ω_N = e^{−2πi/N} is the Nth root of unity. After the FT, the coefficients obtained are complex values, which are not suitable for most of the existing classifiers. A following step to get real-valued features is often required. Existing means include calculating the real part, the absolute value, the imaginary part, etc. Here, the absolute values of the FT coefficients are used as the extracted features. Based on the corresponding frequency components of the features, two types of FFT-based feature extraction technique can be used:

• Basic FFT (FFT-B): FFT-B uses all frequency components as the extracted features. There are no pre-assumptions made on the input signal and therefore FFT-B can be applied to any type of time series.
• Harmonic FFT (FFT-H): FFT-H uses the harmonic frequency components as the extracted features. An assumption made on the input signal is that there is a fundamental frequency in the signal and that its harmonics play an important role in analyzing the signal. This is true for the phase currents and/or voltages in an IM, and FFT-H can be used to pick out the most important frequency components. Another merit of FFT-H is that the features are independent of the fundamental frequency, i.e. when
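The two FFT-based schemes can be sketched as follows for one 0.5 s window sampled at 5 kHz. This is an illustrative sketch, not the paper's implementation: the rule used here to estimate the fundamental (strongest non-DC spectral peak) and the choice of five harmonics are assumptions for demonstration.

```python
import numpy as np

FS = 5000          # sampling frequency (Hz), as in the experiments
WINDOW = 2500      # window size: 2500 points = 0.5 s

def fft_b_features(segment):
    """FFT-B: absolute values of all (one-sided) FFT coefficients."""
    return np.abs(np.fft.rfft(segment))

def fft_h_features(segment, n_harmonics=5):
    """FFT-H: magnitudes at the fundamental frequency and its harmonics.

    The fundamental is estimated from the data itself (strongest non-DC
    peak), so the features track the actual, not the nominal, frequency.
    """
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / FS)
    k0 = 1 + int(np.argmax(spectrum[1:]))     # bin of the fundamental
    idx = [k0 * h for h in range(1, n_harmonics + 1) if k0 * h < len(spectrum)]
    return freqs[k0], spectrum[idx]

# Example: a synthetic 50 Hz "phase current" with a small 3rd harmonic.
t = np.arange(WINDOW) / FS
ia = np.sin(2 * np.pi * 50 * t) + 0.1 * np.sin(2 * np.pi * 150 * t)
f0, feats = fft_h_features(ia)
```

With a 2500-point window the frequency resolution is 2 Hz, so the 50 Hz fundamental falls exactly on a bin; shorter windows would smear it, which is the window-size trade-off discussed above.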
FIGURE 8. Scatter plot of data with FFT-H features.

FIGURE 9. Framework for fault detection in ITSC.
are healthy or faulty. In practice, there are different types of classifiers, namely linear, non-linear, statistical, kernel based, etc., and each has its own advantages and disadvantages. In our experiments, multiple classifiers were explored for the purpose of the ensemble, including NB, NN [32], [33], linear and nonlinear SVM [34]–[36], and linear and nonlinear ELM [37], [38]. All these models need to be trained using training data prior to testing. In our experiments, offline training is employed for the model training process.

TABLE 2. Classifier parameters.
The step by step implementation of offline training is as follows:
1) The training data was prepared into healthy and faulty data sets.
2) A sliding window mechanism was used to break down the healthy and faulty data sets from a long time series into short segments before any preprocessing was undertaken, as explained in Section III-A.
3) Each data segment then goes into the four feature extraction blocks, and new features were built from the data in the segment by the four extraction methods independently. The specific features extracted by each method were introduced in Section III-B.
4) The extracted features from the four feature extraction methods were put together as inputs for feature selection. Fisher's ratio was used as the feature selection method to evaluate the importance of all the features based on (2). Only high ranked features were retained for classification. The number of features kept was decided by the criterion that the classification performance will not be much improved by incorporating more features as inputs to the classifiers.
5) The selected features then act as inputs to each classifier. Models were trained off-line and the parameters of each classifier were decided. The performance of the classifiers was then evaluated by using the definitions of TPR and TNR as introduced in Section V-A.
6) The individual outputs of each classifier were then combined at the ensemble stage using the majority voting approach.

TABLE 3. Experimental setting for feature extraction.

During offline training, the parameters of each block in the framework were decided and the models were saved for online testing. The parameters of the classifiers are given in Table 2, with Table 3 showing the four techniques and their extracted features as well as the number of extracted features from each input signal. For on-line testing, as long as a data segment is available as input, the processing in Fig. 9 will take place step-by-step and the ensuing results will be displayed. Since online testing is based on the models saved during offline training and only one data segment is processed at a time, the detection of the ITSC faults was found to be very fast. Besides giving improved performance, the proposed framework was found to be easy to extend and modify, with the possibility of incorporating new data and/or new techniques in each stage.

V. EXPERIMENTAL VERIFICATION
A. PERFORMANCE EVALUATION
Classification performance evaluation is an important stage in the construction of a fault detection system, as it acts as an index feeding back to the previous stages for improving the outputs from each stage or the whole system. It is also a way to reflect the ability of the classification system, helping to build confidence for users. For a binary classification problem, without loss of generality, assuming the healthy class is 'positive' and the faulty class is 'negative', each classified instance will belong to one of the following four categories:
• True Positive (TP): the instance is predicted as 'positive' when it is actually positive;
• False Positive (FP): the instance is predicted as 'positive' when it is actually negative;
• True Negative (TN): the instance is predicted as 'negative' when it is actually negative;
• False Negative (FN): the instance is predicted as 'negative' while it is actually positive.
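A minimal sketch of how these four counts yield the rate measures used in the evaluation below; the example labels are invented for illustration, with 1 denoting the 'positive' (healthy) class and 0 the 'negative' (faulty) class.

```python
# Compute TPR, TNR and overall accuracy from 0/1 labels.
def rates(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tpr = tp / (tp + fn)                      # true positive rate
    tnr = tn / (tn + fp)                      # true negative rate
    acc = (tp + tn) / (tp + fp + tn + fn)     # overall accuracy
    return tpr, tnr, acc

# Illustrative labels: 4 healthy (1) and 4 faulty (0) instances.
tpr, tnr, acc = rates([1, 1, 1, 1, 0, 0, 0, 0],
                      [1, 1, 1, 0, 0, 0, 1, 0])
```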
FIGURE 10. Classification results under different numbers of features by using PLP.
In this paper, the True Positive Rate (TPR) and True Negative Rate (TNR) are utilized as measures to evaluate the performance of the fault detection. TPR is the proportion of correctly predicted positive instances, defined as:

    TPR = TP / (TP + FN)    (3)

TNR measures the proportion of negative instances that are correctly identified, defined as:

    TNR = TN / (TN + FP)    (4)

TABLE 4. Performance evaluation methods.

Table 4 lists the three evaluation methods that are used in the experiments to evaluate the performance of the classification. The 10CV is to randomly split the data into 10 equal folds; each fold is then selected in turn as testing data, with the rest as training data. For time series data, the performance evaluated using the 10CV method is optimistic, as data appearing in the past and at some future time would be used to make predictions for the data appearing in-between the past and the future. However, a classification model trained using previous historical data may not be able to correctly predict the data generated after training, which may be affected by the re-start of the machine, variations of the system dynamics, etc. In order to verify whether these data variations can be handled, 'Time-Generalization' is used to ensure that the testing data will always appear in time after the training data. In this way, the impact of the time sequence will also be investigated. In addition, most existing data-driven approaches are unable to generalize to unseen data, implying that the testing data must have the same characteristics as the training ones. However, it is often not plausible to exhaust all of the Short% values during training. Therefore, 'Short%-Generalization' is employed to test the generalization from a group of Short% values to another unseen Short% value, e.g., train on 2% and 5%, test on 10%.

IMs work in varying operating conditions, e.g. changing load or speed. Using the above-mentioned methods, the following sections will test: 1) whether the framework can detect ITSC in its early stage, e.g., a 2% ITSC fault; 2) the time-generalization ability (robustness); 3) the short percentage generalization ability in multi-load and multi-speed scenarios. The analysis results will be shown in Sections V-B, V-C and V-D, and summarized in Section V-F. Note that the following sections only show representative results due to the space limitation, which are usually the worst case or the most difficult testing scenarios based on the data listed in Table 1.

B. MULTI-LOAD SCENARIO
Load can be connected to or disconnected from the IMs, leading to operating point variations. It is clear from Fig. 8 that the instances under load and no-load conditions may form different clusters even under the same speed. Whilst it is not plausible to validate against all possible conditions, the ability to generalize and detect unseen behaviors is important. In the following part, the above issue is addressed by selecting the appropriate feature extraction methods, classifiers and their combinations for the framework shown in Fig. 9.

1) 10-FOLD CROSS-VALIDATION UNDER MULTI-LOAD SCENARIO
One of the key tasks of the proposed framework is the ability to generalize to unseen scenarios, which assumes there exists an underlying set of rules relating the parts of the input signals or features that do not change when the condition of the IM changes. This relationship, if captured by the classifiers, will enable the framework to function under a wide range of operating conditions. This investigation is thus focused on finding the feature extraction method that can identify signals with little variance under different working conditions across the different settings.

The previously discussed four feature extraction techniques, namely PLP, Wavelet-based feature extraction, FFT-B and FFT-H, were selected and implemented. For comparisons of the performance between the different feature extraction methods, all available healthy (0% short circuit + Inf ohm) and faulty (2%, 5%, 8%, 10%) data under each operating condition (speed, load) were used in the 10CV evaluation to estimate the TPR and TNR. NB was the chosen classifier due to its simplicity and potentially high generalization ability. The accuracy (ACC) is used as another metric to select the number of features according to the classification performance. ACC is the probability that the classifier classifies a randomly selected instance to the correct class, which is the proportion of the total number of predictions that are correct. It is determined using the equation:

    ACC = (TP + TN) / (TP + FP + TN + FN)

Table 3 shows the results from the four feature extraction techniques and their corresponding numbers of extracted features from the experiments. The number of features is selected based on the classification performance. For example, the classification results of 'speed = 1200 rpm and load = [0, 5] N·m' using PLP and NB are plotted as an example in Fig. 10, which shows that the classification performance does not indicate any obvious change after 55 features. In our experiments, 11 features out of the total of 66 features were selected from each input signal for classification.

TABLE 5. Performance of the four feature extraction techniques.

Table 5 shows the averaged TPR and TNR for the four feature extraction techniques, with the performance values < 0.8 highlighted. The observations from the table are as follows:
• PLP and FFT-H perform better than the other two methods, with PLP having better acceptability;
• FFT-B performed the worst, with some of the values very close to 0.5 (random guess);
• Wavelet-based feature extraction showed low performance in TPR, with some of the values close to 0.6.
In addition, when comparing the results from FFT-B and FFT-H, it is obvious that FFT-H performed better. This is reasonable and within expectation. The fundamental frequency will change with the load under the same speed, and the true fundamental frequency will change over time even under the same speed and load. Therefore, FFT-H, which chooses the features based on the actual fundamental frequencies, will be more reliable than FFT-B in an environment where the fundamental frequency is dynamic. Based on the above analysis and comparison, it is concluded that PLP and FFT-H are the first choice and Wavelet-based feature extraction the second choice. FFT-B will not be suitable for signals with a changing fundamental frequency.

2) TIME-GENERALIZATION (ROBUSTNESS) UNDER MULTI-LOAD SCENARIO
The instances in the data set appear in time order. When 10CV is used, the time order is randomized when generating the training and testing data, i.e., instances in the training data may appear after testing instances, which is not the case in practice. Thus, the experiment here is designed to verify this effect. From the results obtained in Section V-B.1, it was shown that FFT-H is a generally good feature extraction technique, and that the data under 'speed = 1400 rpm' has more overlaps between the faulty and the healthy data in the feature space. Thus, in the experiments, the data under 'speed = 1400 rpm + load = [0, 5] N·m' was chosen, with FFT-H employed to extract features. In addition to the extracted features, the nominal fundamental and actual fundamental frequencies from the Fourier transform were also used as two additional features. This provided the ability to handle variations in the fundamental frequency, which are mainly caused by the changes of load over time. Six classifiers are used in the experiments, including NB, SVM, NN, ELM and the linear forms of SVM and ELM, namely linearSVM and linearELM. In the experiments, the instances under each condition (load, Short%, resistance) were first separated into two parts based on their appearing time. Within the available data, the first part, which includes the instances appearing during the first half of the data capture time, is used for training, whilst the second part, which includes those appearing after the first part, is used for testing. In obtaining a robust estimation of the performance, in each round, 90% of the training instances were randomly selected to train a classifier, which was then tested on the testing data. Table 6 shows the performance of the classifiers, with 'P' and 'N' being the number of positive and negative instances respectively.

The observations from the table are as follows:
• NB seems to be the worst classifier as compared to the other classifiers since it has the worst TPR;
• TNR values are shown to be generally good for the six classifiers;
• In essence, it can be concluded that both nonlinear (SVM, NN, ELM) and linear classifiers (linearSVM, linearELM) can be used for monitoring the working conditions;
• Although not indicated in the table, it should be noted that linearSVM is more computationally intensive than the rest. Specifically, linearSVM requires several hundred seconds for training whilst the others need less than 20 seconds.

TABLE 8. Performance of short%-generalization evaluation.
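The two evaluation protocols, Time-Generalization and Short%-Generalization, can be sketched as simple data splits; the records below are illustrative (timestamp, Short%) pairs, not the paper's data, and the per-round 90% subsampling follows the procedure described above.

```python
import random

# Illustrative records: (timestamp, short_pct), 10 time steps per level.
data = [(t, s) for s in (0.02, 0.05, 0.10) for t in range(10)]

# Time-Generalization: train on the first half of the capture time and
# test on what comes after, so testing always follows training in time.
t_split = 5
train_tg = [d for d in data if d[0] < t_split]
test_tg = [d for d in data if d[0] >= t_split]

# Per round, 90% of the training instances are randomly subsampled.
rng = random.Random(0)
round_train = rng.sample(train_tg, int(0.9 * len(train_tg)))

# Short%-Generalization: hold out one unseen Short% level,
# e.g. train on 2% and 5%, test on 10%.
held_out = 0.10
train_sg = [d for d in data if d[1] != held_out]
test_sg = [d for d in data if d[1] == held_out]
```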
TABLE 9. Performance of time-generalization under multi-speed scenario.

TABLE 11. Performance of short%-generalization under multi-speed scenario.

TABLE 13. Comparison of performance of short%-generalization under multi-speed scenario.

and ELM are selected as the base classifiers for the ensemble. The experiments are done in the same scenarios as in Section V-C, with the evaluation of Time-Generalization and Short%-Generalization. The ensemble results of the two scenarios are shown in the last columns of Table 9 and Table 11.
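The ensemble combination referenced above, majority voting over the outputs of the selected base classifiers, can be sketched as follows. The three prediction vectors are invented stand-ins for the base classifiers' outputs; note that an odd number of voters avoids ties, while an even number would require an explicit tie-breaking rule.

```python
import numpy as np

def majority_vote(predictions):
    """Combine 0/1 outputs of several base classifiers by majority voting.

    predictions: array-like of shape (n_classifiers, n_instances).
    """
    p = np.asarray(predictions)
    # An instance is labelled 1 when more than half of the voters say 1.
    return (p.sum(axis=0) * 2 > p.shape[0]).astype(int)

# Three hypothetical base classifiers disagreeing on some instances:
preds = [
    [1, 1, 0, 0, 1],   # e.g. SVM output
    [1, 0, 0, 1, 1],   # e.g. NN output
    [1, 1, 1, 0, 0],   # e.g. ELM output
]
combined = majority_vote(preds)
```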
CHANGHUA HU was a Visiting Scholar with the University of Duisburg in 2008. He is currently a Professor with the High-Tech Institute, Xi'an, China. He has published two books and about 100 articles. His research interests include fault diagnosis and prediction, life prognosis, and fault tolerant control. He received the Changjiang Scholar award from the Chinese Ministry of Education in 2013.

CHI-KEONG GOH (SM'14) received the B.Eng. and Ph.D. degrees in electrical engineering from the National University of Singapore, Singapore, in 2003 and 2007, respectively. He is currently the Team Lead in data analytics and optimization with the Rolls-Royce Advanced Technology Centre, Singapore. His current research interests include evolutionary computation and data analytics and their applications. Dr. Goh has served as a reviewer for various international journals, such as the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART B: CYBERNETICS, the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART C: APPLICATIONS AND REVIEWS, and Neurocomputing.