1. Introduction
Global warming refers to the continuous rise in the average temperature of the Earth’s climate system. Over the past 50 years, due to uncontrolled emissions of greenhouse gases, the average temperature has increased at the fastest rate on record [
1]. To address this severe challenge, the global energy structure is rapidly transitioning towards cleaner and low-carbon energy sources. Against this backdrop, wind power, as an important renewable energy source, has seen continuous rapid growth in installed capacity and power generation [
2]. By the end of 2022, the total installed capacity of onshore and offshore wind power globally reached 841.9 GW and 64.3 GW, respectively, with 77.6 GW of new wind power capacity added during the year [
3]. Wind turbines are typically installed in harsh environments such as oceans and are subjected to adverse factors such as alternating loads. As operational time increases, the health condition of turbines gradually deteriorates, significantly reducing their power generation performance and affecting the overall efficiency of wind farms [
4]. Therefore, there is an urgent need to accurately assess the extent of performance anomalies in wind turbines and implement effective maintenance measures or technical upgrades to enhance their power generation capabilities. The performance of wind turbines is typically evaluated using methods such as the capacity factor, availability, power generation, wind turbine power curve (WTPC) analysis, and multivariate power estimation [
5].
The capacity factor, availability, and power generation reflect the fault time and fault rate of wind turbines, covering multiple aspects such as power generation efficiency, operational status, and environmental impact. The WTPC and multivariate power estimation provide more detailed performance analysis, supporting fault diagnosis and optimized control [
6]. The WTPC is a core tool for describing the relationship between wind speed and power output, used to evaluate the power generation efficiency of wind turbines. By combining measured wind speed data with the WTPC, the power generation of a wind turbine in a specific location and time period can be predicted. Additionally, analyzing the differences between actual power output and the expected values from the WTPC can effectively identify and diagnose potential faults in the wind turbine system. However, due to the variability of wind power and the unpredictability of wind speed, the standard power curves defined by the International Electrotechnical Commission (IEC) often struggle to accurately monitor the actual performance and operational status of wind turbines [
7]. The IEC standard requires averaging data over wind speed intervals of 0.5 m/s or 1 m/s. While this method is simple, it may obscure subtle variations in turbine operation. In the field of WTPC modeling, extensive research has been conducted, covering various machine learning algorithms, including backpropagation neural networks (BPNNs) [
8], support vector machines (SVMs) [
9], extreme learning machines (ELMs) [
10], and deep learning models such as convolutional neural networks (CNNs) [
11]. Wang et al. [
11] proposed an innovative data-driven deep learning method, which integrates an ELM, channel attention mechanisms, a CNN, and Huber loss functions, significantly improving the modeling accuracy of the WTPC. Furthermore, Mehrjoo et al. [
12] developed a hybrid estimation method based on a weighted balanced loss function, which optimizes both estimation error and goodness-of-fit by shrinking the estimates toward a standardized target model. The WTPC can effectively capture the performance trends of wind turbines over long-term operation, making it suitable for assessing long-term power generation efficiency and health status [
13]. However, since WTPC modeling typically requires a long period of data accumulation (e.g., 30 days or more) to build a stable power curve, its response to short-term anomalies, such as yaw angle faults or sensor drift, is relatively slow. This is because early-stage anomaly data make up only a small proportion of a long-term window and therefore have little effect on the overall power curve, which limits the use of the WTPC in real-time fault detection and short-term performance evaluation.
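The bin averaging required by the IEC standard, mentioned earlier in this paragraph, can be sketched as follows. This is a minimal illustration only: the function name `binned_power_curve`, the left-edge bin convention, and the sample values are assumptions made for the example and are not taken from the standard or from the datasets used in this study.

```python
from collections import defaultdict

def binned_power_curve(wind_speeds, powers, bin_width=0.5):
    """Average power within fixed-width wind speed bins (method-of-bins sketch)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, p in zip(wind_speeds, powers):
        # Assumed convention: bin idx covers [idx * width, (idx + 1) * width)
        idx = int(v // bin_width)
        sums[idx] += p
        counts[idx] += 1
    # Map the centre wind speed of each bin to the mean power in that bin
    return {
        (idx + 0.5) * bin_width: sums[idx] / counts[idx]
        for idx in sorted(counts)
    }

# Hypothetical 10 min averages: wind speed in m/s, power in kW
speeds = [5.1, 5.3, 5.6, 5.9, 6.2]
powers = [400.0, 440.0, 520.0, 560.0, 650.0]
curve = binned_power_curve(speeds, powers)
```

Because every sample in a bin collapses to a single mean value, a small number of anomalous samples in a bin shifts the curve only slightly, which is consistent with the slow response to short-term anomalies discussed above.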
To address these limitations, multivariate power estimation methods have been proposed, incorporating additional environmental and operational parameters such as air density and rotor speed to enhance sensitivity to short-term anomalies [14,15]. These methods leverage the complex relationships between multiple variables to characterize the nonlinear dynamics between environmental parameters and wind power, enabling the more accurate and timely detection of performance deviations. This makes them suitable for real-time fault detection and short-term performance evaluation, as well as for the precise assessment of wind turbine performance degradation under varying wind energy scenarios by controlling the input environmental variables. Pandit et al. [
16] incorporated air density into a Gaussian process model for wind turbine power evaluation to improve fitting accuracy. Astolfi [
17] proposed multivariate approaches to the wind turbine power curve, incorporating additional environmental information and working parameters as input variables for data-driven models to improve the accuracy of theoretical power extraction under non-stationary conditions, leveraging SCADA data and advanced methods from artificial intelligence and applied statistics. Manobela et al. [
18] proposed a wind turbine power evaluation method based on Gaussian processes, data filtering, and artificial neural network modeling, using wind speed and wind direction as input variables. Schlechtingen et al. [
19] established four wind turbine power models based on cluster center fuzzy logic, neural networks, k-nearest neighbor models, and adaptive neuro-fuzzy inference systems, using wind speed, wind direction, and ambient temperature as input variables. Cascianelli et al. [
7] proposed an ensemble of multivariate polynomial regression models to predict the active power of wind turbines and provide reliable prediction intervals, incorporating environmental conditions, operational and thermal variables, and interactions between turbines, achieving a mean absolute error of approximately 1.0% of the rated power on real SCADA data from an Italian wind farm. Lee et al. [
20] conducted multivariate wind turbine power curve regression by combining SCADA data and met mast data. Input variables included wind speed, wind direction, humidity, turbulence intensity, and the wind shear coefficient, with the regression model being an additive multivariate conditional kernel density estimation model. These studies enhanced the models by incorporating additional environmental parameters as inputs, thereby reducing the variance in wind power prediction errors. In summary, in specific case studies, incorporating environmental variables and operational state variables from the SCADA system is beneficial for improving the accuracy of power evaluation models. By integrating additional data such as wind speed, wind direction, air density, temperature, turbulence intensity, and other relevant parameters, the models can better capture the complex relationships and dynamics affecting wind turbine performance. However, factors such as shutdowns, power limiting, and equipment failures often contaminate actual wind turbine operation data with complex anomalies, significantly compromising the accuracy of multivariable power estimation models [
21].
Abnormal data adversely affect the monitoring of wind turbine operational status and can distort power estimation models developed based on such data. Therefore, refining wind data before model establishment is crucial. Data cleaning can be divided into two steps: preliminary filtering and anomaly detection (AD) [
22]. The former is used to quickly remove obvious anomalies, while the latter is employed to deeply identify complex anomalies. Filtering aims to eliminate data points that violate physical laws or operational logic, such as data from the non-generation, start-up, or shutdown phases of wind turbines. It is generally a rule-based approach, relying on an understanding of system behavior, such as defining thresholds for wind speed and pitch angle. On the other hand, AD focuses on identifying abnormal data points, such as those caused by sensor failures or extreme weather conditions. AD typically employs statistical or machine learning techniques to detect data points that deviate significantly from normal patterns [
23]. This step-by-step approach can more comprehensively improve data quality, laying a solid foundation for subsequent modeling and analysis. However, many studies only employ AD strategies or combine them with minimal preprocessing. This widespread neglect is particularly unusual, as the international standard IEC 61400-12 [
24] explicitly mandates a data quality check for power curve measurement, which includes removing unavailable measurements, as well as filtering and excluding data based on power-limited conditions and fault records, with reference to operator logs. Wang et al. [
11] proposed a method that combines the 3σ criterion and the quartile algorithm for data cleaning to address the limited performance of a single approach in certain cases, while employing the Mahalanobis distance to measure the distance between data points. The effectiveness of the proposed method was validated through comparisons with commonly used techniques such as isolation forest and the local outlier factor. However, this method only applies a single filtering rule, where samples satisfying the conditions of wind speed greater than the cut-in speed and power less than 1 kW are identified as irrational data. Morrison et al. [
22] proposed a multi-rule filtering method and explored the impact of such filtering by comparing the performance of four different AD methods with and without filtering. Although this method incorporates information about the pitch angle as a filtering rule, it neglects the role of rotor speed.
To effectively identify and filter abnormal data, thereby improving data quality, a novel pre-filtering (PF) method is proposed in this paper for machine learning-based multivariate power estimation in wind turbines. First, PF is performed by setting filtering rules based on the operational state variables of wind turbines, specifically pitch angle and rotor speed. Subsequently, AD is conducted using a sliding window approach combined with the 3σ criterion and the quartile method. Following this, the performance of two widely used machine learning algorithms, the BPNN and SVM, is compared for multivariate power estimation with and without PF on two distinct datasets. The main contributions and novelties of the present study can be summarized as follows:
A novel PF method is proposed to enhance machine learning-based multivariate power estimation in wind turbines. Samples corresponding to start-up and shutdown phases are filtered by applying thresholds to pitch angle and rotor speed data. The effectiveness of the proposed filtering method is demonstrated through visualization by comparing the results of different filtering rules.
By introducing settings for sliding window size and step size, the AD method is optimized to avoid the incorrect cleaning of data in the four corners of the region. A dual-window validation mechanism is adopted, where a sample is confirmed as a final anomaly only when it is identified as a potential anomaly in two consecutive windows.
By comparing model performance with and without PF, the effectiveness of PF in improving data quality and model accuracy is validated. Experiments conducted on two distinct datasets enhance the reliability and generalizability of the results.
3. Results and Discussion
To validate the effectiveness of the proposed method, two SCADA datasets, namely Dataset A and Dataset B, were utilized in the case studies. Dataset A was downloaded from the Kaggle website and comes from a wind turbine with a rated power of 1.7 MW [11]. Dataset B was obtained from an offshore wind farm in Guangdong, China, and comes from a wind turbine with a rated power of 5.5 MW. The 5.5 MW wind turbine is typically used in large-scale wind farms, especially offshore wind farms, due to its high single-unit capacity, which effectively reduces the unit power generation cost. The inclusion of both a smaller turbine and a larger offshore turbine allowed for the analysis of the method’s performance across different turbine sizes and operational environments, albeit within a limited scope. All datasets were sampled at 10 min intervals, with each variable value averaged over the sampling interval to smooth data fluctuations, reduce noise, and preserve the overall trend. Based on the generation principles of wind turbines and the available SCADA variables, wind speed, active power, ambient temperature, rotor speed, and pitch angle were selected as the model variables, with active power serving as the estimation target. After removing missing values, Dataset A contained 40,640 samples, spanning from May 2019 to March 2020, while Dataset B contained 21,001 samples, spanning from November 2023 to March 2024. Using Dataset A as an example, the processing flow of PF and AD was analyzed. Furthermore, the performance of the multivariate power estimation model on Dataset A and Dataset B after data processing was compared based on the final estimation results.
3.1. Pre-Filtering Results
First, the distribution of SCADA data from the wind turbine in Dataset A was analyzed, as illustrated in Figure 3. Figure 3a shows the wind speed–power distribution, where scattered outliers are observed outside the main region, along with a cluster of outliers at the bottom. Figure 3b illustrates the wind speed–pitch angle distribution. The distribution analysis shows that most normal operating data points had pitch angles concentrated between 0° and 20°. As illustrated in Figure 2, pitch angles in region III exhibited a threshold behavior. Therefore, in Dataset A, data points with pitch angles exceeding 20° were typically associated with abnormal operating conditions, including non-generation states (such as shutdown or start-up phases) and power-limiting operation. Based on the wind speed–rotor speed distribution shown in Figure 3c, a rotor speed threshold of 9.1 rpm was selected as a filtering rule to improve data quality for power estimation. According to Figure 2, at low wind speeds, the rotor speed exhibited significant fluctuations, often corresponding to non-generation states such as shutdown, start-up, or idling. Filtering out data points with rotor speeds below 9.1 rpm effectively removed anomalies associated with these non-generation states.
Based on the above analysis, the PF rules applied in this study were defined as follows: (1) power generation below 1 kW when the wind speed exceeded the cut-in speed; (2) a pitch angle exceeding 20°; and (3) a rotor speed lower than 9.1 rpm. Any samples satisfying one or more of these conditions were filtered out. The thresholds for the pitch angle and rotor speed were determined empirically based on the distribution of operational data collected from the SCADA system, combined with control strategies, and were not optimized through a specific process. These thresholds were derived from the typical operational behavior of wind turbines across different wind speed ranges. Due to differences in turbine models and control strategies, the pitch angle and rotor speed thresholds of different wind turbines need to be determined based on actual operational data.
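The three PF rules above can be expressed as a short filter. The following is a minimal sketch rather than the implementation used in this study: the function name `pre_filter`, the dictionary field names, the sample records, and the cut-in speed of 3.0 m/s are hypothetical, while the 1 kW, 20°, and 9.1 rpm thresholds are the Dataset A values stated above.

```python
def pre_filter(samples, cut_in_speed, min_power_kw=1.0,
               max_pitch_deg=20.0, min_rotor_rpm=9.1):
    """Keep only the samples that violate none of the three PF rules."""
    kept = []
    for s in samples:
        # Rule 1: wind above cut-in speed but (almost) no power produced
        no_power = s["wind_speed"] > cut_in_speed and s["power"] < min_power_kw
        # Rule 2: pitch angle above the threshold
        high_pitch = s["pitch_angle"] > max_pitch_deg
        # Rule 3: rotor speed below the threshold
        low_rotor = s["rotor_speed"] < min_rotor_rpm
        if not (no_power or high_pitch or low_rotor):
            kept.append(s)
    return kept

# Hypothetical records (m/s, kW, degrees, rpm)
records = [
    {"wind_speed": 8.0, "power": 900.0, "pitch_angle": 1.5, "rotor_speed": 14.0},
    {"wind_speed": 6.0, "power": 0.0, "pitch_angle": 2.0, "rotor_speed": 12.0},
    {"wind_speed": 12.0, "power": 800.0, "pitch_angle": 45.0, "rotor_speed": 13.0},
    {"wind_speed": 3.5, "power": 20.0, "pitch_angle": 3.0, "rotor_speed": 6.0},
]
# Assumption: cut-in speed of 3.0 m/s, chosen only for this example
kept = pre_filter(records, cut_in_speed=3.0)
```

In this example only the first record survives; the other three are removed by rules (1), (2), and (3), respectively. The thresholds are passed as parameters because, as noted above, they need to be set per turbine from its own operational data.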
The visualization results under different filtering strategies are shown in Figure 4. It can be observed that adding the rotor speed threshold as a filtering condition improved the filtering effect. However, some scattered points still exist outside the main region, indicating that AD was needed in addition to filtering strategies to further enhance data quality.
3.2. Anomaly Detection Results
To effectively identify outliers in the wind speed–power data, the following steps were implemented. First, we set a sliding window for wind speed with a size of 1 m/s and a step size of 0.5 m/s, ensuring that each wind speed–power sample was processed twice. Next, we calculated the Mahalanobis distance within each window and marked potential outliers based on the 3σ criterion. For samples not marked as outliers, we further applied the quartile method for secondary detection to identify additional potential outliers. After processing all windows, we only removed samples that were simultaneously marked as potential outliers by two windows to ensure the reliability of the detection results.
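The steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code used in this study: the function names are hypothetical, both the 3σ rule and the quartile rule are assumed to act on the Mahalanobis distances (flagging only unusually large distances, with the common 1.5 × IQR fence), windows with fewer than 10 samples are skipped, and the data are synthetic.

```python
import math
import statistics

def mahalanobis_distances(points):
    """Mahalanobis distance of each (wind speed, power) point to the window mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        return None  # degenerate covariance: the window is skipped
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        dists.append(math.sqrt(max(d2, 0.0)))
    return dists

def flag_window(points, min_samples=10):
    """Local indices flagged by the 3-sigma rule, then by the quartile rule."""
    if len(points) < min_samples:
        return set()
    d = mahalanobis_distances(points)
    if d is None:
        return set()
    mu, sigma = statistics.mean(d), statistics.stdev(d)
    flagged = {i for i, di in enumerate(d) if di > mu + 3 * sigma}
    # Secondary detection on the samples not flagged by the 3-sigma rule
    rest = [(i, di) for i, di in enumerate(d) if i not in flagged]
    if len(rest) >= 4:
        q1, _, q3 = statistics.quantiles([di for _, di in rest], n=4)
        upper = q3 + 1.5 * (q3 - q1)  # assumed fence factor of 1.5
        flagged |= {i for i, di in rest if di > upper}
    return flagged

def sliding_window_ad(speeds, powers, size=1.0, step=0.5):
    """Indices flagged as potential outliers in at least two windows."""
    hits = [0] * len(speeds)
    first = math.floor(min(speeds) / step) * step
    k = 0
    while first + k * step <= max(speeds):
        lo = first + k * step
        idx = [i for i, v in enumerate(speeds) if lo <= v < lo + size]
        for j in flag_window([(speeds[i], powers[i]) for i in idx]):
            hits[idx[j]] += 1
        k += 1
    return {i for i, h in enumerate(hits) if h >= 2}

# Synthetic data: a linear trend with small alternating noise,
# plus one injected zero-power sample at 8.2 m/s (index 450)
speeds = [(300 + 2 * i) / 100 for i in range(450)]
powers = [100 * v + (10 if i % 2 == 0 else -10) for i, v in enumerate(speeds)]
speeds.append(8.2)
powers.append(0.0)
removed = sliding_window_ad(speeds, powers)
```

With a window size of 1 m/s and a step of 0.5 m/s, the injected sample at 8.2 m/s falls into the windows starting at 7.5 m/s and 8.0 m/s, and it is removed only because both windows flag it. In this sketch, samples in the first half-window of the wind speed range belong to a single window and are therefore never removed; how such edge samples should be treated is left open here.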
Taking Dataset A as an example, Figure 5 presents the AD results in two scenarios: (a) binning only, without a moving window; and (b) binning with a moving window. Both methods exhibited significant cleaning effectiveness through the identification of a large number of outliers. The proposed method, incorporating a sliding window, achieved a smoother data distribution, particularly in the region close to the rated wind speed, thereby addressing the discontinuities caused by the binning method at the edges of the bins.
Autoencoders, as a type of deep learning model, are also applicable to AD [26]. The autoencoder was trained using the dataset, with a compressed representation dimension set to 2. To prevent overfitting, Tikhonov regularization with a factor of 0.001 was applied, and sparsity regularization with a constant of 4 and a proportion of 0.1 was used. The model was trained for a maximum of 200 epochs. The results of the autoencoder-based AD are shown in Figure 6. Although incorporating more input features improved the AD performance, some discrete points remained in the main distribution region, leading to a poorer result than that of the proposed method. The AD performance did not show significant improvement even when the compression dimension was set to 3 or 4. This could be because autoencoders are deep learning models that are well suited for complex or high-dimensional data. For the low-dimensional data used in this study, the autoencoder might have been too complex, leading to overfitting during the training process or an inability to effectively learn the intrinsic features of the data.
3.3. Multivariate Power Estimation Modeling Results
To comprehensively validate the superiority of the proposed algorithm, comparative experiments were conducted with different data processing methods. In this study, the dataset was divided into a training set (80%) and a test set (20%) to evaluate the effectiveness of the proposed filtering strategy. The hyperparameter settings for the BPNN and SVM are shown in Table 1. For the BPNN, a hidden layer size of 10, a learning rate of 0.001, and 20 training epochs were selected after considering the balance between computational efficiency and model performance. These values were chosen to ensure that the model was capable of converging quickly while avoiding overfitting or underfitting. For the SVM, the parameters such as the radial basis function kernel, a box constraint of 1000, and an epsilon value of 50 were selected based on common recommendations for regression tasks and were found to work well for the datasets used in this study. While hyperparameter tuning could potentially improve performance, the selected values provided a solid starting point for demonstrating the effectiveness of the proposed filtering strategy. During the analysis of the impact of different data processing methods on model performance, the hyperparameters were kept consistent [22]. A brief description of the methods is as follows:
Base-BPNN: the BPNN was applied directly to the raw data without any preprocessing.
AD-BPNN: the BPNN was applied to the dataset after anomaly detection.
PF-AD-BPNN: the BPNN was applied to the dataset after pre-filtering and anomaly detection.
Similarly, Base-SVM, AD-SVM, and PF-AD-SVM represent the application of the SVM in different data processing stages.
The results of power estimation by the BPNN, including a comparison of actual and estimated values, are presented in Figure 7 and Figure 8. As can be seen from Figure 7a and Figure 8a, sparse outliers far from the main region could not be effectively estimated. The reason may be that data points are scarce in areas far from the main region, making it difficult for the model to learn effective patterns from limited samples and resulting in inaccurate estimations; the outliers may also have exhibited characteristics significantly different from those of the main region’s data. In comparison to the direct use of raw data, the AD-BPNN model exhibited a notable decrease in large errors. However, a noticeable deviation remained in the low-wind-speed region, corresponding to the sample points that required filtering. As can be seen from Figure 7c and Figure 8c, the PF-AD-BPNN model performed well in addressing both large errors and local deviations.
The results of power estimation by the SVM, including a comparison of actual and estimated values, are presented in Figure 9 and Figure 10. Although the multivariate estimation performance differed slightly compared to that of the BPNN, the qualitative impact of PF and AD on the model remained consistent. Through comparative experiments using both the BPNN and SVM methods, the effectiveness of PF and AD in improving power estimation performance across different models was validated.
The evaluation metrics of different power estimation methods for Dataset A and Dataset B are detailed in Table 2. Figure 11 shows the PF performance diagram of Dataset B, where some anomalies are filtered out. However, some scattered points still remain outside the main region, suggesting that AD was required in conjunction with filtering strategies to further improve data quality. The remaining processing steps after filtering were the same as those for Dataset A.
For both the BPNN and SVM models, the models with PF and AD outperformed the base models and the models with only AD across all evaluation metrics. Models with only AD improved on the base models, but by a smaller margin than models combining PF and AD. For Dataset A, the PF-AD-BPNN reduced the MAE and RMSE by 22.67% and 19.31%, respectively, compared to the AD-BPNN, and the PF-AD-SVM reduced them by 6.41% and 6.13%, respectively, compared to the AD-SVM. Similarly, for Dataset B, the PF-AD-BPNN reduced the MAE and RMSE by 16.24% and 23.59%, respectively, relative to the AD-BPNN, and the PF-AD-SVM reduced them by 10.33% and 10.94%, respectively, compared to the AD-SVM. This suggests that AD alone was beneficial but was more effective when combined with PF.
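For reference, the metrics and the percentage reductions quoted above can be computed as in the following sketch. The exact metric definitions used in this study are not restated in this section, so the forms below are assumptions: in particular, the NMAPE is assumed to be the MAE normalized by the rated power and expressed in percent, and the reduction is taken relative to the baseline model. All function names and numbers in the example are hypothetical.

```python
import math

def mae(actual, estimated):
    """Mean absolute error."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)

def rmse(actual, estimated):
    """Root mean square error."""
    return math.sqrt(
        sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)
    )

def r2(actual, estimated):
    """Coefficient of determination."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def nmape(actual, estimated, rated_power):
    """Assumed definition: MAE as a percentage of the rated power."""
    return 100.0 * mae(actual, estimated) / rated_power

def relative_reduction(baseline, improved):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline - improved) / baseline

# Hypothetical power values in kW for a turbine rated at 1700 kW
actual = [100.0, 200.0, 300.0, 400.0]
estimated = [110.0, 190.0, 310.0, 390.0]
example_mae = mae(actual, estimated)
example_rmse = rmse(actual, estimated)
example_r2 = r2(actual, estimated)
example_nmape = nmape(actual, estimated, rated_power=1700.0)
example_gain = relative_reduction(baseline=10.0, improved=8.0)
```

Under the assumed NMAPE definition, the NMAPE is the MAE divided by a constant, so the relative reduction in the NMAPE between two models on the same turbine equals the relative reduction in the MAE.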
Since machine learning models must be applicable in real time in wind farm operations, the computational costs associated with PF and AD were assessed. The computer used had an Intel Core i5-13400F 2.50 GHz processor and 32 GB of random-access memory. The time required to filter the data was on the order of milliseconds, which was negligible compared to the overall training time. The training times for the different strategies are provided in Table 3. As shown, the training time varied significantly depending on the strategy used, and the AD process notably increased the computational cost, whereas the filtering step had a minimal impact on the overall processing time. Overall, a training time on the order of seconds is acceptable for real-time applications in wind farm operations.
3.4. Discussion
This paper proposes a filtering method that combines pitch angle and rotor speed thresholds based on the operational state parameters of wind turbines. The visualization results demonstrate better cleaning effectiveness compared to those using only pitch angle. The filtering process primarily removed sample points where the wind turbine was not generating power, was shut down, or was starting up. Samples during the start-up and shutdown processes, where power generation was incomplete, could lead to power estimation errors, as reflected in the low-wind-speed region of the AD-BPNN and AD-SVM results shown in Figure 7b and Figure 9b. Additionally, the use of a sliding window during the AD process avoided discontinuities at the boundaries of bins in the binning method, while preserving more normal samples.
The WTPC offers advantages over multivariate methods in terms of simplicity, interpretability, and computational efficiency. By focusing on the relationship between wind speed and power output, the WTPC provides a clear representation of turbine performance. However, it may not fully capture the complexity of turbine behavior under varying environmental conditions, where multivariate methods could provide more comprehensive insights.
Based on the PF-AD-BPNN method, a univariate WTPC model was constructed with wind speed as the input variable and active power as the output target. The evaluation metrics are presented in Table 4. In comparison to the WTPC, the multivariate estimation method achieved significant improvements across all evaluation metrics, as presented in Table 2. Specifically, it achieved reductions in the MAE of 70.57% and 81.04% for the two datasets, respectively, and reductions in the RMSE of 70.77% and 80.65%, respectively. Additionally, R² improved by 1.24% and 2.62%, while the NMAPE decreased by 70.57% and 81.04% for the two datasets. These improvements can facilitate the timely detection of abnormal conditions in the real-time monitoring of wind turbine power generation performance, such as a reduction in the power coefficient caused by blade icing or damage, thus offering valuable guidance for turbine maintenance and operation.
In summary, multivariate power estimation serves as an important complement to the WTPC-based method for evaluating wind turbine power generation performance. It provides a more comprehensive reflection of turbine performance under different operating conditions and offers more accurate insights for turbine operation and maintenance. Enhancing model performance through PF and AD is highly significant, as it improves the reliability and accuracy of power estimation.