

Search Results (1,835)

Search Parameters:
Keywords = hyper-parameter optimization

20 pages, 1978 KiB  
Article
Deep Neural Network Optimization for Efficient Gas Detection Systems in Edge Intelligence Environments
by Amare Mulatie Dehnaw, Ying-Jui Lu, Jiun-Hann Shih, Cheng-Kai Yao, Mekuanint Agegnehu Bitew and Peng-Chun Peng
Processes 2024, 12(12), 2638; https://doi.org/10.3390/pr12122638 (registering DOI) - 22 Nov 2024
Abstract
This paper introduces an optimized deep neural network (DNN) framework for an efficient gas detection system applicable across various settings. The proposed optimized DNN model addresses key issues in conventional machine learning (ML), including slow computation times, convergence issues, and poor adaptability to new data, which can result in increased prediction errors and reduced reliability. The proposed framework methodology comprises four phases: data collection, pre-processing, offline DNN training optimization, and online model testing and deployment. The training datasets are collected from seven classes of liquid beverages and environmental air samples using integrated gas sensor devices and an edge intelligence environment. The proposed DNN algorithm is trained on high-performance computing systems by fine-tuning multiple hyperparameter optimization techniques, resulting in an optimized DNN. This well-trained DNN model is validated using unseen new testing datasets in high-performance computing systems. Experimental results demonstrate that the optimized DNN can accurately recognize different beverages, achieving an impressive detection accuracy rate of 98.29%. The findings indicate that the proposed system significantly enhances gas identification capabilities and effectively addresses the slow computation and performance issues associated with traditional ML methods. This work highlights the potential of optimized DNNs to provide reliable and efficient contactless detection solutions across various industries, enhancing real-time gas detection applications.
(This article belongs to the Special Issue Research on Intelligent Fault Diagnosis Based on Neural Network)
13 pages, 3389 KiB  
Article
Dynamic Prediction of Proton-Exchange Membrane Fuel Cell Degradation Based on Gated Recurrent Unit and Grey Wolf Optimization
by Xiangdong Wang, Zerong Huang, Daxing Zhang, Haoyu Yuan, Bingzi Cai, Hanlin Liu, Chunsheng Wang, Yuan Cao, Xinyao Zhou and Yaolin Dong
Energies 2024, 17(23), 5855; https://doi.org/10.3390/en17235855 - 22 Nov 2024
Abstract
This paper addresses the challenge of degradation prediction in proton-exchange membrane fuel cells (PEMFCs). Traditional methods often struggle to balance accuracy and complexity, particularly under dynamic operational conditions. To overcome these limitations, this study proposes a data-driven approach based on the gated recurrent unit (GRU) neural network, optimized by the grey wolf optimizer (GWO). The integration of the GWO automates the hyperparameter tuning process, enhancing the predictive performance of the GRU network. The proposed GWO-GRU method was validated utilizing actual PEMFC data under dynamic load conditions. The results demonstrate that the GWO-GRU method achieves superior accuracy compared to other standard methods. The method offers a practical solution for online PEMFC degradation prediction, providing stable and accurate forecasting for PEMFC systems in dynamic environments.
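As a rough illustration of the tuning loop this abstract describes (a metaheuristic proposing hyperparameter vectors, with validation loss as the fitness function), here is a minimal grey wolf optimizer sketch in pure Python. The toy quadratic stands in for the GRU validation loss; the bounds and names are illustrative, not taken from the paper:

```python
import random

def grey_wolf_optimize(fitness, bounds, n_wolves=10, n_iters=50, seed=0):
    """Minimal grey wolf optimizer: alpha/beta/delta wolves steer the pack."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    for t in range(n_iters):
        wolves.sort(key=fitness)
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2 - 2 * t / n_iters  # exploration coefficient decays to 0
        for w in wolves:
            for d in range(dim):
                pos = 0.0
                for leader in (alpha, beta, delta):
                    A = a * (2 * rng.random() - 1)
                    C = 2 * rng.random()
                    pos += leader[d] - A * abs(C * leader[d] - w[d])
                # average of the three leader-guided moves, clipped to bounds
                w[d] = min(max(pos / 3, bounds[d][0]), bounds[d][1])
    return min(wolves, key=fitness)

# Toy stand-in for a "GRU validation loss", best at lr=0.01, hidden=64.
loss = lambda p: (p[0] - 0.01) ** 2 + ((p[1] - 64) / 100) ** 2
best = grey_wolf_optimize(loss, bounds=[(0.0001, 0.1), (8, 256)])
```

In a real setting, `loss` would train the GRU with the candidate learning rate and hidden size (rounded to an integer) and return its validation error.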
Figures:
Figure 1: PEMFC durability test. (a) Test bench in FCLAB. (b) Constant and dynamic currents in the two experiments.
Figure 2: Degradation voltage profiles under constant and dynamic load conditions.
Figure 3: Raw voltage data and processed voltage data under dynamic load conditions.
Figure 4: Architecture of GRU cells.
Figure 5: Searching for prey versus attacking prey. (a) Searching. (b) Attacking.
Figure 6: GWO-GRU schematic diagram.
Figure 7: Prediction results under different methods and training lengths. (a) Fifty percent training length. (b) Sixty percent training length. (c) Seventy percent training length. (d) Eighty percent training length.
Figure 8: Absolute percentage error results under different methods and training lengths. (a) Fifty percent training length. (b) Sixty percent training length. (c) Seventy percent training length. (d) Eighty percent training length.
15 pages, 2546 KiB  
Article
Intelligent Analysis and Prediction of Computer Network Security Logs Based on Deep Learning
by Zhiwei Liu, Xiaoyu Li and Dejun Mu
Electronics 2024, 13(22), 4556; https://doi.org/10.3390/electronics13224556 - 20 Nov 2024
Abstract
Since the beginning of the 21st century, computer networks have developed rapidly, and the world has gradually entered a new era of digital connectivity. While enjoying the convenience brought by digitization, people also face increasingly serious threats from network security (NS) issues. Because traditional Long Short-Term Memory (LSTM) neural networks (NNs) fall significantly short in accuracy and efficiency, various scholars have studied computer NS situation prediction methods to address these shortcomings of traditional LSTM-based NS situation prediction algorithms. Although these algorithms can improve the accuracy of NS situation prediction to a certain extent, limitations remain, such as low computational efficiency, low accuracy, and high model complexity. To address these issues, new methods and techniques have been proposed, such as using NNs and machine learning techniques to improve the accuracy and efficiency of prediction models. This article adopts the Bidirectional Gated Recurrent Unit (BiGRU), an improvement on the Gated Recurrent Unit (GRU), and introduces a multi-model NS situation prediction algorithm with an attention mechanism. In addition, an improved Particle Swarm Optimization (PSO) algorithm is utilized to optimize hyperparameters and improve the training efficiency of the GRU NN. Experimental results on the UNSW-NB15 dataset show that the algorithm achieved a mean absolute error of 0.0843 and an RMSE of 0.0932 for NS situation prediction, lower than the traditional LSTM and GRU prediction algorithms, significantly improving prediction accuracy.
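The PSO-based hyperparameter tuning mentioned in this abstract follows the classic inertia-plus-attraction velocity update; a minimal sketch in pure Python, with a toy objective standing in for GRU training (all constants and names are illustrative, not the paper's improved PSO variant):

```python
import random

def pso_minimize(f, bounds, n_particles=20, n_iters=60, seed=1):
    """Minimal particle swarm: inertia + cognitive + social velocity update."""
    rng = random.Random(seed)
    dim = len(bounds)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]                 # each particle's best position
    gbest = min(pbest, key=f)[:]               # swarm-wide best position
    for _ in range(n_iters):
        for i, x in enumerate(xs):
            for d in range(dim):
                vs[i][d] = (0.7 * vs[i][d]
                            + 1.5 * rng.random() * (pbest[i][d] - x[d])
                            + 1.5 * rng.random() * (gbest[d] - x[d]))
                x[d] = min(max(x[d] + vs[i][d], bounds[d][0]), bounds[d][1])
            if f(x) < f(pbest[i]):
                pbest[i] = x[:]
                if f(x) < f(gbest):
                    gbest = x[:]
    return gbest

# Toy stand-in for validation loss, best at hidden=128, dropout=0.3.
val_loss = lambda p: ((p[0] - 128) / 100) ** 2 + (p[1] - 0.3) ** 2
best = pso_minimize(val_loss, bounds=[(16, 512), (0.0, 0.9)])
```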
(This article belongs to the Section Networks)
Figures:
Figure 1: Attention CNN GRU algorithm flowchart.
Figure 2: GRU model flowchart.
Figure 3: Specific flowchart of PSO algorithm.
Figure 4: Specific method for evaluating NS situation values.
Figure 5: Normalized NS situation time series of UNSW-NB15 dataset.
Figure 6: Normalized NS situation time series of ADFA-IDS dataset.
Figure 7: Comparison of situational values of different prediction models on the UNSW-NB15 dataset.
Figure 8: Comparison of situational values of different prediction models on the ADFA-IDS dataset.
18 pages, 7262 KiB  
Article
Multi-Energy Coupling Load Forecasting in Integrated Energy System with Improved Variational Mode Decomposition-Temporal Convolutional Network-Bidirectional Long Short-Term Memory Model
by Xinfu Liu, Wei Liu, Wei Zhou, Yanfeng Cao, Mengxiao Wang, Wenhao Hu, Chunhua Liu, Peng Liu and Guoliang Liu
Sustainability 2024, 16(22), 10082; https://doi.org/10.3390/su162210082 - 19 Nov 2024
Abstract
Accurate load forecasting is crucial to the stable operation of integrated energy systems (IES) and plays a significant role in advancing sustainable development. Addressing the insufficient prediction accuracy caused by the inherent uncertainty and volatility of load data, this study proposes a multi-energy load forecasting method for IES using an improved VMD-TCN-BiLSTM model. The proposed model optimizes the Variational Mode Decomposition (VMD) parameters through a mathematical model that minimizes the average permutation entropy (PE). Load sequences are decomposed into Intrinsic Mode Functions (IMFs) using VMD, with the optimal number of modes determined by the average PE to reduce the non-stationarity of the original sequences. Considering the coupling relationship among electrical, thermal, and cooling loads, the input features of the forecasting model are constructed by combining the IMF set of the multi-energy loads with meteorological data and related load information. A hybrid neural network structure integrating a Temporal Convolutional Network (TCN) with a Bidirectional Long Short-Term Memory (BiLSTM) network is then developed for load prediction. The Sand Cat Swarm Optimization (SCSO) algorithm is employed to obtain the optimal hyper-parameters of the TCN-BiLSTM model. A case analysis is performed using the Arizona State University Tempe campus dataset. The findings demonstrate that the proposed method outperforms six other existing models in terms of Mean Absolute Percentage Error (MAPE) and Coefficient of Determination (R2), verifying its effectiveness and superiority in load forecasting.
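The average permutation entropy used above to pick the VMD mode count is straightforward to compute for a single series; a minimal normalized implementation (the order and delay defaults are illustrative, not the paper's settings):

```python
import math
from itertools import permutations

def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy over ordinal patterns of length `order`."""
    counts = {p: 0 for p in permutations(range(order))}
    n = len(series) - (order - 1) * delay
    for i in range(n):
        window = [series[i + j * delay] for j in range(order)]
        pattern = tuple(sorted(range(order), key=window.__getitem__))  # argsort
        counts[pattern] += 1
    probs = [c / n for c in counts.values() if c > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(math.factorial(order))  # normalize to [0, 1]
```

Mode selection then amounts to decomposing with several candidate mode counts and keeping the one whose IMFs minimize the average PE.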
(This article belongs to the Special Issue Energy Management System and Sustainability)
Figures:
Figure 1: Research progress of deep learning techniques in the field of load forecasting.
Figure 2: Flowchart of the Sand Cat Swarm Optimization algorithm.
Figure 3: Network structure of TCN model.
Figure 4: Network structure of LSTM.
Figure 5: Network structure of BiLSTM.
Figure 6: Network structure of TCN-BiLSTM.
Figure 7: A multi-energy load forecasting flowchart based on VMD-TCN-BiLSTM.
Figure 8: VMD results. (a) Electric load. (b) Cooling load. (c) Thermal load.
Figure 9: Electrical load forecasting results. (a) Electrical load forecasting results of single and combined models. (b) Local magnification of result.
Figure 10: Thermal load forecasting results. (a) Thermal load forecasting results of single and combined models. (b) Local magnification of result.
Figure 11: Cooling load forecasting results. (a) Cooling load forecasting results of single and combined models. (b) Local magnification of result.
24 pages, 6232 KiB  
Article
Towards Cleaner Cities: Estimating Vehicle-Induced PM2.5 with Hybrid EBM-CMA-ES Modeling
by Saleh Alotaibi, Hamad Almujibah, Khalaf Alla Adam Mohamed, Adil A. M. Elhassan, Badr T. Alsulami, Abdullah Alsaluli and Afaq Khattak
Toxics 2024, 12(11), 827; https://doi.org/10.3390/toxics12110827 - 19 Nov 2024
Abstract
In developing countries, vehicle emissions are a major source of atmospheric pollution, worsened by aging vehicle fleets and less stringent emissions regulations. This results in elevated levels of particulate matter, contributing to the degradation of urban air quality and increasing concerns over the broader effects of atmospheric emissions on human health. This study proposes a Hybrid Explainable Boosting Machine (EBM) framework, optimized using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), to predict vehicle-related PM2.5 concentrations and analyze contributing factors. Air quality data were collected from Open-Seneca sensors installed along the Nairobi Expressway, alongside meteorological and traffic data. The CMA-ES-tuned EBM model achieved a Mean Absolute Error (MAE) of 2.033 and an R2 of 0.843, outperforming other models. A key strength of the EBM is its interpretability, which revealed that location was the most critical factor influencing PM2.5 concentrations, followed by humidity and temperature. Elevated PM2.5 levels were observed near the Westlands roundabout, and medium to high humidity correlated with higher PM2.5 levels. Furthermore, the interaction between humidity and traffic volume played a significant role in determining PM2.5 concentrations. By combining CMA-ES for hyperparameter optimization with EBM for prediction and interpretation, this study delivers both high predictive accuracy and valuable insight into the environmental drivers of urban air pollution, offering practical guidance for air quality management.
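CMA-ES itself maintains a full covariance matrix, which is beyond a short sketch; as a deliberately simplified stand-in, a (1+1) evolution strategy with 1/5th-success-rule step-size adaptation shows the same tune-by-validation-error loop (toy sphere objective; the constants are illustrative, not the study's configuration):

```python
import random

def one_plus_one_es(f, x0, sigma=1.0, n_iters=300, seed=2):
    """(1+1)-ES with a 1/5th success rule: a much-simplified CMA-ES stand-in."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(n_iters):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]  # Gaussian mutation
        fy = f(y)
        if fy <= fx:
            x, fx = y, fy
            sigma *= 1.22   # success: widen the search
        else:
            sigma *= 0.82   # failure: shrink the step
    return x, fx

# Toy sphere objective as a stand-in for EBM validation error.
best, best_f = one_plus_one_es(lambda v: sum(x * x for x in v), [5.0, 5.0])
```

In practice `f` would refit the EBM with the candidate hyperparameters and return its cross-validated MAE; full CMA-ES additionally adapts the mutation covariance, not just a scalar step size.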
(This article belongs to the Special Issue Atmospheric Emissions Characteristics and Its Impact on Human Health)
Figures:
Figure 1: Proposed EBM-CMA-ES framework for the prediction and assessment of PM2.5.
Figure 2: Sites for the data collection along Nairobi expressway.
Figure 3: Twelve-hour daily variation in average PM2.5 at different sites along Nairobi expressway.
Figure 4: Prediction error plots using both training and testing datasets: (a) EBM; (b) XGBoost; (c) RF; (d) LightGBM; (e) AdaBoost; (f) MLR.
Figure 5: Uncertainty analysis of the machine learning models, plotting the ratio of predicted to observed PM2.5 vs. observed PM2.5: (a) EBM; (b) XGBoost; (c) RF; (d) LightGBM; (e) AdaBoost; (f) MLR.
Figure 6: Global factor importance analysis via EBM.
Figure 7: Influence of location on PM2.5 concentrations.
Figure 8: Influence of humidity on PM2.5 concentrations.
Figure 9: Influence of temperature on PM2.5 concentrations.
Figure 10: EBM-based heatmap for the interaction of humidity and hourly traffic volume.
Figure 11: EBM-based local interpretation of Sample #12 in testing dataset.
Figure 12: EBM-based local interpretation of Sample #12 in testing dataset.
18 pages, 7824 KiB  
Article
Vessel Traffic Flow Prediction in Port Waterways Based on POA-CNN-BiGRU Model
by Yumiao Chang, Jianwen Ma, Long Sun, Zeqiu Ma and Yue Zhou
J. Mar. Sci. Eng. 2024, 12(11), 2091; https://doi.org/10.3390/jmse12112091 - 19 Nov 2024
Abstract
Vessel traffic flow forecasting in port waterways is critical to improving the safety and efficiency of port navigation. The staged characteristics of vessel traffic in port waterways over time make the data complex and the model parameters difficult to adjust. To address this, a prediction model combining a convolutional neural network (CNN) with bi-directional gated recurrent units (BiGRU), optimized by the pelican optimization algorithm (POA), is proposed: the POA searches for optimized hyper-parameters, and the best combination of iteratively found parameters is input into the CNN-BiGRU model structure for training and prediction. The results indicate that the POA has better global search capability and faster convergence than the other optimization algorithms in the experiment. Compared with the BiGRU and CNN-BiGRU models, the combined POA-CNN-BiGRU model has higher prediction accuracy and stability and a significantly improved prediction effect, providing more accurate prediction information and cycle characteristics that can serve as a reference for planning ships' routes in and out of ports and optimizing the management of ship organizations.
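The sliding prediction process this abstract relies on starts from windowed samples of the flow series; a minimal window builder (the window and horizon defaults are illustrative):

```python
def sliding_windows(series, window=6, horizon=1):
    """Split a series into (input window, target) pairs for forecasting.

    Each pair is the `window` most recent observations plus the value
    `horizon` steps ahead of the window's end.
    """
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        pairs.append((series[i:i + window], series[i + window + horizon - 1]))
    return pairs
```

For hourly vessel counts, `sliding_windows(counts, window=6)` would yield six-hour input windows each paired with the next hour's count, ready to feed a CNN-BiGRU style model.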
(This article belongs to the Special Issue Management and Control of Ship Traffic Behaviours)
Figures:
Figure 1: Structure of CNN-BiGRU network.
Figure 2: Example of BiGRU model structure.
Figure 3: The iterative computational flow of the POA.
Figure 4: POA-CNN-BiGRU model prediction process.
Figure 5: Location of the study area.
Figure 6: Vessel traffic flow in the main channel of Qingdao Harbor. (a) Data collection interval 1 h; (b) data collection interval 1.5 h; (c) data collection interval 2 h.
Figure 7: Comparison of model prediction errors under different combinations of sliding window and initial population size.
Figure 8: Schematic diagram of the sliding prediction process.
Figure 9: Optimizing iterative changes to the algorithm.
Figure 10: POA-CNN-BiGRU model prediction results.
Figure 11: Comparison of model prediction results.
Figure 12: Comparison of model evaluation results.
Figure 13: Model prediction error at different time intervals.
36 pages, 14136 KiB  
Article
A Novel Multi-Objective Hybrid Evolutionary-Based Approach for Tuning Machine Learning Models in Short-Term Power Consumption Forecasting
by Aleksei Vakhnin, Ivan Ryzhikov, Harri Niska and Mikko Kolehmainen
AI 2024, 5(4), 2461-2496; https://doi.org/10.3390/ai5040120 - 19 Nov 2024
Abstract
Accurately forecasting power consumption is crucial for efficient energy management. Machine learning (ML) models are often employed for this purpose. However, tuning their hyperparameters is a complex and time-consuming task. The article presents a novel multi-objective (MO) hybrid evolutionary-based approach, GA-SHADE-MO, for tuning ML models aimed at solving the complex problem of forecasting power consumption. The proposed algorithm simultaneously optimizes both hyperparameters and feature sets across six different ML models, ensuring enhanced accuracy and efficiency. The study focuses on predicting household power consumption at hourly and daily levels. The hybrid MO evolutionary algorithm integrates elements of genetic algorithms and self-adapted differential evolution. By incorporating MO optimization, GA-SHADE-MO balances the trade-offs between model complexity (the number of used features) and prediction accuracy, ensuring robust performance across various forecasting scenarios. Experimental numerical results show the superiority of the proposed method over traditional tuning techniques and random search, showcasing significant improvements in predictive accuracy and computational efficiency. The findings suggest that the proposed GA-SHADE-MO approach offers a powerful tool for optimizing ML models in the context of energy consumption forecasting, with potential applications in other domains requiring precise predictive modeling. The study contributes to the advancement of ML optimization techniques, providing a framework that can be adapted and extended for various predictive analytics tasks.
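The Pareto fronts this abstract reports trade feature count against prediction error; the core operation, filtering candidate (feature-count, error) points down to the non-dominated subset, can be sketched as follows (generic sketch, not the GA-SHADE-MO implementation):

```python
def pareto_front(points):
    """Keep points not dominated by any other (minimizing every objective)."""
    def dominates(a, b):
        # a dominates b: no worse on all objectives, strictly better on one
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]
```

Applied to tuples like `(n_features, validation_error)`, this yields the trade-off curve the algorithm presents to the user, who then picks the preferred complexity/accuracy balance.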
(This article belongs to the Section AI Systems: Theory and Applications)
Figures:
Figure 1: Representation of a solution in the GA-SHADE-MO algorithm.
Figure 2: Example of solutions of MO problem in the context of building ML models.
Figure 3: Visualization of the obtained power consumption in the private house.
Figure 4: The view on the map of the observation station and an area of the house.
Figure 5: Correlation matrices of Daily (left) and Hourly (right) levels on train datasets.
Figure 6: Splitting the data using time series cross-validation.
Figure 7: Found Pareto front of tuned ML models using GA-SHADE-MO. Daily level. All-features scenario.
Figure 8: The set of used features for each tuned ML model. Daily level. All-features scenario.
Figure 9: Found Pareto front of tuned ML models using GA-SHADE-MO. Daily level. Excluded ambient temperature scenario.
Figure 10: The set of used features for each tuned ML model. Daily level. Excluded ambient temperature scenario.
Figure 11: Found Pareto front of tuned ML models using GA-SHADE-MO. Daily level. Power lags and time scenario.
Figure 12: The set of used features for each tuned ML model. Daily level. Power lags and time scenario.
Figure 13: Comparison graph of the performance of the tuned MLP model and the actual values. Daily level. All-features scenario.
Figure 14: Scatter plot (left), residuals plot (center), and histogram of residuals values (right) of tuned MLP model on a daily level. All-features scenario.
Figure 15: Found Pareto fronts using the proposed GA-SHADE-MO and Random Search. Daily level. All-features scenario.
Figure 16: Found Pareto front of tuned ML models using GA-SHADE-MO. Hourly level. All-features scenario.
Figure 17: The set of used features for each tuned ML model. Hourly level. All-features scenario.
Figure 18: Found Pareto front of tuned ML models using GA-SHADE-MO. Hourly level. Excluded ambient temperature scenario.
Figure 19: The set of used features for each tuned ML model. Hourly level. Excluded ambient temperature scenario.
Figure 20: Found Pareto front of tuned ML models using GA-SHADE-MO. Hourly level. Power lags and time scenario.
Figure 21: The set of used features for each tuned ML model. Hourly level. Power lags and time scenario.
Figure 22: Comparison graph of the performance of the tuned MLP model and the actual values. Hourly level. All-features scenario.
Figure 23: Scatter plot (left), residuals plot (center), and histogram of residuals values (right) of tuned MLP model on hourly level. All-features scenario.
Figure 24: Found Pareto fronts using the proposed GA-SHADE-MO and Random Search. Hourly level. All-features scenario.
17 pages, 1713 KiB  
Article
Simplified Knowledge Distillation for Deep Neural Networks Bridging the Performance Gap with a Novel Teacher–Student Architecture
by Sabina Umirzakova, Mirjamol Abdullaev, Sevara Mardieva, Nodira Latipova and Shakhnoza Muksimova
Electronics 2024, 13(22), 4530; https://doi.org/10.3390/electronics13224530 - 18 Nov 2024
Abstract
The rapid evolution of deep learning has led to significant achievements in computer vision, primarily driven by complex convolutional neural networks (CNNs). However, the increasing depth and parameter count of these networks often result in overfitting and elevated computational demands. Knowledge distillation (KD) has emerged as a promising technique to address these issues by transferring knowledge from a large, well-trained teacher model to a more compact student model. This paper introduces a novel knowledge distillation method that simplifies the distillation process and narrows the performance gap between teacher and student models without relying on intricate knowledge representations. Our approach leverages a streamlined teacher network architecture designed to enhance the efficiency and effectiveness of knowledge transfer, enabling the student model to achieve high accuracy with reduced computational demands through a simplified distillation process. Comprehensive experiments conducted on the CIFAR-10 dataset demonstrate that our proposed model achieves superior performance compared to traditional KD methods and established architectures such as ResNet and VGG networks. The proposed method not only maintains high accuracy but also significantly reduces training and validation losses. Key findings highlight the optimal hyperparameter settings (temperature T = 15.0 and smoothing factor α = 0.7), which yield the highest validation accuracy and lowest loss values. This research contributes to the theoretical and practical advancement of knowledge distillation, providing a robust framework for future applications and research in neural network compression and optimization. The simplicity and efficiency of our approach pave the way for more accessible and scalable solutions in deep learning model deployment.
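The quoted best setting (temperature T = 15.0, smoothing factor α = 0.7) plugs into the standard distillation objective: a temperature-softened KL term blended with ordinary cross-entropy. A minimal pure-Python sketch of that generic KD loss (logits are illustrative; this is the textbook formulation, not necessarily the paper's exact variant):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax: higher T gives softer probabilities."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx, T=15.0, alpha=0.7):
    """alpha * T^2 * KL(teacher_T || student_T) + (1 - alpha) * CE(student, label)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    ce = -math.log(softmax(student_logits)[true_idx])  # hard-label cross-entropy
    return alpha * T * T * kl + (1 - alpha) * ce
```

The T² factor is the usual gradient-scale correction for temperature-softened targets; at T = 15 the teacher's distribution is very soft, so the student learns mostly from relative class similarities.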
Figures:
Figure 1: The architecture of teacher-student models using the knowledge distillation algorithm.
Figure 2: The architecture of the novel teacher model.
Figure 3: Results of the proposed method on the CIFAR-10 dataset; blue boxes show correct predictions, while red boxes are incorrect.
50 pages, 64978 KiB  
Article
Investigating the Surface Damage to Fuzhou’s Ancient Houses (Gu-Cuo) Using a Non-Destructive Testing Method Constructed via Machine Learning
by Lei Zhang, Yile Chen, Liang Zheng, Binwen Yan, Jiali Zhang, Ali Xie and Senyu Lou
Coatings 2024, 14(11), 1466; https://doi.org/10.3390/coatings14111466 - 18 Nov 2024
Abstract
As an important part of traditional Chinese architecture, Fuzhou’s ancient houses have unique cultural and historical value. However, over time, environmental factors such as efflorescence and plant growth have caused surface damage to their gray brick walls, leading to a decline in the [...] Read more.
As an important part of traditional Chinese architecture, Fuzhou’s ancient houses have unique cultural and historical value. However, over time, environmental factors such as efflorescence and plant growth have caused surface damage to their gray brick walls, degrading the quality of the buildings’ structure and even threatening the buildings’ safety. Traditional damage detection methods rely mainly on manual labor, which is inefficient and consumes considerable human resources. In addition, traditional non-destructive detection methods, such as infrared imaging and laser scanning, often struggle to accurately identify specific types of damage, such as efflorescence and plant growth, on the surface of gray bricks and are easily hampered by diverse surface features. This study uses the YOLOv8 machine learning model for the automated detection of two common types of damage to the gray brick walls of Fuzhou’s ancient houses: efflorescence and plant growth. We establish an efficient gray brick surface damage detection model through dataset collection and annotation, experimental parameter optimization, model evaluation, and analysis. The research results reveal the following. (1) Reasonable hyperparameter settings and model-assisted annotation significantly improve the detection accuracy and stability. (2) The model’s average precision (AP) improves from 0.30 to 0.90, demonstrating good robustness in detecting complex backgrounds and high-resolution real-life images, and its F1 score (a classification performance index) for gray brick damage detection improves from 0.22 to 0.77. (3) The model’s ability to recognize damage details on gray bricks under high-resolution conditions is significantly enhanced, demonstrating its ability to cope with complex environments. (4) The simplified data augmentation strategy effectively reduces feature extraction interference and enhances the model’s adaptability to different environments. Full article
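The AP and F1 values reported above are standard detection metrics. As an illustrative sketch only (not the authors' evaluation code), they can be computed from detection counts and a recall–precision curve; the rectangle-rule AP below is a simplification of the interpolated AP used by benchmarks such as VOC/COCO.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from detection counts at a fixed IoU threshold."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def average_precision(pr_points):
    """Area under a (recall, precision) curve by the rectangle rule.
    pr_points: iterable of (recall, precision) pairs."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in sorted(pr_points):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# 90 correct detections, 10 false alarms, 10 missed instances
p, r = precision_recall(tp=90, fp=10, fn=10)
f1 = f1_score(p, r)                               # ~0.9
ap = average_precision([(0.5, 1.0), (1.0, 0.5)])  # 0.75
```

With counts like these, an F1 near 0.77 or an AP near 0.90, as reported above, corresponds to a detector that misses few damaged regions while raising few false alarms.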
Show Figures

Figure 1
<p>Scope of the research and investigation (image source: drawn by the author).</p>
Figure 2
<p>The locations where the photos were collected. The numbers in the figure represent the following: 1 is No. 108 Wushan Road; 2 is No. 151 Baima South Road; 3 is No. 169 Baima South Road; 4 is No. 17 Tianhuangling Lane; 5 is No. 20 Daguangli; 6 is No. 250 Dongguan Street; 7 is No. 254 Dongguan Street; 8 is No. 172 Heping Street; 9 is No. 44 Jing Street; 10 is Jiuyan Gong Dacuo; 11 is Kaiyinglu; 12 is No. 50 Nanhou Street; 13 is No. 174 Nanhou Street; 14 is Shangbao Qizhu Hall; 15 is No. 4 Weicuoli; 16 is No. 8 Weicuoli; 17 is the former residence of the Wei family; 18 is No. 135 Wushan Village; 19 is Wushan Mansion, Wushan Village; 20 is No. 1 Chiqian, Yijing Village; 21 is No. 9 Chiqian, Yijing Village; 22 is No. 11 Zhonglie Road; 23 is No. 50 Zhuzifang; 24 is No. 58 Taibaojing Lane; 25 is No. 6 Wenrufang; 26 is No. 42 Nanhou Street; 27 is No. 20 Daguangli; 28 is No. 2 Longjin Yizhi Lane; 29 is the former site of Yihua Photo Studio; 30 is No. 78–84 Zhongping Road; 31 is No. 88–98 Zhongping Road; and 32 is No. 53 Dongxing (image source: drawn by the author).</p>
Figure 3
<p>Research methods and steps (image source: drawn by the author).</p>
Figure 4
<p>We used the LabelImg tool to label the collected images. Since the author uses the simplified Chinese version of the software, the Chinese displayed in the screenshot comes with the LabelImg tool. The red box in the figure represents the marked range (image source: screenshot from the LabelImg tool).</p>
Figure 5
<p>Design of the YOLOv8 architecture used in this study (image source: drawn by the author).</p>
Figure 6
<p>Climate analysis of Fuzhou City (image source: drawn by the author via Ladybug).</p>
Figure 7
<p>Analysis of the annual wind frequency rose in Fuzhou (image source: drawn by the author via Ladybug).</p>
Figure 8
<p>Analysis of the enthalpy–humidity diagram for Fuzhou (image source: drawn by the author via Climate Consultant).</p>
Figure 9
<p>The location of gray brick in an ancient Fuzhou house (image source: drawn by the author).</p>
Figure 10
<p>Ancient houses in Fuzhou feature gray brick on their façade (image source: drawn by the author).</p>
Figure 11
<p>The location of gray bricks in the saddle fire wall (image source: drawn by the author).</p>
Figure 12
<p>The location of gray bricks in the arched fire wall (image source: drawn by the author).</p>
Figure 13
<p>The location of gray bricks in the herringbone fire wall (image source: drawn by the author).</p>
Figure 14
<p>The distribution of gray bricks in the walls of Western-style buildings (image source: drawn by the author).</p>
Figure 15
<p>All images are normalized to 512 × 512 pixels (image source: drawn by the author).</p>
Figure 16
<p>The mAP numerical statistics of the first ten epochs of ten model optimization experiments (image source: drawn by the author).</p>
Figure 17
<p>Loss value change trend during model training (image source: drawn by the author).</p>
Figure 18
<p>Performance statistics of the models at different epochs (the asterisk * in the figure indicates the median). In the figure, F1*, Recall*, and Precision* all indicate a score threshold of 0.5 (image source: drawn by the author).</p>
Figure 19
<p>Confusion matrix of the 23rd epoch model (image source: drawn by the author).</p>
Figure 20
<p>Confusion matrix of the 170th epoch model (image source: drawn by the author).</p>
Figure 21
<p>Confusion matrix of the 204th epoch model (image source: drawn by the author).</p>
Figure 22
<p>Confusion matrix of the 300th epoch model (image source: drawn by the author).</p>
Figure 23
<p>Analysis of the detection results of the different epoch models (image source: drawn by the author).</p>
Figure 24
<p>Feature map analysis of the model in the process of gray brick efflorescence detection (image source: drawn by the author).</p>
Figure 25
<p>Feature map analysis of the model in the process of gray brick plant growth detection (image source: drawn by the author).</p>
Figure 26
<p>Feature map of the model’s original image detection process for gray brick efflorescence (image source: drawn by the author).</p>
Figure 27
<p>Feature map of the model’s original image detection process for gray brick plant growth (image source: drawn by the author).</p>
Figure 28
<p>Testing of the model in field applications (image source: drawn by the author).</p>
20 pages, 9472 KiB  
Article
Reduced-Order Model of Coal Seam Gas Extraction Pressure Distribution Based on Deep Neural Networks and Convolutional Autoencoders
by Tianxuan Hao, Lizhen Zhao, Yang Du, Yiju Tang, Fan Li, Zehua Wang and Xu Li
Information 2024, 15(11), 733; https://doi.org/10.3390/info15110733 - 16 Nov 2024
Viewed by 495
Abstract
There has been extensive research on the partial differential equations governing the theory of gas flow in coal mines. However, the traditional Proper Orthogonal Decomposition–Radial Basis Function (POD-RBF) reduced-order algorithm requires significant computational resources and is inefficient when calculating high-dimensional data for coal mine gas pressure fields. To achieve the rapid computation of gas extraction pressure fields, this paper proposes a model reduction method based on deep neural networks (DNNs) and convolutional autoencoders (CAEs). The CAE is used to compress and reconstruct full-order numerical solutions for coal mine gas extraction, while the DNN is employed to establish the nonlinear mapping between the physical parameters of gas extraction and the latent space parameters of the reduced-order model. The DNN-CAE model is applied to the reduced-order modeling of gas extraction flow–solid coupling mathematical models in coal mines. A full-order model pressure field numerical dataset for gas extraction was constructed, and optimal hyperparameters for the pressure field reconstruction model and latent space parameter prediction model were determined through hyperparameter testing. The performance of the DNN-CAE model order reduction algorithm was compared to the POD-RBF model order reduction algorithm. The results indicate that the DNN-CAE method has certain advantages over the traditional POD-RBF method in terms of pressure field reconstruction accuracy, overall structure retention, extremum capture, and computational efficiency. Full article
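The two-stage surrogate described above can be sketched as a simple composition: a regression model maps extraction parameters into the CAE's latent space, and the decoder half of the autoencoder lifts the latent vector back to a full pressure field. The linear maps below are toy stand-ins for the trained DNN and convolutional decoder, included only to make the data flow concrete.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def predict_field(params, dnn, decoder):
    """Reduced-order prediction: parameters -> latent coords -> full field."""
    latent = dnn(params)      # stage 1: physical parameters to latent space
    return decoder(latent)    # stage 2: latent space to pressure field

# Toy linear stand-ins for the trained networks (2 parameters,
# 2 latent coordinates, 3 pressure-field nodes).
W_dnn = [[1.0, 0.0],
         [0.0, 2.0]]
W_dec = [[1.0, 0.0],
         [0.5, 0.5],
         [0.0, 1.0]]

field = predict_field([3.0, 1.0],
                      dnn=lambda p: matvec(W_dnn, p),
                      decoder=lambda z: matvec(W_dec, z))
# field == [3.0, 2.5, 2.0]
```

Once trained, evaluating this composition costs only a few matrix products, which is why the reduced-order model is so much cheaper than re-running the full coupled simulation.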
Show Figures

Figure 1
<p>Physical field coupling relation of the coal seam gas extraction flow model.</p>
Figure 2
<p>Geometric modeling for numerical simulation of gas extraction.</p>
Figure 3
<p>CAE model structure for reconstruction of the pressure field in coal seam gas extraction.</p>
Figure 4
<p>Structure of the DNN model for predicting latent space parameters of the methane extraction pressure field.</p>
Figure 5
<p>DNN-CAE model construction process.</p>
Figure 6
<p>DNN-CAE-based reduced-order model for coal seam gas extraction.</p>
Figure 7
<p>Mesh search results for model batch size and learning rate for the CAE.</p>
Figure 8
<p>Learning error of the CAE decreases when batch_size = 16 and lr = 5 × 10<sup>−5</sup>.</p>
Figure 9
<p>Effect of CAE on the reconstruction of the gas extraction pressure field.</p>
Figure 10
<p>Mesh search results for model batch size and learning rate for the DNN.</p>
Figure 11
<p>Decrease in learning error of the DNN when batch_size = 1, lr = 1 × 10<sup>−5</sup>.</p>
Figure 12
<p>Comparison of the prediction effect of DNN-CAE and POD-RBF on the pressure field of gas extraction.</p>
Figure 13
<p>Plan view of mining and excavation projects in the Shoushan Mine area.</p>
Figure 14
<p>Plan view of borehole construction.</p>
Figure 15
<p>Distribution of test boreholes in the mining face and DNN-CAE reconstruction of the gas pressure field.</p>
17 pages, 2380 KiB  
Article
Nondestructive Detection of Litchi Stem Borers Using Multi-Sensor Data Fusion
by Zikun Zhao, Sai Xu, Huazhong Lu, Xin Liang, Hongli Feng and Wenjing Li
Agronomy 2024, 14(11), 2691; https://doi.org/10.3390/agronomy14112691 - 15 Nov 2024
Viewed by 309
Abstract
To enhance lychee quality assessment and address inconsistencies in post-harvest pest detection, this study presents a multi-source fusion approach combining hyperspectral imaging, X-ray imaging, and visible/near-infrared (Vis/NIR) spectroscopy. Traditional single-sensor methods are limited in detecting pest damage, particularly in lychees with complex skins, as they often fail to capture both external and internal fruit characteristics. By integrating multiple sensors, our approach overcomes these limitations, offering a more accurate and robust detection system. Significant differences were observed between pest-free and infested lychees. Pest-free lychees exhibited higher hardness, soluble sugars (11% higher in flesh, 7% higher in peel), vitamin C (50% higher in flesh, 2% higher in peel), polyphenols, anthocyanins, and ORAC values (26%, 9%, and 14% higher, respectively). The Vis/NIR data processed with SG+SNV+CARS yielded a partial least squares regression (PLSR) model with an R2 of 0.82, an RMSE of 0.18, and accuracy of 89.22%. The hyperspectral model, using SG+MSC+SPA, achieved an R2 of 0.69, an RMSE of 0.23, and 81.74% accuracy, while the X-ray method with support vector regression (SVR) reached an R2 of 0.69, an RMSE of 0.22, and 76.25% accuracy. Through feature-level fusion, Recursive Feature Elimination with Cross-Validation (RFECV), and dimensionality reduction using PCA, we optimized hyperparameters and developed a Random Forest model. This model achieved 92.39% accuracy in pest detection, outperforming the individual methods by 3.17%, 10.25%, and 16.14%, respectively. The multi-source fusion approach also improved the overall accuracy by 4.79%, highlighting the critical role of sensor fusion in enhancing pest detection and supporting the development of automated non-destructive systems for lychee stem borer detection. Full article
(This article belongs to the Section Precision and Digital Agriculture)
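Feature-level fusion of this kind amounts to concatenating each sensor's feature vector for a sample into one descriptor, then rescaling features across samples so that no modality's units dominate the classifier. A minimal sketch with made-up two-feature vectors per sensor (the real pipeline then applies RFECV, PCA, and a Random Forest):

```python
from statistics import mean, stdev

def fuse_features(*sensor_vectors):
    """Feature-level fusion: concatenate per-sensor feature vectors."""
    fused = []
    for vec in sensor_vectors:
        fused.extend(vec)
    return fused

def standardize(column):
    """Zero-mean, unit-variance scaling of one feature across samples."""
    mu, sigma = mean(column), stdev(column)
    return [(x - mu) / sigma for x in column]

# Hypothetical two-feature summaries from each sensing modality
vis_nir       = [0.82, 0.18]
hyperspectral = [0.69, 0.23]
xray          = [0.69, 0.22]
sample = fuse_features(vis_nir, hyperspectral, xray)  # 6-dimensional descriptor
```

Standardizing before dimensionality reduction matters because PCA directions are driven by variance; an unscaled X-ray grayscale feature could otherwise swamp the spectral features.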
Show Figures

Figure 1
<p>Schematic diagram of the visible/near-infrared spectroscopy acquisition device.</p>
Figure 2
<p>Schematic diagram of the hyperspectral imaging acquisition device.</p>
Figure 3
<p>Schematic diagram of the X-ray image acquisition system.</p>
Figure 4
<p>Multi-source information fusion flowchart.</p>
Figure 5
<p>(<b>a</b>) Raw visible/near-infrared spectrum, (<b>b</b>) visible/near-infrared spectrum after SG+SNV preprocessing.</p>
Figure 6
<p>(<b>a</b>) Raw hyperspectral spectrum, (<b>b</b>) hyperspectral spectrum after SG+MSC preprocessing.</p>
Figure 7
<p>PCA classification of grayscale values in X-ray imaging feature regions for stem-borer-infested and non-infested fruit.</p>
Figure 8
<p>(<b>a</b>) Litchi fruit without pests, (<b>b</b>) litchi fruit with pests.</p>
19 pages, 10002 KiB  
Article
Reliability Analysis of High-Pressure Tunnel System Under Multiple Failure Modes Based on Improved Sparrow Search Algorithm–Kriging–Monte Carlo Simulation Method
by Yingdong Wang, Chen Xing and Leihua Yao
Appl. Sci. 2024, 14(22), 10527; https://doi.org/10.3390/app142210527 - 15 Nov 2024
Viewed by 265
Abstract
It is often difficult for a structural safety design method based on deterministic analysis to fully and reasonably reflect the randomness of mechanical parameters, while the traditional reliability analysis method has a large calculation cost and low accuracy. In this paper, based on the seepage–stress coupling numerical model, the random variables affecting the reliability of the collaborative bearing of surrounding rock and lining structures are successfully identified. Then, the improved sparrow search algorithm (ISSA) is used to optimize the hyper-parameters of the Kriging surrogate model, in order to improve the computational efficiency and accuracy of the reliability analysis model. Finally, the ISSA-Kriging-MCS model is used to quantitatively evaluate the reliability of the surrounding rock-reinforced concrete lining structure under multiple failure modes, and the sensitivity of each random variable is discussed in depth. The results show that the high-pressure tunnel structure has high safety and reliability. The reliability indexes of each failure mode decrease with the increase in the coefficient of variation (COV) of random variables. In addition, the same random variable also exhibits varying degrees of influence in different failure modes. Full article
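The MCS stage can be illustrated in miniature: draw random inputs, count limit-state violations (g < 0), and convert the failure probability to a reliability index β = −Φ⁻¹(P<sub>f</sub>). The normal limit state g = R − S below is a textbook toy, not the paper's seepage–stress model, where g would instead be evaluated through the ISSA-tuned Kriging surrogate.

```python
import random
from statistics import NormalDist

def mc_reliability(limit_state, sample, n=200_000, seed=0):
    """Crude Monte Carlo estimate of failure probability and beta index."""
    rng = random.Random(seed)
    failures = sum(limit_state(sample(rng)) < 0 for _ in range(n))
    pf = failures / n
    beta = -NormalDist().inv_cdf(pf) if 0 < pf < 1 else float("inf")
    return pf, beta

# Toy limit state g = R - S: resistance R ~ N(10, 1), load S ~ N(5, 1),
# so the exact reliability index is 5 / sqrt(2), roughly 3.54.
pf, beta = mc_reliability(
    limit_state=lambda x: x[0] - x[1],
    sample=lambda rng: (rng.gauss(10, 1), rng.gauss(5, 1)),
)
```

Replacing the lambda with a surrogate-model call is exactly what makes surrogate-based MCS tractable: each of the 200,000 evaluations costs microseconds instead of a full seepage–stress finite-element run.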
Show Figures

Figure 1
<p>LHS process diagram.</p>
Figure 2
<p>Reliability analysis flow chart.</p>
Figure 3
<p>Example 1: real response surface. (<b>a</b>) Real response surface. (<b>b</b>) Limit state surface.</p>
Figure 4
<p>Example 2: real response surface.</p>
Figure 5
<p>Survey and location map of studied area [<a href="#B27-applsci-14-10527" class="html-bibr">27</a>].</p>
Figure 6
<p>The computational grid model. (<b>a</b>) A diagram of the overall model grid division. (<b>b</b>) A schematic diagram of the lining grid.</p>
Figure 7
<p>The relationship between axial stress and mechanical parameters.</p>
Figure 8
<p>The response surface of each random variable to the calculation results. The change in color gradient of all response surface plots (from bottom to top) represents the increasing reliability of the system.</p>
Figure 9
<p>Sensitivity analysis curve of failure mode 1.</p>
Figure 10
<p>Sensitivity analysis curve of failure mode 2.</p>
Figure 11
<p>Sensitivity analysis curve of failure mode 3.</p>
Figure 12
<p>The variation curves of random variables under different failure modes.</p>
27 pages, 3743 KiB  
Article
Performance Analysis and Improvement of Machine Learning with Various Feature Selection Methods for EEG-Based Emotion Classification
by Sherzod Abdumalikov, Jingeun Kim and Yourim Yoon
Appl. Sci. 2024, 14(22), 10511; https://doi.org/10.3390/app142210511 - 14 Nov 2024
Viewed by 682
Abstract
Emotion classification is a challenge in affective computing, with applications ranging from human–computer interaction to mental health monitoring. In this study, the classification of emotional states using electroencephalography (EEG) data was investigated. Specifically, we studied the efficacy of combining various feature selection methods with hyperparameter tuning of machine learning algorithms for accurate and robust emotion recognition. The following feature selection methods were explored: filter (SelectKBest with the analysis of variance (ANOVA) F-test), embedded (least absolute shrinkage and selection operator (LASSO) tuned using Bayesian optimization (BO)), and wrapper (genetic algorithm (GA)) methods. We also performed hyperparameter tuning of the machine learning algorithms using BO, and the performance of each method was assessed. Two EEG datasets, EEG Emotion and the DEAP dataset, containing 2548 and 160 features, respectively, were evaluated using random forest (RF), logistic regression, XGBoost, and support vector machine (SVM). For both datasets, all three feature selection methods consistently improved the accuracy of the models. For the EEG Emotion dataset, RF with LASSO achieved the best result among all tested methods, increasing the accuracy from 98.78% to 99.39%. In the DEAP dataset experiment, XGBoost with GA showed the best result, increasing the accuracy by 1.59% and 2.84% for valence and arousal, respectively. We also show that these results are superior to those of previous methods in the literature. Full article
(This article belongs to the Special Issue Advances in Biosignal Processing)
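The filter method named above scores each feature independently with a one-way ANOVA F-test and keeps the top k. A minimal dependency-free sketch of the idea (scikit-learn's SelectKBest with f_classif does the equivalent at scale):

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic for a single feature: ratio of
    between-class to within-class mean squares."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def select_k_best(features, labels, k):
    """Filter-style selection: rank features by F-score, keep top-k indices."""
    classes = sorted(set(labels))
    scores = []
    for j in range(len(features[0])):
        groups = [[row[j] for row, y in zip(features, labels) if y == c]
                  for c in classes]
        scores.append(anova_f(groups))
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

# Toy data: feature 0 separates the two classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
     [5.0, 5.1], [5.1, 1.2], [5.2, 8.8]]
y = [0, 0, 0, 1, 1, 1]
best = select_k_best(X, y, k=1)
```

Because each feature is scored in isolation, the filter approach is fast even for the 2548-feature EEG Emotion dataset, which is why it is commonly tried before the costlier wrapper (GA) search.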
Show Figures

Figure 1
<p>EEG brainwave dataset training.</p>
Figure 2
<p>Flowchart of GA.</p>
Figure 3
<p>Violin plots of statistical features in the EEG Emotion dataset: (<b>a</b>) mean, (<b>b</b>) mean difference (computed between windows), (<b>c</b>) min, (<b>d</b>) min difference (computed between windows), (<b>e</b>) min difference (computed for each quarter window), (<b>f</b>) max, (<b>g</b>) max difference (computed between windows), (<b>h</b>) max difference (computed for each quarter window), (<b>i</b>) standard deviation, (<b>j</b>) standard deviation difference (computed between windows), (<b>k</b>) log, (<b>l</b>) correlation, (<b>m</b>) entropy, (<b>n</b>) FFT.</p>
Figure 4
<p>Violin plot of ten randomly selected features included in the DEAP dataset.</p>
Figure 5
<p>FFT-based frequency analysis of the EEG dataset: randomly selected FFT of a sample with (<b>a</b>) positive and (<b>b</b>) negative emotion levels; emotion level analysis of the DEAP dataset: (<b>c</b>) neutral labels from the EEG Emotion dataset, (<b>d</b>) valence level, and (<b>e</b>) arousal level from the DEAP dataset.</p>
Figure 6
<p>Graph comparing the four performance indicators of feature selection methods on the EEG Emotion dataset: (<b>a</b>) filter-based feature selection method; (<b>b</b>) embedded-based feature selection method; (<b>c</b>) wrapper-based feature selection method.</p>
Figure 7
<p>Graph comparing the four performance indicators of feature selection methods on the DEAP dataset: (<b>a</b>) filter-based feature selection method; (<b>b</b>) embedded-based feature selection method; (<b>c</b>) wrapper-based feature selection method.</p>
Figure 8
<p>Correlation heatmaps: (<b>a</b>) before feature selection, (<b>b</b>) after feature selection for the EEG Emotion dataset, (<b>c</b>) before feature selection for the DEAP dataset, (<b>d</b>) after feature selection for the valence label in the DEAP dataset, and (<b>e</b>) after feature selection for the arousal label in the DEAP dataset.</p>
22 pages, 7765 KiB  
Article
Bayesian-Neural-Network-Based Approach for Probabilistic Prediction of Building-Energy Demands
by Akash Mahajan, Srijita Das, Wencong Su and Van-Hai Bui
Sustainability 2024, 16(22), 9943; https://doi.org/10.3390/su16229943 - 14 Nov 2024
Viewed by 460
Abstract
Reliable prediction of building-level energy demand is crucial for the building managers to optimize and regulate energy consumption. Conventional prediction models omit the uncertainties associated with demand over time; hence, they are mostly inaccurate and unreliable. In this study, a Bayesian neural network (BNN)-based probabilistic prediction model is proposed to tackle this challenge. By quantifying the uncertainty, BNNs provide probabilistic predictions that capture the variations in the energy demand. The proposed model is trained and evaluated on a subset of the building operations dataset of Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, which includes diverse attributes related to climate and key building-performance indicators. We have performed thorough hyperparameter tuning and used fixed-horizon validation to evaluate trained models on various test data to assess generalization ability. To validate the results, quantile random forest (QRF) was used as a benchmark. This study compared BNN with LSTM, showing that BNN outperformed LSTM in uncertainty quantification. Full article
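A BNN yields a distribution over outputs, so each prediction comes from repeated stochastic forward passes with weights drawn from the posterior. A minimal sketch of turning such posterior draws into a point estimate and an empirical 90% interval; the draws below are synthetic placeholders, not model output.

```python
from statistics import mean

def prediction_interval(samples, alpha=0.1):
    """Summarize repeated stochastic forward passes into a point
    prediction and a central (1 - alpha) interval via empirical
    quantiles of the posterior predictive draws."""
    s = sorted(samples)
    lo = s[int(len(s) * alpha / 2)]
    hi = s[int(len(s) * (1 - alpha / 2)) - 1]
    return mean(samples), (lo, hi)

# 100 hypothetical posterior draws of demand for one timestep
draws = [50 + 0.1 * i for i in range(100)]
point, (lo, hi) = prediction_interval(draws, alpha=0.1)
```

This is the sense in which a probabilistic model "captures the variations in the energy demand": a building manager gets a band the demand should fall in 90% of the time, not just a single number.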
Show Figures

Figure 1
<p>High-level floor plan of building 59 with locations of indoor temperature sensors, windows, and entrances.</p>
Figure 2
<p>Total electricity and HVAC demand.</p>
Figure 3
<p>Schematic diagram of a BNN.</p>
Figure 4
<p>Data distribution of HVAC (<b>top</b>) and electricity (<b>bottom</b>) time series before and after preprocessing (right to left).</p>
Figure 5
<p>Time-series visualization of (<b>a</b>) day, week, and month HVAC prediction (top to bottom), (<b>b</b>) day, week, and month electricity prediction (top to bottom).</p>
Figure 6
<p>Kernel density estimation plot for actual values and predicted values of (<b>a</b>) HVAC and (<b>b</b>) electricity on the complete test dataset.</p>
Figure 7
<p>Observed tall spikes during prediction of HVAC demand on the complete test dataset.</p>
Figure A1
<p>One-day, one-week, and one-month electricity predictions using the LSTM-based model.</p>
Figure A2
<p>One-day, one-week, and one-month HVAC predictions using the LSTM-based model.</p>
Figure A3
<p>One-day, one-week, and one-month electricity predictions using the BNN-based models.</p>
Figure A4
<p>One-day, one-week, and one-month HVAC predictions using the BNN-based model.</p>
Figure A5
<p>One-year predictions made by the electricity and HVAC models.</p>
Full article ">
20 pages, 552 KiB  
Article
SBNNR: Small-Size Bat-Optimized KNN Regression
by Rasool Seyghaly, Jordi Garcia, Xavi Masip-Bruin and Jovana Kuljanin
Future Internet 2024, 16(11), 422; https://doi.org/10.3390/fi16110422 - 14 Nov 2024
Viewed by 309
Abstract
Small datasets are frequent in some scientific fields. Such datasets are usually created due to the difficulty or cost of producing laboratory and experimental data. On the other hand, researchers are interested in using machine learning methods to analyze this scale of data. For this reason, in some cases, low-performance, overfitting models are developed for small-scale data. As a result, it appears necessary to develop methods for dealing with this type of data. In this research, we provide a new and innovative framework for regression problems with a small sample size. The base of our proposed method is the K-nearest neighbors (KNN) algorithm. For feature selection, instance selection, and hyperparameter tuning, we use the bat optimization algorithm (BA). Generative Adversarial Networks (GANs) are employed to generate synthetic data, effectively addressing the challenges associated with data sparsity. Concurrently, Deep Neural Networks (DNNs), as a deep learning approach, are utilized for feature extraction from both synthetic and real datasets. This hybrid framework integrates KNN, DNN, and GAN as foundational components and is optimized in multiple aspects (features, instances, and hyperparameters) using BA. The outcomes exhibit an enhancement of up to 5% in the coefficient of determination (R2 score) using the proposed method compared to the standard KNN method optimized through grid search. Full article
(This article belongs to the Special Issue Deep Learning Techniques Addressing Data Scarcity)
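At the core of the framework is ordinary KNN regression; the bat algorithm's search over k, the feature subset, and the instance subset described above wraps around this core and is not shown. A minimal sketch with toy one-dimensional data:

```python
from math import dist

def knn_regress(train_X, train_y, query, k=3):
    """Plain KNN regression: average the targets of the k training
    points nearest to the query (Euclidean distance)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: dist(train_X[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k

# Toy 1-D training set; the outlier at x = 10 is far enough away
# that it never enters the k = 3 neighborhood of the query.
X = [(0.0,), (1.0,), (2.0,), (10.0,)]
y = [0.0, 1.0, 2.0, 10.0]
pred = knn_regress(X, y, (1.5,), k=3)
```

Because each prediction depends only on a handful of neighbors, KNN degrades gracefully on small samples, which is why it is a sensible base learner here; the BA then tunes which neighbors, features, and k it gets to use.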
Show Figures

Figure 1
<p>Predicted (red points for test data and blue points for train data) compared to actual values (green line) of GridCV-optimized KNN on the Servo dataset.</p>
Figure 2
<p>Predicted (red points for test data and blue points for train data) compared to actual values (green line) of the proposed method on the Servo dataset.</p>
Figure 3
<p>Predicted (red points for test data and blue points for train data) compared to actual values (green line) of the proposed method on the BONE dataset.</p>
Figure 4
<p>Predicted (red points for test data and blue points for train data) compared to actual values (green line) of the proposed method on the FAT dataset.</p>