Search Results (1,682)

Search Parameters:
Keywords = VGG19

17 pages, 6079 KiB  
Article
Secure Hybrid Deep Learning for MRI-Based Brain Tumor Detection in Smart Medical IoT Systems
by Nermeen Gamal Rezk, Samah Alshathri, Amged Sayed, Ezz El-Din Hemdan and Heba El-Behery
Diagnostics 2025, 15(5), 639; https://doi.org/10.3390/diagnostics15050639 - 6 Mar 2025
Viewed by 48
Abstract
Background/Objectives: Brain tumors are among the most aggressive diseases, significantly contributing to human mortality. Typically, the classification of brain tumors is performed through a biopsy, which is often delayed until brain surgery is necessary. An automated image classification technique is crucial for accelerating diagnosis, reducing the need for invasive procedures, and minimizing the risk of manual diagnostic error by radiologists. Additionally, the security of sensitive MRI images remains a major concern, with robust encryption methods required to protect patient data from unauthorized access and breaches in Medical Internet of Things (MIoT) systems. Methods: This study proposes a secure and automated MRI image classification system that integrates chaotic and Arnold encryption techniques with hybrid deep learning models using VGG16 and a deep neural network (DNN). The methodology ensures MRI image confidentiality while enabling accurate brain tumor classification without compromising performance. Results: The proposed system demonstrated high classification performance under both encryption scenarios. For chaotic encryption, it achieved an accuracy of 93.75%, precision of 94.38%, recall of 93.75%, and an F-score of 93.67%. For Arnold encryption, the model attained an accuracy of 94.1%, precision of 96.9%, recall of 94.1%, and an F-score of 96.6%. These results indicate that encrypted images can still be classified effectively, ensuring both security and diagnostic accuracy. Conclusions: The proposed hybrid deep learning approach provides a secure, accurate, and efficient solution for brain tumor detection in MIoT-based healthcare applications. By encrypting MRI images before classification, the system ensures patient data confidentiality while maintaining high diagnostic performance. This approach can empower radiologists and healthcare professionals worldwide, enabling early and secure brain tumor diagnosis without the need for invasive procedures.
(This article belongs to the Special Issue Artificial Intelligence in Brain Diseases)
Figures:
Figure 1: A typical brain tumor detection system.
Figure 2: A typical IoT-based brain tumor detection model.
Figure 3: Baker map for an 8 × 8 matrix.
Figure 4: Basic VGG16 neural network.
Figure 5: Proposed hybrid VGG19-DNN system for tumor detection.
Figure 6: Encrypted MRI images for normal and tumor cases.
Figure 7: Results of the proposed system’s accuracy when using Arnold and chaotic encryption algorithms.
Figure 8: Results of the proposed system’s precision when using Arnold and chaotic encryption algorithms.
Figure 9: Results of the proposed system’s recall when using Arnold and chaotic encryption algorithms.
Figure 10: Results of the proposed system’s F1-score when using Arnold and chaotic encryption algorithms.
Figure 11: Proposed remote monitoring framework for initial brain tumor detection.
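The scrambling-before-classification idea in this entry is straightforward to prototype. Below is a minimal NumPy sketch of the Arnold (cat) map, one of the two encryption schemes the abstract names; the iteration count, image size, and exact map variant are assumptions here, not the authors' parameters.

```python
import numpy as np

def arnold_scramble(img: np.ndarray, iterations: int = 10) -> np.ndarray:
    """Arnold cat map on a square image: (x, y) -> (x + y mod N, x + 2y mod N)."""
    n = img.shape[0]
    assert img.shape[0] == img.shape[1], "the cat map needs a square image"
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    out = img.copy()
    for _ in range(iterations):
        nxt = np.empty_like(out)
        nxt[(x + y) % n, (x + 2 * y) % n] = out[x, y]  # bijective pixel shuffle
        out = nxt
    return out

def arnold_unscramble(img: np.ndarray, iterations: int = 10) -> np.ndarray:
    """Inverse map: (x, y) -> (2x - y mod N, y - x mod N)."""
    n = img.shape[0]
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    out = img.copy()
    for _ in range(iterations):
        nxt = np.empty_like(out)
        nxt[(2 * x - y) % n, (y - x) % n] = out[x, y]
        out = nxt
    return out

# A 256 x 256 stand-in for an MRI slice round-trips exactly.
mri = np.random.rand(256, 256).astype(np.float32)
enc = arnold_scramble(mri)
assert np.allclose(arnold_unscramble(enc), mri)
```

Because the map only permutes pixel positions, the scrambled images retain statistics a CNN can learn from, which is consistent with the encrypted-domain accuracy the abstract reports.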
23 pages, 10794 KiB  
Article
Hand–Eye Separation-Based First-Frame Positioning and Follower Tracking Method for Perforating Robotic Arm
by Handuo Zhang, Jun Guo, Chunyan Xu and Bin Zhang
Appl. Sci. 2025, 15(5), 2769; https://doi.org/10.3390/app15052769 - 4 Mar 2025
Viewed by 221
Abstract
In subway tunnel construction, current hand–eye integrated drilling robots use a camera mounted on the drilling arm for image acquisition. However, dust interference and long-distance operation cause a decline in image quality, affecting the stability and accuracy of the visual recognition system. Additionally, the computational complexity of high-precision detection models limits deployment on resource-constrained edge devices, such as industrial controllers. To address these challenges, this paper proposes a dual-arm tunnel drilling robot system with hand–eye separation, utilizing a first-frame localization and follower tracking method. The vision arm (“eye”) provides real-time position data to the drilling arm (“hand”), ensuring accurate and efficient operation. The study employs an RFBNet model for initial frame localization, replacing the original VGG16 backbone with ShuffleNet V2, whose channel splitting and depthwise separable convolutions reduce model parameters by 30% (135.5 MB vs. 146.3 MB) and cut computational complexity. Additionally, the GIoU loss function is introduced to replace the traditional IoU, further optimizing bounding box regression by computing the minimum enclosing box. This resolves the gradient vanishing problem of traditional IoU for non-overlapping boxes and improves average precision (AP) by 3.3% (from 0.91 to 0.94). For continuous tracking, a SiamRPN-based algorithm combined with Kalman filtering and PID control ensures robustness against occlusions and nonlinear disturbances, increasing the success rate by 1.6% (0.639 vs. 0.629). Experimental results show that this approach significantly improves tracking accuracy and operational stability, achieving 31 FPS inference speed on edge devices and providing a deployable solution for the safety and efficiency needs of tunnel construction.
Figures:
Figure 1: Hand–eye separation schematic.
Figure 2: Network structure of the initial frame positioning model based on improved RFBNet.
Figure 3: ShuffleNet V2 block.
Figure 4: Comparison of tracking results of the proposed method with the baseline algorithm and the classical algorithm.
Figure 5: Tracking and positioning process diagram of the drilling robot arm based on SiamRPN.
Figure 6: PID control system block diagram.
Figure 7: Comparison of initial frame positioning model effect of improved RFBNet.
Figure 8: Comparison of success rate and accuracy rate between the proposed algorithm and the baseline algorithm.
Figure 9: Comparison of success rate and accuracy rate between the proposed algorithm and the classical algorithm.
Figure 10: Comparison of accuracy rates between the proposed method and the classical algorithm.
Figure 11: Comparison of accuracy rates between the proposed method and the classical algorithm.
Figure 12: Comparison of tracking results of the proposed method with the baseline algorithm and the classical algorithm.
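For reference, the GIoU substitution described in this entry has a compact closed form: GIoU = IoU − |C \ (A ∪ B)| / |C|, where C is the smallest box enclosing both A and B. A generic PyTorch sketch for axis-aligned (x1, y1, x2, y2) boxes, not the authors' implementation:

```python
import torch

def giou_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """GIoU loss for boxes given as (x1, y1, x2, y2) with shape (N, 4)."""
    # Intersection
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / union.clamp(min=1e-7)

    # Smallest enclosing box C
    cx1 = torch.min(pred[:, 0], target[:, 0])
    cy1 = torch.min(pred[:, 1], target[:, 1])
    cx2 = torch.max(pred[:, 2], target[:, 2])
    cy2 = torch.max(pred[:, 3], target[:, 3])
    c_area = (cx2 - cx1) * (cy2 - cy1)

    giou = iou - (c_area - union) / c_area.clamp(min=1e-7)
    return (1.0 - giou).mean()

# Non-overlapping boxes still produce a useful gradient signal:
p = torch.tensor([[0., 0., 1., 1.]], requires_grad=True)
t = torch.tensor([[2., 2., 3., 3.]])
giou_loss(p, t).backward()  # p.grad is nonzero even though IoU = 0
```

Unlike 1 − IoU, this loss keeps a nonzero gradient when the boxes do not overlap, which is exactly the vanishing-gradient issue the abstract says GIoU resolves.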
21 pages, 17670 KiB  
Article
Advancing Traffic Sign Recognition: Explainable Deep CNN for Enhanced Robustness in Adverse Environments
by Ilyass Benfaress, Afaf Bouhoute and Ahmed Zinedine
Computers 2025, 14(3), 88; https://doi.org/10.3390/computers14030088 - 4 Mar 2025
Viewed by 199
Abstract
This paper presents a traffic sign recognition (TSR) system based on a deep convolutional neural network (CNN) architecture, which proves to be extremely accurate in recognizing traffic signs under challenging conditions such as bad weather, low-resolution images, and various environmental-impact factors. The proposed CNN is compared with other architectures, including GoogLeNet, AlexNet, DarkNet-53, ResNet-34, VGG-16, and MicronNet-BF. Experimental results confirm that the proposed CNN significantly improves recognition accuracy compared to existing models. To make the model interpretable, we utilize explainable AI (XAI) approaches, specifically Gradient-weighted Class Activation Mapping (Grad-CAM), which gives insight into how the system reaches its decisions. Evaluation on the Tsinghua-Tencent 100K (TT100K) traffic sign dataset showed that the proposed method significantly outperformed existing state-of-the-art methods. Additionally, we evaluated our model on the German Traffic Sign Recognition Benchmark (GTSRB) dataset to ensure generalization, demonstrating its ability to perform well in diverse traffic sign conditions. Perturbations such as noise, contrast changes, blurring, and zoom effects were added to enhance performance in real applications. These results indicate both the strength and reliability of the proposed CNN architecture for TSR tasks and that it is a good option for integration into intelligent transportation systems (ITSs).
Figures:
Figure 1: Flowchart with all method steps.
Figure 2: Different weather conditions, different daytime hours, and intricate situations.
Figure 3: A list of traffic signs and their labels in the TT100K dataset [34], grouped according to their classification (red, blue, and yellow denote prohibitory, mandatory, and warning signs, respectively).
Figure 4: Distribution of classes by instance-count range.
Figure 5: Distribution of traffic sign instances by class.
Figure 6: Examples of data augmentation techniques.
Figure 7: CNN architecture for TSR.
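Grad-CAM, as used in this entry, needs only the activations and gradients of one convolutional layer. A minimal PyTorch sketch on torchvision's stock VGG-16 (the paper applies it to its own CNN; the layer choice and input size here are assumptions):

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Hook the last conv layer to capture its activations and their gradients.
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
acts, grads = {}, {}
layer = model.features[28]  # last Conv2d of VGG-16
layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

def grad_cam(image: torch.Tensor, class_idx: int = None) -> torch.Tensor:
    """Return an (H, W) heatmap for the predicted (or given) class."""
    logits = model(image)                    # image: (1, 3, 224, 224)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = F.relu((weights * acts["a"]).sum(dim=1))       # weighted map sum
    cam = F.interpolate(cam.unsqueeze(0), size=image.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    return cam / cam.max().clamp(min=1e-7)

heatmap = grad_cam(torch.randn(1, 3, 224, 224))
```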
22 pages, 3211 KiB  
Article
Development of a Software and Hardware Complex for Monitoring Processes in Production Systems
by Vadim Pechenin, Rustam Paringer, Nikolay Ruzanov and Aleksandr Khaimovich
Sensors 2025, 25(5), 1527; https://doi.org/10.3390/s25051527 - 28 Feb 2025
Viewed by 291
Abstract
The article presents a detailed exposition of a hardware–software complex developed to enhance the productivity of accounting for the state of the production process. This complex facilitates the automated identification of parts in production containers and the utilisation of supplementary markers. The complex comprises a mini computer (a system unit in an industrial version) with connected cameras (IP or web), a communication module with LED and signal lamps, and developed software. The cascade algorithm developed for the detection of labels and objects in containers employs trained convolutional neural networks (YOLO and VGG19), thereby enhancing recognition accuracy while concurrently reducing the size of the training sample for the neural networks. The efficacy of the developed system was assessed through laboratory experimentation, which yielded experimental results demonstrating 93% accuracy in part detection using the developed algorithm, in comparison to the 72% accuracy achieved through the traditional approach employing a single neural network.
(This article belongs to the Special Issue Computer Vision and Sensors-Based Application for Intelligent Systems)
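The cascade this entry describes (a detector proposes container/marker regions, a classifier labels each crop) could be wired up roughly as follows. The weight files, class count, and confidence threshold are placeholders; the paper's trained models are not reproduced here.

```python
import torch
from torchvision import models, transforms
from ultralytics import YOLO
from PIL import Image

detector = YOLO("containers_yolo.pt")               # hypothetical fine-tuned weights
classifier = models.vgg19(weights=None)
classifier.classifier[6] = torch.nn.Linear(4096, 5)  # e.g., 5 part types (assumed)
classifier.load_state_dict(torch.load("parts_vgg19.pt"))  # hypothetical weights
classifier.eval()

prep = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def detect_parts(frame: Image.Image, conf: float = 0.5):
    """Stage 1: detect containers/markers; stage 2: classify each crop."""
    results = detector(frame, conf=conf)[0]
    labels = []
    for box in results.boxes.xyxy.tolist():
        crop = frame.crop(tuple(box))
        with torch.no_grad():
            logits = classifier(prep(crop).unsqueeze(0))
        labels.append(int(logits.argmax(dim=1)))
    return labels
```

Splitting detection from fine-grained classification is what lets each network train on a smaller, more focused sample, matching the data-efficiency claim in the abstract.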
28 pages, 3159 KiB  
Systematic Review
Artificial Vision Systems for Fruit Inspection and Classification: Systematic Literature Review
by Ignacio Rojas Santelices, Sandra Cano, Fernando Moreira and Álvaro Peña Fritz
Sensors 2025, 25(5), 1524; https://doi.org/10.3390/s25051524 - 28 Feb 2025
Viewed by 269
Abstract
Fruit sorting and quality inspection using computer vision is a key tool to ensure quality and safety in the fruit industry. This study presents a systematic literature review, following the PRISMA methodology, with the aim of identifying different fields of application, typical hardware configurations, and the techniques and algorithms used for fruit sorting. In this study, 56 articles published between 2015 and 2024 were analyzed, selected from relevant databases such as Web of Science and Scopus. The results indicate that the main fields of application include orchards, industrial processing lines, and final consumption points, such as supermarkets and homes, each with specific technical requirements. Regarding hardware, RGB cameras and LED lighting systems predominate in controlled applications, although multispectral cameras are also important in complex applications such as foreign material detection. Processing techniques include traditional algorithms such as Otsu and Sobel for segmentation and deep learning models such as ResNet and VGG, often optimized with transfer learning for classification. This systematic review could provide a basic guide for the development of fruit quality inspection and classification systems in different environments.
Figures:
Figure 1: Search strategy used in databases.
Figure 2: Article selection scheme according to PRISMA [13].
Figure 3: Distribution of studies over the period analyzed.
Figure 4: Distribution of studies by geographical location.
Figure 5: Percentage distribution according to fruit type.
Figure 6: Percentage distribution by classification objective.
Figure 7: Carrot-sorting machine [33].
Figure 8: Stylization filter applied to improve object detection [61].
Figure 9: Different transformations applying data augmentation: (a) original image, (b) rotation, (c) darkening, (d) brightening, (e) pretzel, and (f) blurring [21].
Figure 10: Percentage distribution of each type of application.
Figure 11: Temporal evolution of the use of different classification algorithms.
Figure 12: Deep learning algorithms used in the studies.
Figure 13: Traditional algorithms used in the studies.
Figure 14: Environmental conditions of deep learning algorithms used in the studies.
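As a concrete example of the traditional pipeline the review names, Otsu thresholding can segment fruit from a uniform background and Sobel gradients can outline it. A short OpenCV sketch (the image path is a placeholder):

```python
import cv2
import numpy as np

# Load a fruit image and convert to grayscale (path is a placeholder).
img = cv2.imread("apple.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Otsu's method picks the global threshold minimizing intra-class variance.
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Sobel gradients highlight edges (defect boundaries, stem ends, etc.).
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
edges = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

# Keep only edges inside the Otsu foreground mask.
fruit_edges = cv2.bitwise_and(edges, edges, mask=mask)
```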
17 pages, 2886 KiB  
Article
Classification of Cloud Particle Habits Using Transfer Learning with a Deep Convolutional Neural Network
by Yefeng Xu, Ruili Jiao, Qiubai Li and Minsong Huang
Atmosphere 2025, 16(3), 294; https://doi.org/10.3390/atmos16030294 - 28 Feb 2025
Viewed by 184
Abstract
The habits of cloud particles are a significant factor impacting microphysical processes in clouds. The accurate identification of cloud particle shapes within clouds is a fundamental requirement for calculating various cloud microphysical parameters. In this study, we established a cloud particle image dataset encompassing nine distinct habit categories, totaling 8100 images. These images were captured using three probes with varying resolutions: the Cloud Particle Imager (CPI), the Two-Dimensional Stereo Probe (2D-S), and the High-Volume Precipitation Spectrometer (HVPS). Furthermore, this study performs a comparative analysis of ten different transfer learning (TL) models based on this dataset. It was found that the VGG-16 model exhibits the highest classification accuracy, reaching 97.90%. This model also demonstrates the highest recall, precision, and F1 measure. The results indicate that the VGG-16 model can reliably classify the shapes of ice crystal particles measured by both line scan imagers (2D-S, HVPS) and an area scan imager (CPI).
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)
Figures:
Figure 1: Typical examples of ice crystal shapes for the HVPS, 2D-S, and CPI probes (images standardized to the same size).
Figure 2: Overview of the training methodology.
Figure 3: Accuracy (top) and cross-entropy loss curves (bottom) of ten CNN models based on transfer learning during the training and testing stages.
Figure 4: Performance of the VGG-16 in each category on precision (P), recall (R), and F1 measure (F1) with the test dataset.
Figure 5: Confusion matrix of ice crystal particle classification results for the VGG-16 classification model on the test set (maximum possible value in the matrix is 180).
Figure 6: Confusion matrix of ice crystal particle classification results for the VGG-16 classification model on the CPI (a), 2D-S (b), and HVPS (c) probe test sets (maximum possible value in the matrix is 60).
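The transfer-learning setup described here maps naturally onto torchvision. A minimal sketch of fine-tuning VGG-16 for the nine habit categories, assuming the convolutional trunk is frozen and only the classifier head is retrained (the paper does not specify which layers were frozen or the input resolution):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from ImageNet weights, freeze the convolutional trunk,
# and retrain only the classifier head for the 9 habit categories.
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
for p in model.features.parameters():
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, 9)  # 9 cloud-particle habits

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch of particle images (shapes assumed).
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 9, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```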
16 pages, 5408 KiB  
Technical Note
Predicting the Spatial Distribution of VLF Transmitter Signals Using Transfer Learning Models
by Hanqing Shi, Wei Xu, Binbin Ni, Xudong Gu, Shiwei Wang, Jingyuan Feng, Wen Cheng, Wenchen Ma, Haotian Xu, Yudi Pan and Dongfang Zhai
Remote Sens. 2025, 17(5), 871; https://doi.org/10.3390/rs17050871 - 28 Feb 2025
Viewed by 133
Abstract
The D-region ionosphere (60–100 km altitude) is critical for radio communication and space weather research but cannot be easily measured, because it is too low for satellites and too high for balloons. The most effective technique is remote sensing by measuring Very-Low-Frequency (VLF, 3–30 kHz) waves emitted from man-made transmitters, traditionally used to estimate the average ionospheric condition between a transmitter and receiver. Recently, various methods have been proposed to remotely sense the D-region ionosphere over large areas using network observations of VLF transmitter signals. The key component of these methods is the VLF propagation model, and the Long-Wavelength Propagation Capability (LWPC) model is employed in most cases due to its relatively fast computation speed. However, its computation time is still too long for real-time remote sensing. To overcome this limitation, we have proposed a neural network model to replace the LWPC model and shorten the computation time of VLF propagation. This model is obtained using transfer learning, by retraining the last three layers of the well-established VGG16, GoogLeNet, and ResNet architectures. We have tested different methods of organizing the input data for these neural network models and verified their performance using the validation dataset and real measurements. Among the three models, GoogLeNet outperforms the other two, and the root mean squared error (RMSE) with respect to LWPC results is as low as 0.334. Moreover, the proposed neural network model dramatically reduces the computation time: calculating the signal distribution near the transmitter takes 1184 s with the LWPC model but 0.87 s with the present neural network model. The performance of the model is also excellent for ionospheric conditions not included in the validation dataset. It is therefore robust and can be used to remotely sense the D-region ionosphere over large areas in real time, serving various scientific and engineering needs.
Figures:
Figure 1: The h′ and β values used for the calculation of VLF amplitude along a single propagation path, corresponding to an azimuth angle of 0° near NWC. The panels on the left show h′ and β in the area surrounding NWC; the one on the right shows the amplitude distribution.
Figure 2: Three methods to convert the ionospheric parameters near the transmitter into the input of the neural network models. The input in method 1 is the 1 × 400 matrix from one propagation path, reshaped into 20 × 20. The input in method 2 is the 360 × 400 matrix from 360 propagation paths. The input in method 3 is the figure mapping the data from 360 propagation paths to the corresponding geographic latitude and longitude coordinates.
Figure 3: The basic structure of a convolutional neural network. The convolution layers capture features of the input data, the pooling layers reduce the complexity of the features, and the fully connected layer connects the output to the network.
Figure 4: Flowchart of the transfer learning model. The weights of the earlier layers, such as the convolution layers used to capture features, are frozen; the last several layers are retrained to adapt the pre-trained model.
Figure 5: Training loss and validation loss of (a) input method 1, (b) input method 2, and (c) input method 3.
Figure 6: The (a) h′ and (b) β values used for the calculation of signal amplitude near NWC. (c) The spatial distribution of signal amplitude near NWC, used for model verification.
Figure 7: Upper row: the VLF amplitude obtained using (a) input method 1, (c) input method 2, and (e) input method 3. Bottom row: (b,d,f) the difference between the upper row and the results shown in Figure 6c.
Figure 8: (a) The great circle paths between the NWC, JJI, and VTX transmitters and the receivers in Suizhou, Wuhan, and Leshan, China. (b) β and (c) h′ derived from the VLF measurements in Suizhou, Wuhan, and Leshan. The signal amplitude near NWC calculated using (d) LWPC and (e) the GoogLeNet model (input method 2). (f) The difference between the LWPC results and the results obtained using the GoogLeNet model.
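The retraining pattern described here (freeze a pre-trained backbone, retrain the final layers to emit a signal-amplitude map) can be sketched as follows; the output grid size, frozen depth, and loss function are assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

OUT_H, OUT_W = 20, 20  # assumed size of the predicted amplitude grid

model = models.googlenet(weights=models.GoogLeNet_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False               # freeze the pre-trained trunk
model.fc = nn.Linear(1024, OUT_H * OUT_W)  # retrained regression head

criterion = nn.MSELoss()  # regress LWPC-style amplitude values
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# Dummy batch: ionospheric h'/beta inputs rendered as 3-channel images.
x = torch.randn(4, 3, 224, 224)
target = torch.randn(4, OUT_H * OUT_W)
model.eval()  # keep aux classifiers and batch-norm statistics fixed
loss = criterion(model(x), target)
loss.backward()
optimizer.step()
```

A regression head like this replaces the classification layer, which is why only the last layers need retraining while the frozen trunk supplies generic spatial features.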
28 pages, 4958 KiB  
Article
Application of Multiple Deep Learning Architectures for Emotion Classification Based on Facial Expressions
by Cheng Qian, João Alexandre Lobo Marques, Auzuir Ripardo de Alexandria and Simon James Fong
Sensors 2025, 25(5), 1478; https://doi.org/10.3390/s25051478 - 27 Feb 2025
Viewed by 193
Abstract
Facial expression recognition (FER) is essential for discerning human emotions and is applied extensively in big data analytics, healthcare, security, and user experience enhancement. This study presents a comprehensive evaluation of ten state-of-the-art deep learning models—VGG16, VGG19, ResNet50, ResNet101, DenseNet, GoogLeNet V1, MobileNet V1, EfficientNet V2, ShuffleNet V2, and RepVGG—on the task of facial expression recognition using the FER2013 dataset. Key performance metrics, including test accuracy, training time, and weight file size, were analyzed to assess the learning efficiency, generalization capabilities, and architectural innovations of each model. EfficientNet V2 and ResNet50 emerged as top performers, achieving high accuracy and stable convergence using compound scaling and residual connections, enabling them to capture complex emotional features with minimal overfitting. DenseNet, GoogLeNet V1, and RepVGG also demonstrated strong performance, leveraging dense connectivity, inception modules, and re-parameterization techniques, though they exhibited slower initial convergence. In contrast, lightweight models such as MobileNet V1 and ShuffleNet V2, while excelling in computational efficiency, faced limitations in accuracy, particularly in challenging emotion categories like “fear” and “disgust”. The results highlight the critical trade-offs between computational efficiency and predictive accuracy, emphasizing the importance of selecting appropriate architecture based on application-specific requirements. This research contributes to ongoing advancements in deep learning, particularly in domains such as facial expression recognition, where capturing subtle and complex patterns is essential for high-performance outcomes.
(This article belongs to the Section Internet of Things)
Figures:
Figure 1: Standard samples of the FER2013 database.
Figure 2: Non-standard samples of the FER2013 database.
Figure 3: Confusion matrix of VGG16.
Figure 4: Confusion matrix of VGG19.
Figure 5: Confusion matrix of ResNet50.
Figure 6: Confusion matrix of ResNet101.
Figure 7: Confusion matrix of DenseNet.
Figure 8: Confusion matrix of GoogLeNet V1.
Figure 9: Confusion matrix of MobileNet V1.
Figure 10: Confusion matrix of EfficientNet V2.
Figure 11: Confusion matrix of ShuffleNet V2.
Figure 12: Confusion matrix of RepVGG.
Figure 13: Accuracy per epoch for each model.
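The three comparison axes in this study (test accuracy, time, and weight-file size) can be gathered with a few lines per model. A sketch, assuming a FER2013 test loader named fer_test_loader exists:

```python
import os
import time
import torch
from torchvision import models

def profile(model: torch.nn.Module, name: str, loader, device="cpu"):
    """Report parameter count, on-disk weight size, and test accuracy."""
    n_params = sum(p.numel() for p in model.parameters())
    torch.save(model.state_dict(), f"{name}.pt")
    size_mb = os.path.getsize(f"{name}.pt") / 2**20

    model.to(device).eval()
    correct = total = 0
    start = time.perf_counter()
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(dim=1)
            correct += (pred == y.to(device)).sum().item()
            total += y.numel()
    secs = time.perf_counter() - start
    print(f"{name}: {n_params/1e6:.1f}M params, {size_mb:.1f} MB, "
          f"acc={correct/total:.4f}, eval time={secs:.1f}s")

# e.g., two of the ten architectures (FER2013 has 7 emotion classes):
# profile(models.vgg16(num_classes=7), "vgg16", fer_test_loader)
# profile(models.shufflenet_v2_x1_0(num_classes=7), "shufflenet_v2", fer_test_loader)
```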
13 pages, 2215 KiB  
Article
Disease Infection Classification in Coconut Tree Based on an Enhanced Visual Geometry Group Model
by Xiaocun Huang, Mustafa Muwafak Alobaedy, Yousef Fazea, S. B. Goyal and Zilong Deng
Processes 2025, 13(3), 689; https://doi.org/10.3390/pr13030689 - 27 Feb 2025
Viewed by 231
Abstract
The coconut is a perennial, evergreen tree in the palm family that belongs to the monocotyledonous group. The coconut plant holds significant economic value due to the diverse functions served by each of its components. Any ailment that impacts the productivity of the coconut plantation will ultimately have repercussions on the associated industries and the sustenance of the families reliant on the coconut economy. Deep learning has the potential to significantly alter the landscape of plant disease detection. Convolutional neural networks are trained using extensive datasets that include annotated images of plant diseases. This training enables the models to develop high-level proficiency in identifying complex patterns and extracting disease-specific features with exceptional accuracy. To address the need for a large training dataset, an Enhanced Visual Geometry Group (EVGG16) model utilizing transfer learning was developed for detecting disease infections in coconut trees. The EVGG16 model achieves effective training with a limited quantity of data by reusing the convolution- and pooling-layer weights of a pre-trained Visual Geometry Group (VGG16) network. Through hyperparameter tuning and optimized training batch configurations, we achieved enhanced recognition accuracy, facilitating the development of more robust and stable predictive models. Experimental results demonstrate that the EVGG16 model achieved a 97.70% accuracy rate, highlighting its strong performance and suitability for practical applications in disease detection for plantations.
(This article belongs to the Special Issue Transfer Learning Methods in Equipment Reliability Management)
Figures:
Figure 1: Components of the EVGG16 model for disease infection classification in coconut trees.
Figure 2: Examples of coconut tree disease.
Figure 3: CNN based on VGG16 [23].
Figure 4: Confusion matrix of the prediction results.
Figure 5: Recognition accuracy, precision, F1-score, and recall of different models.
Figure 6: Convergence comparison.
26 pages, 4102 KiB  
Article
A New Hybrid ConvViT Model for Dangerous Farm Insect Detection
by Anil Utku, Mahmut Kaya and Yavuz Canbay
Appl. Sci. 2025, 15(5), 2518; https://doi.org/10.3390/app15052518 - 26 Feb 2025
Viewed by 276
Abstract
This study proposes a novel hybrid convolution and vision transformer model (ConvViT) designed to detect harmful insect species that adversely affect agricultural production and play a critical role in global food security. By utilizing a dataset comprising images of 15 distinct insect species, the suggested approach combines the strengths of traditional convolutional neural networks (CNNs) with vision transformer (ViT) architectures. This integration aims to capture local-level morphological features effectively while analyzing global spatial relationships more comprehensively. While the CNN structure excels at discerning fine morphological details of insects, the ViT’s self-attention mechanism enables a holistic evaluation of their overall configurations. Several data preprocessing steps were implemented to enhance the model’s performance, including data augmentation techniques and strategies to ensure class balance. In addition, hyperparameter optimization contributed to more stable and robust model training. Experimental results indicate that the ConvViT model outperforms commonly used benchmark architectures such as EfficientNetB0, DenseNet201, ResNet-50, VGG-16, and standalone ViT, achieving a classification accuracy of 93.61%. This hybrid approach improves accuracy and strengthens generalization capabilities, delivering steady performance during training and testing phases, thereby increasing its reliability for field applications. The findings highlight that the ConvViT model achieves high efficiency in pest detection by integrating local and global feature learning. Consequently, this scalable artificial intelligence solution can support sustainable agricultural practices by enabling the early and accurate identification of pests and reducing the need for intensive pesticide use.
Figures:
Figure 1: Image samples of each class in the dataset.
Figure 2: Number of images per class.
Figure 3: The architecture of ConvViT.
Figure 4: Confusion matrix for multi-class classification problems.
Figure 5: The confusion matrix for the ViT.
Figure 6: The confusion matrix for DenseNet201.
Figure 7: The confusion matrix for ResNet-50.
Figure 8: The confusion matrix for VGG-16.
Figure 9: The confusion matrix for EfficientNetB0.
Figure 10: The confusion matrix for ConvViT.
Figure 11: Comparative analysis of experimental results.
Figure 12: Accuracy graphs of the compared models.
Figure 13: Loss graphs of the compared models.
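The hybrid idea in this entry (a convolutional stem for local morphology, a transformer encoder for global relationships) can be illustrated with a toy model; dimensions, depths, and head counts are assumptions, not the published ConvViT configuration:

```python
import torch
import torch.nn as nn

class ConvViTSketch(nn.Module):
    """Toy hybrid: a conv stem extracts local features; a transformer
    encoder models global relationships between the resulting patches."""
    def __init__(self, num_classes: int = 15, dim: int = 128):
        super().__init__()
        self.stem = nn.Sequential(           # 224x224 image -> 14x14 patch grid
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(14),
        )
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, 14 * 14 + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        tokens = self.stem(x).flatten(2).transpose(1, 2)  # (B, 196, dim)
        cls = self.cls.expand(x.size(0), -1, -1)
        z = self.encoder(torch.cat([cls, tokens], dim=1) + self.pos)
        return self.head(z[:, 0])                         # classify via CLS token

logits = ConvViTSketch()(torch.randn(2, 3, 224, 224))  # shape (2, 15)
```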
18 pages, 4555 KiB  
Technical Note
GD-Det: Low-Data Object Detection in Foggy Scenarios for Unmanned Aerial Vehicle Imagery Using Re-Parameterization and Cross-Scale Gather-and-Distribute Mechanisms
by Rui Shi, Lili Zhang, Gaoxu Wang, Shutong Jia, Ning Zhang and Chensu Wang
Remote Sens. 2025, 17(5), 783; https://doi.org/10.3390/rs17050783 - 24 Feb 2025
Viewed by 194
Abstract
Unmanned Aerial Vehicles (UAVs) play an extremely important role in real-time object detection for maritime emergency rescue missions. However, marine accidents often occur in low-visibility weather conditions, resulting in poor image quality and a lack of object detection samples, which significantly reduces detection accuracy. To tackle these issues, we propose GD-Det, a high-accuracy, low-data object detection model specifically designed to handle limited sample sizes and low-quality images. The model is primarily composed of three components: (i) a lightweight re-parameterization feature extraction module, which integrates RepVGG blocks into multi-concat blocks to enhance the model's spatial perception and feature diversity during training while reducing computational cost in the inference phase through the re-parameterization mechanism; (ii) a cross-scale gather-and-distribute pyramid module, which helps to augment the relationship representation of four-scale features via flexible skip fusion and distribution strategies; and (iii) a decoupled prediction module with three branches that implements classification and regression, enhancing detection accuracy by combining the prediction values from tri-level features. In addition, we use a domain-adaptive training strategy with knowledge transfer to handle the low-data issue. We conducted low-data training and comparison experiments using our constructed dataset AFO-fog. Our model achieved an overall detection accuracy of 84.8%, which is superior to other models.
Figures:
Figure 1: The network structure of the GD-Det model.
Figure 2: Schematic diagram of the multi-concat block module.
Figure 3: Schematic diagram of the gather-and-distribute module. Different colors in the grid represent different weight values; the darker the color, the greater the weight.
Figure 4: Diagram of the inject module.
Figure 5: Schematic diagram of the decoupled prediction module.
Figure 6: Domain-adaptive training strategy.
Figure 7: The loss function curve of the GD-Det model.
Figure 8: The AFO dataset.
Figure 9: Synthetic images in foggy weather.
Figure 10: Comparison of different lightweight networks. The red boxes indicate that the identified object is “human”; the yellow boxes indicate that the identified object is “surfboard”.
Figure 11: Visualization of ablation experiments.
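The re-parameterization mechanism this entry credits to its RepVGG blocks merges parallel training-time branches into a single 3×3 convolution for inference. A simplified sketch without batch norm (the real RepVGG also folds BN statistics into the fused kernel):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepBlockSketch(nn.Module):
    """Train with 3x3 + 1x1 + identity branches; deploy as one 3x3 conv."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv1 = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return F.relu(self.conv3(x) + self.conv1(x) + x)

    def fuse(self) -> nn.Conv2d:
        """Fold all three branches into one 3x3 kernel (equal outputs)."""
        c = self.conv3.out_channels
        k = self.conv3.weight.data.clone()
        k += F.pad(self.conv1.weight.data, [1, 1, 1, 1])  # 1x1 -> 3x3 center
        ident = torch.zeros_like(k)
        for i in range(c):
            ident[i, i, 1, 1] = 1.0                       # identity as a kernel
        k += ident
        fused = nn.Conv2d(c, c, 3, padding=1)
        fused.weight.data = k
        fused.bias.data = self.conv3.bias.data + self.conv1.bias.data
        return fused

block = RepBlockSketch(8).eval()
x = torch.randn(1, 8, 16, 16)
fused = block.fuse()
assert torch.allclose(block(x), F.relu(fused(x)), atol=1e-5)
```

The fused layer is mathematically identical to the three-branch block, which is how the model keeps training-time feature diversity without paying for it at inference.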
21 pages, 2125 KiB  
Article
VGGNet and Attention Mechanism-Based Image Quality Assessment Algorithm in Symmetry Edge Intelligence Systems
by Fanfan Shen, Haipeng Liu, Chao Xu, Lei Ouyang, Jun Zhang, Yong Chen and Yanxiang He
Symmetry 2025, 17(3), 331; https://doi.org/10.3390/sym17030331 - 22 Feb 2025
Viewed by 277
Abstract
With the rapid development of Internet of Things (IoT) technology, the number of devices connected to the network is exploding, and improving the performance of edge devices has become an important challenge. Research on quality evaluation algorithms for brain tumor images remains scarce within symmetry edge intelligence systems. Additionally, the data volume in brain tumor datasets is frequently inadequate to support the training of neural network models. Most existing no-reference image quality assessment methods are based on natural statistical laws or construct a single-network model without considering visual perception characteristics, resulting in significant differences between the final evaluation results and subjective perception. To address these issues, we propose the AM-VGG-IQA (Attention Module Visual Geometry Group Image Quality Assessment) algorithm and extend the brain tumor MRI dataset. Visual saliency features and attention mechanism modules are integrated into AM-VGG-IQA. The integration of visual saliency features brings the evaluation outcomes of the model more in line with human perception, while the attention mechanism module cuts down on network parameters and speeds up training. On the brain tumor MRI dataset, our model achieves 85% accuracy, enabling it to effectively accomplish the task of evaluating brain tumor images in edge intelligence systems. We also carry out cross-dataset experiments. Notably, under varying training and testing ratios, the performance of AM-VGG-IQA remains relatively stable, which demonstrates its remarkable robustness for edge applications.
(This article belongs to the Special Issue Symmetry and Asymmetry in Embedded Systems)
Figures:
Figure 1: Algorithm flowchart of AM-VGG-IQA.
Figure 2: Specific process of AM-VGG-IQA.
Figure 3: Image quality reduction processing.
Figure 4: CBAM model diagram.
Figure 5: Channel attention module.
Figure 6: Spatial attention module.
Figure 7: Comparison of prediction accuracy.
Figure 8: Comparison of accuracy between models.
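The attention-module design referenced in this entry follows the CBAM pattern: channel attention from average- and max-pooled descriptors through a shared MLP, then spatial attention from pooled channel maps. A compact sketch (the reduction ratio and kernel size are the usual CBAM defaults, assumed here):

```python
import torch
import torch.nn as nn

class CBAMSketch(nn.Module):
    """Channel attention (shared MLP over avg- and max-pooled descriptors)
    followed by spatial attention (7x7 conv over pooled channel maps)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # --- channel attention ---
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # --- spatial attention ---
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(pooled))

out = CBAMSketch(64)(torch.randn(2, 64, 32, 32))  # same shape in and out
```

The module adds only a small MLP and one 7×7 convolution, consistent with the abstract's claim that the attention mechanism trims parameters relative to a heavier network redesign.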
26 pages, 15489 KiB  
Article
Weighted Feature Fusion Network Based on Multi-Level Supervision for Migratory Bird Counting in East Dongting Lake
by Haojie Zou, Hai Zhou, Guo Liu, Yingchun Kuang, Qiang Long and Haoyu Zhou
Appl. Sci. 2025, 15(5), 2317; https://doi.org/10.3390/app15052317 - 21 Feb 2025
Viewed by 250
Abstract
East Dongting Lake is an important habitat for migratory birds. Accurately counting the number of migratory birds is crucial to assessing the health of the wetland ecological environment. Traditional manual observation and low-precision methods make it difficult to meet this demand. To this end, this paper proposes a weighted feature fusion network based on multi-level supervision (MS-WFFNet) to count migratory birds. MS-WFFNet consists of three parts: an EEMA-VGG16 sub-network, a multi-source feature aggregation (MSFA) module, and a density map regression (DMR) module. Among them, the EEMA-VGG16 sub-network cross-injects enhanced efficient multi-scale attention (EEMA) into the truncated VGG16 structure. It uses multi-head attention to nonlinearly learn the relative importance of different positions in the same direction. With only a few parameters added, EEMA effectively suppresses the noise interference caused by a cluttered background. The MSFA module integrates a weighted mechanism to fully preserve low-level detail information and high-level semantic information. It achieves this by aggregating multi-source features and enhancing the expression of key features. The DMR module applies density map regression to the output of each path in the MSFA module. It ensures local consistency and spatial correlation among multiple regression results by using distributed supervision. In addition, this paper presents the migratory bird counting dataset DTH, collected using local monitoring equipment in East Dongting Lake. It is combined with other object counting datasets for extensive experiments, showcasing the proposed method’s excellent performance and generalization capability.
(This article belongs to the Special Issue Deep Learning for Image Processing and Computer Vision)
Figures:
Figure 1: (a) The data collection locations, where the red marker with an arrow represents East Dongting Lake and the blue marker indicates Yingtian Town; (b) the data collection equipment; (c) visualization of a frame from a source video.
Figure 2: Image examples of various migratory birds in the DTH dataset.
Figure 3: (a) Histogram of the number of images and the number of migratory birds in the DTH training set. (b) Histogram of the number of images and the number of migratory birds in the DTH testing set.
Figure 4: The architecture of MS-WFFNet.
Figure 5: The structure of the EEMA.
Figure 6: The structure of the MSFA module (left) and the LSKA block (right).
Figure 7: Visualization results of the various methods on the DTH dataset.
Figure 8: Visualization results of the various methods on the ShanghaiTech PartA dataset.
Figure 9: Visualization results of the various methods on the URCAL dataset.
Figure 10: Visualization results of the various methods on the PUCPR+ dataset.
Figure 11: Visualization of heatmaps using EMA and EEMA in the backbone network.
Figure 12: Visualization of results using different feature aggregation modules.
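Density-map regression, the core of the DMR module above, trains the network against targets built from point annotations; a common construction, with the Gaussian width as an assumed parameter:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, height, width, sigma=4.0):
    """Place a unit impulse at each annotated bird and blur it with a
    Gaussian; the map then integrates to the number of birds."""
    dens = np.zeros((height, width), dtype=np.float32)
    for x, y in points:
        dens[min(int(y), height - 1), min(int(x), width - 1)] += 1.0
    return gaussian_filter(dens, sigma)

# Three annotated birds -> the density map sums to ~3.
target = density_map([(40, 50), (120, 80), (200, 30)], 256, 320)
print(target.sum())  # ≈ 3.0 (up to edge truncation)
# A counting network is then trained so that prediction.sum() matches this.
```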
16 pages, 7337 KiB  
Article
Automatic Extraction of Discolored Tree Crowns Based on an Improved Faster-RCNN Algorithm
by Haoyang Ma, Banghui Yang, Ruirui Wang, Qiang Yu, Yaoyao Yang and Jiahao Wei
Forests 2025, 16(3), 382; https://doi.org/10.3390/f16030382 - 20 Feb 2025
Viewed by 177
Abstract
The precise prevention and control of forest pests and diseases has always been a research hotspot in ecological environmental protection. With the continuous advancement of sensor technology, the fine-grained identification of discolored tree crowns based on UAV technology has become increasingly important in forest monitoring. Existing deep learning models face challenges such as prolonged training time and low recognition accuracy when identifying discolored tree crowns caused by pests or diseases from airborne images. To address these issues, this study improves the Faster-RCNN model by using Inception-ResNet-V2 as the feature extractor, replacing the traditional VGG16 feature extractor, with the aim of enhancing the accuracy of discolored tree crown recognition. Experiments and analyses were conducted using UAV aerial imagery data from Changbai Mountain, Jilin. The improved model effectively identified discolored tree crowns caused by pine wood nematodes, achieving a precision of 90.22%, a mean average precision (mAP) of 83.63%, and a recall rate of 92.33%. Compared to the original Faster-RCNN model, the mAP of the improved model increased by 4.68%, precision improved by 10.11%, and recall improved by 5.23%, significantly enhancing recognition performance for discolored tree crowns. This method provides crucial technical support and a scientific basis for the prevention and control of forest pests and diseases, facilitating the early detection and precise management of forest pest outbreaks.
(This article belongs to the Section Forest Health)
Figures:
Figure 1: Location map of the study area.
Figure 2: Study area data.
Figure 3: Comparison of data pretreatment results.
Figure 4: Research technology flowchart.
Figure 5: Labeling of discolored tree crowns.
Figure 6: Learning curve.
Figure 7: Comparison of training and validation loss: VGG16 vs. ResNet50 FPN V2 (with fluctuations).
Figure 8: Extraction results of discolored tree crowns.
Figure 9: Early detection results of discolored tree crowns.
Figure 10: Dry tree crown detection results.
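Swapping the feature extractor of Faster-RCNN, as this paper does, follows a standard torchvision pattern. Inception-ResNet-V2 is not shipped with torchvision, so the sketch below uses a MobileNetV2 trunk purely as a stand-in; any backbone exposing an out_channels attribute (e.g., an Inception-ResNet-V2 trunk from timm) slots in the same way:

```python
import torch
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

# A feature extractor with a declared channel count serves as the backbone.
backbone = torchvision.models.mobilenet_v2(weights="DEFAULT").features
backbone.out_channels = 1280

anchors = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                          aspect_ratios=((0.5, 1.0, 2.0),))
roi_pool = torchvision.ops.MultiScaleRoIAlign(featmap_names=["0"],
                                              output_size=7, sampling_ratio=2)

# Two classes: background + discolored tree crown.
model = FasterRCNN(backbone, num_classes=2,
                   rpn_anchor_generator=anchors, box_roi_pool=roi_pool)

model.eval()
detections = model([torch.randn(3, 512, 512)])  # dicts of boxes/labels/scores
```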
21 pages, 1794 KiB  
Article
Research on Anti-Interference Performance of Spiking Neural Network Under Network Connection Damage
by Yongqiang Zhang, Haijie Pang, Jinlong Ma, Guilei Ma, Xiaoming Zhang and Menghua Man
Brain Sci. 2025, 15(3), 217; https://doi.org/10.3390/brainsci15030217 - 20 Feb 2025
Viewed by 320
Abstract
Background: With the development of artificial intelligence, memristors have become an ideal choice for optimizing new neural network architectures and improving computing and energy efficiency, owing to their combination of storage and computing power. In this context, spiking neural networks show the ability to resist Gaussian noise, spike interference, and AC electric field interference by adjusting synaptic plasticity. The anti-interference ability of spiking neural networks has become an important direction in electromagnetic protection bionics research. Methods: This research constructs two types of spiking neural network models with LIF neurons as nodes, VGG-SNN and FCNN-SNN, and combines a pruning algorithm to simulate network connection damage during the training process. By comparing spiking neural networks with traditional artificial neural networks on a millimeter-wave radar human motion dataset and the MNIST dataset, the anti-interference performance of both under the same probability of edge loss was explored in depth. Results: The experimental results show that on the millimeter-wave radar human motion dataset, the accuracy of the spiking neural network decreased by 5.83% at a sparsity of 30%, while the accuracy of the artificial neural network decreased by 18.71%. On the MNIST dataset, the accuracy of the spiking neural network decreased by 3.91% at a sparsity of 30%, while that of the artificial neural network decreased by 10.13%. Conclusions: Under the same network connection damage conditions, spiking neural networks exhibit unique anti-interference advantages; their performance in information processing and pattern recognition is comparatively more stable and outstanding. Further analysis reveals that factors such as network structure, encoding method, and learning algorithm have a significant impact on the anti-interference performance of both.
Figures:
Figure 1: Network structure diagram of the VGG-SNN model.
Figure 2: Network pruning process.
Figure 3: Network structure diagram of the FCNN-SNN model.
Figure 4: Radar action dataset collection and processing process. Yellow represents higher Doppler frequencies; blue represents lower Doppler frequencies.
Figure 5: F1-score histogram of VGG-SNN and VGG on the radar action dataset.
Figure 6: F1-score histogram of FCNN-SNN and FCNN on the MNIST dataset.
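The connection-damage simulation described in this entry amounts to randomly zeroing a fraction of the weights. PyTorch's pruning utilities express this directly; the sketch below applies the paper's 30% edge-loss condition to a plain fully connected stand-in (the spiking LIF dynamics are not reproduced here):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small fully connected stand-in for the FCNN branch of the study.
net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Randomly remove 30% of the connections in every linear layer, mirroring
# the paper's "sparsity of 30%" edge-loss condition.
for module in net.modules():
    if isinstance(module, nn.Linear):
        prune.random_unstructured(module, name="weight", amount=0.3)

zeros = sum(int((m.weight == 0).sum()) for m in net.modules()
            if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in net.modules()
            if isinstance(m, nn.Linear))
print(f"fraction of damaged connections: {zeros / total:.2f}")  # ≈ 0.30
```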