Search Results (1,448)

Search Parameters:
Keywords = DenseNet201

40 pages, 36566 KiB  
Article
Web-Based AI System for Detecting Apple Leaf and Fruit Diseases
by Serra Aksoy, Pinar Demircioglu and Ismail Bogrekci
AgriEngineering 2025, 7(3), 51; https://doi.org/10.3390/agriengineering7030051 - 20 Feb 2025
Abstract
The present study seeks to improve the accuracy and reliability of disease identification in apple fruits and leaves through the use of state-of-the-art deep learning techniques. The research investigates several state-of-the-art architectures, including Xception, InceptionV3, InceptionResNetV2, EfficientNetV2M, MobileNetV3Large, ResNet152V2, DenseNet201, and NASNetLarge. Among the models evaluated, ResNet152V2 performed best in the classification of apple fruit diseases, with 92% accuracy, whereas Xception proved most effective in the classification of apple leaf diseases, with 99% accuracy. The models correctly recognized common apple diseases such as blotch, scab, rot, and other leaf infections, demonstrating their applicability to agricultural diagnostics. An important by-product of this research is a web application, easily accessible via Gradio, that performs real-time disease detection on apple fruit and leaf images uploaded by users. The app returns predicted disease labels along with confidence values and detailed information on symptoms and management. The system also includes a visualization tool for the inner workings of the neural network, enabling greater transparency and trust in the diagnostic process. Future research will aim to widen the scope of the system to other crop species, with larger disease databases, and to further improve explainability to facilitate real-world agricultural application.
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)
Figures:
Figure 1: Architecture of the Proposed Model for Apple Fruit Diseases Classification.
Figure 2: Architecture of the Proposed Model for Apple Leaves Diseases Classification.
Figure 3: Experimental Workflow (on the left); Dataset Structure (on the right).
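The Gradio-based web front end described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical wrapper around a trained Keras classifier; the model file name, input size, and class labels are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of a Gradio image-classification front end (assumed details:
# model file, 224x224 input size, and class names are illustrative only).
import numpy as np
import gradio as gr
import tensorflow as tf

model = tf.keras.models.load_model("apple_leaf_classifier.h5")  # hypothetical path
CLASS_NAMES = ["healthy", "scab", "rot", "blotch"]               # hypothetical labels

def predict(image: np.ndarray) -> dict:
    """Resize the uploaded image, run the classifier, and return label confidences."""
    x = tf.image.resize(image, (224, 224)) / 255.0
    probs = model.predict(x[tf.newaxis, ...], verbose=0)[0]
    return {name: float(p) for name, p in zip(CLASS_NAMES, probs)}

demo = gr.Interface(fn=predict, inputs=gr.Image(type="numpy"),
                    outputs=gr.Label(num_top_classes=3),
                    title="Apple disease detection (sketch)")

if __name__ == "__main__":
    demo.launch()
```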
24 pages, 5275 KiB  
Article
Force Map-Enhanced Segmentation of a Lightweight Model for the Early Detection of Cervical Cancer
by Sabina Umirzakova, Shakhnoza Muksimova, Jushkin Baltayev and Young Im Cho
Diagnostics 2025, 15(5), 513; https://doi.org/10.3390/diagnostics15050513 - 20 Feb 2025
Abstract
Background/Objectives: Accurate and efficient segmentation of cervical cells is crucial for the early detection of cervical cancer, enabling timely intervention and treatment. Existing segmentation models face challenges with complex cellular arrangements, such as overlapping cells and indistinct boundaries, and are often computationally intensive, which limits their deployment in resource-constrained settings. Methods: In this study, we introduce a lightweight and efficient segmentation model specifically designed for cervical cell analysis. The model employs a MobileNetV2 architecture for feature extraction, ensuring a minimal parameter count conducive to real-time processing. To enhance boundary delineation, we propose a novel force map approach that drives pixel adjustments inward toward the centers of cells, thus improving cell separation in densely packed areas. Additionally, we integrate extreme point supervision to refine segmentation outcomes using minimal boundary annotations, rather than full pixel-wise labels. Results: Our model was rigorously trained and evaluated on a comprehensive dataset of cervical cell images. It achieved a Dice Coefficient of 0.87 and a Boundary F1 Score of 0.84, performances that are comparable to those of advanced models but with considerably lower inference times. The optimized model operates at approximately 50 frames per second on standard low-power hardware. Conclusions: By effectively balancing segmentation accuracy with computational efficiency, our model addresses critical barriers to the widespread adoption of automated cervical cell segmentation tools. Its ability to perform in real time on low-cost devices makes it an ideal candidate for clinical applications and deployment in low-resource environments. This advancement holds significant potential for enhancing access to cervical cancer screening and diagnostics worldwide, thereby supporting broader healthcare initiatives. Full article
Figures:
Figure 1: Improved contextual integration and feature extraction with MobileNetV2 for accurate cell segmentation.
Figure 2: Description of the SipakMed dataset.
Figure 3: Results of image segmentation using the proposed model.
Figure 4: Performance curves for model robustness and overfitting prevention.
28 pages, 5098 KiB  
Article
A Methodological Framework for AI-Assisted Diagnosis of Ovarian Masses Using CT and MR Imaging
by Pratik Adusumilli, Nishant Ravikumar, Geoff Hall and Andrew F. Scarsbrook
J. Pers. Med. 2025, 15(2), 76; https://doi.org/10.3390/jpm15020076 - 19 Feb 2025
Abstract
Background: Ovarian cancer encompasses a diverse range of neoplasms originating in the ovaries, fallopian tubes, and peritoneum. Despite being one of the commonest gynaecological malignancies, there are no validated screening strategies for early detection. A diagnosis typically relies on imaging, biomarkers, and multidisciplinary team discussions. The accurate interpretation of CTs and MRIs may be challenging, especially in borderline cases. This study proposes a methodological pipeline to develop and evaluate deep learning (DL) models that can assist in classifying ovarian masses from CT and MRI data, potentially improving diagnostic confidence and patient outcomes. Methods: A multi-institutional retrospective dataset was compiled, supplemented by external data from the Cancer Genome Atlas. Two classification workflows were examined: (1) whole-volume input and (2) lesion-focused region of interest. Multiple DL architectures, including ResNet, DenseNet, transformer-based UNeST, and Attention Multiple-Instance Learning (MIL), were implemented within the PyTorch-based MONAI framework. The class imbalance was mitigated using focal loss, oversampling, and dynamic class weighting. The hyperparameters were optimised with Optuna, and balanced accuracy was the primary metric. Results: For a preliminary dataset, the proposed framework demonstrated feasibility for the multi-class classification of ovarian masses. The initial experiments highlighted the potential of transformers and MIL for identifying the relevant imaging features. Conclusions: A reproducible methodological pipeline for DL-based ovarian mass classification using CT and MRI scans has been established. Future work will leverage a multi-institutional dataset to refine these models, aiming to enhance clinical workflows and improve patient outcomes. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications in Precision Oncology)
Figures:
Graphical abstract.
Figure 1: Patient selection flow diagram.
Figure 2: Model development process.
Figure 3: Matplotlib visualisation of input data.
Figure 4: TensorBoard visualisation of the model training process, illustrating real-time monitoring of metrics and performance.
Figure 5: Schematic of the 3D ResNet architecture (ResBlocks with Conv-BN-ReLU sequences and skip connections, global average pooling, and a fully connected classification layer).
Figure 6: Schematic of the 3D DenseNet architecture (dense blocks with concatenated Conv-BN-ReLU outputs, transition blocks, global average pooling, and a fully connected classification layer).
Figure 7: Schematic of the 3D UNesT architecture (nested-transformer encoder producing multi-scale feature maps, skip-connected decoder, and a classification head).
Figure 8: Schematic of the 3D attention-based Multiple-Instance Learning (MIL) pipeline (patch-level ResNet-50 embeddings weighted by an attention block and aggregated for classification).
Figure 9: Grad-CAM visualisation: the upper image demonstrates a "hot" (red) malignant lesion and the lower image demonstrates "hot" (red) regions of omental disease.
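Because the pipeline above is built on the PyTorch-based MONAI framework with focal loss used against class imbalance, a compact sketch may help illustrate the idea. The channel count, number of classes, and focal-loss parameters below are illustrative assumptions, not the authors' settings.

```python
# Sketch of a 3D DenseNet classifier in MONAI with a hand-rolled focal loss
# (class count, gamma, and class weights are illustrative assumptions).
import torch
import torch.nn.functional as F
from monai.networks.nets import DenseNet121

NUM_CLASSES = 3
model = DenseNet121(spatial_dims=3, in_channels=1, out_channels=NUM_CLASSES)

def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Multi-class focal loss: down-weights easy examples to counter class imbalance."""
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()
    ce = F.nll_loss(log_probs, target, weight=alpha, reduction="none")
    focal_weight = (1.0 - probs.gather(1, target.unsqueeze(1)).squeeze(1)) ** gamma
    return (focal_weight * ce).mean()

# One illustrative training step on a dummy volume batch.
volumes = torch.randn(2, 1, 64, 64, 64)   # (batch, channel, D, H, W)
labels = torch.tensor([0, 2])
loss = focal_loss(model(volumes), labels)
loss.backward()
```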
33 pages, 3144 KiB  
Article
CNN-Based Optimization for Fish Species Classification: Tackling Environmental Variability, Class Imbalance, and Real-Time Constraints
by Amirhosein Mohammadisabet, Raza Hasan, Vishal Dattana, Salman Mahmood and Saqib Hussain
Information 2025, 16(2), 154; https://doi.org/10.3390/info16020154 - 19 Feb 2025
Abstract
Automated fish species classification is essential for marine biodiversity monitoring, fisheries management, and ecological research. However, challenges such as environmental variability, class imbalance, and computational demands hinder the development of robust classification models. This study investigates the effectiveness of convolutional neural network (CNN)-based models and hybrid approaches to address these challenges. Eight CNN architectures, including DenseNet121, MobileNetV2, and Xception, were compared alongside traditional classifiers like support vector machines (SVMs) and random forest. DenseNet121 achieved the highest accuracy (90.2%), leveraging its superior feature extraction and generalization capabilities, while MobileNetV2 balanced accuracy (83.57%) with computational efficiency, processing images in 0.07 s, making it ideal for real-time deployment. Advanced preprocessing techniques, such as data augmentation, turbidity simulation, and transfer learning, were employed to enhance dataset robustness and address class imbalance. Hybrid models combining CNNs with traditional classifiers achieved intermediate accuracy with improved interpretability. Optimization techniques, including pruning and quantization, reduced model size by 73.7%, enabling real-time deployment on resource-constrained devices. Grad-CAM visualizations further enhanced interpretability by identifying key image regions influencing predictions. This study highlights the potential of CNN-based models for scalable, interpretable fish species classification, offering actionable insights for sustainable fisheries management and biodiversity conservation. Full article
(This article belongs to the Special Issue Machine Learning and Data Mining: Innovations in Big Data Analytics)
Figures:
Graphical abstract.
Figure 1: Framework for the research methodology in fish species classification.
Figure 2: Example of augmented images.
Figure 3: Confusion matrix for DenseNet121 performance.
Figure 4: Training and validation loss for DenseNet121.
Figure 5: Grad-CAM heatmap analysis for a single class.
Figure 6: Comparative Grad-CAM visualizations across multiple classes.
Figure 7: Turbidity simulation results and model predictions.
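To make the transfer-learning and post-training quantization steps described above concrete, here is a minimal Keras sketch under assumed settings; the input size, number of fish classes, and the choice of dynamic-range quantization are illustrative and not taken from the paper.

```python
# Sketch: DenseNet121 transfer learning followed by post-training quantization.
# Input size, class count, and training data are illustrative assumptions.
import tensorflow as tf

NUM_SPECIES = 10
base = tf.keras.applications.DenseNet121(include_top=False, weights="imagenet",
                                          input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained backbone for the first training stage

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_SPECIES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets not shown here

# Post-training dynamic-range quantization to shrink the model for edge deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
with open("fish_classifier_quant.tflite", "wb") as f:
    f.write(converter.convert())
```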
15 pages, 3085 KiB  
Article
Early Detection of Skin Diseases Across Diverse Skin Tones Using Hybrid Machine Learning and Deep Learning Models
by Akasha Aquil, Faisal Saeed, Souad Baowidan, Abdullah Marish Ali and Nouh Sabri Elmitwally
Information 2025, 16(2), 152; https://doi.org/10.3390/info16020152 - 19 Feb 2025
Abstract
Skin diseases in melanin-rich skin often present diagnostic challenges due to the unique characteristics of darker skin tones, which can lead to misdiagnosis or delayed treatment. This disparity impacts millions within diverse communities, highlighting the need for accurate, AI-based diagnostic tools. In this paper, we investigated the performance of three machine learning methods, Support Vector Machines (SVMs), Random Forest (RF), and Decision Trees (DTs), combined with state-of-the-art (SOTA) deep learning models (EfficientNet, MobileNetV2, and DenseNet121) for predicting skin conditions using dermoscopic images from the HAM10000 dataset. The features were extracted using the deep learning models, with the labels encoded numerically. To address the data imbalance, SMOTE and resampling techniques were applied. Additionally, Principal Component Analysis (PCA) was used for feature reduction, and fine-tuning was performed to optimize the models. The results demonstrated that RF with DenseNet121 achieved a superior accuracy of 98.32%, followed by SVM with MobileNetV2 at 98.08%, and Decision Tree with MobileNetV2 at 85.39%. The proposed methods outperform the SVM combined with the SOTA EfficientNet model, validating the robustness of the proposed approaches. Evaluation metrics such as accuracy, precision, recall, and F1-score were used to benchmark performance, showcasing the potential of these methods in advancing skin disease diagnostics for diverse populations.
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)
Figures:
Figure 1: Phases of CRISP-DM.
Figure 2: Frequency of skin lesion types.
Figure 3: Distribution of age.
Figure 4: Sex distribution.
Figure 5: Distribution of lesion localization.
Figure 6: Age distribution across lesion types.
Figure 7: Distribution of lesion types by sex.
Figure 8: Model accuracy graph for SVM-MobileNetV2.
Figure 9: Random Forest DenseNet121 accuracy model.
Figure 10: Validation accuracy of Decision Tree.
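A minimal sketch of the hybrid deep-feature plus classical-classifier pipeline described above, assuming images are already loaded as arrays; the placeholder data, PCA setting, and Random Forest configuration are illustrative rather than the authors' exact choices.

```python
# Sketch: deep features from DenseNet121, SMOTE rebalancing, PCA reduction,
# and a Random Forest classifier (all hyperparameters are illustrative).
import numpy as np
import tensorflow as tf
from imblearn.over_sampling import SMOTE
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

extractor = tf.keras.applications.DenseNet121(include_top=False, pooling="avg",
                                               weights="imagenet")

def extract_features(images: np.ndarray) -> np.ndarray:
    """images: (N, 224, 224, 3) in [0, 255]; returns (N, 1024) pooled features."""
    x = tf.keras.applications.densenet.preprocess_input(images.astype("float32"))
    return extractor.predict(x, verbose=0)

# Placeholder stand-ins for dermoscopic images and encoded labels
# (three illustrative classes here; HAM10000 has seven).
X_img = np.random.randint(0, 256, size=(70, 224, 224, 3))
y = np.array([0] * 40 + [1] * 20 + [2] * 10)

X = extract_features(X_img)
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X, y)   # rebalance minority classes

clf = make_pipeline(PCA(n_components=0.95), RandomForestClassifier(n_estimators=300))
clf.fit(X_bal, y_bal)
```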
22 pages, 2391 KiB  
Article
Terrestrial Storage of Biomass (Biomass Burial): A Natural, Carbon-Efficient, and Low-Cost Method for Removing CO2 from Air
by Jeffrey A. Amelse
Appl. Sci. 2025, 15(4), 2183; https://doi.org/10.3390/app15042183 - 18 Feb 2025
Abstract
Terrestrial Storage of Biomass (TSB) is a Negative Emission Technology for removing CO2 already in the atmosphere. TSB is compared to other NETs and is shown to be a natural, carbon-efficient, and low-cost option. Nature performs the work of removal by growing biomass via photosynthesis. The key to permanent sequestration is to bury the biomass in pits designed to minimize the decomposition. The chemistry of biomass formation and decomposition is reviewed to provide best practices for the TSB burial pit design. Methane formation from even a small amount of decomposition has been raised as a concern. This concern is shown to be unfounded due to a great difference in time constants for methane formation and its removal from the air by ozone oxidation. Methane has a short lifetime in air of only about 12 years. Woody biomass decomposition undergoes exponential decay spread over hundreds to thousands of years. It is inherently slow due to the cross-linking and dense packing of cellulose, which means that the attack can only occur at the surface. A model that couples the slow and exponential decay of the rate of methane formation with the fast removal by oxidation shows that methane will peak at a very small fraction of the buried biomass carbon within about 10 years and then rapidly decline towards zero. The implication is that no additional equipment needs to be added to TSB to collect and burn the methane. Certified carbon credits are listed on various exchanges. The US DOE has recently issued grants for TSB development. Full article
(This article belongs to the Special Issue CCUS: Paving the Way to Net Zero Emissions Technologies)
Figures:
Figure 1: Relative species concentrations as a function of pH (Sturm [28]).
Figure 2: The effect of adding Ca²⁺ to the carbonic acid buffer system (Sturm [28]).
Figure 3: US bioethanol build-up compared to gasoline and crude oil exports. Data source: EIA [38].
Figure 4: Pathways for the decomposition of complex municipal waste (Emcon Associates [46]).
Figure 5: Evolution of biogas composition during waste decomposition (Emcon Associates [46]).
Figure 6: The fraction of biomass carbon converted to methane (blue) and the fraction of biomass carbon converted to methane that remains in the atmosphere (magenta).
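The coupling argument above (slow, exponentially decaying methane generation versus fast atmospheric removal) can be written as a simple two-term balance. The sketch below is not the author's model code: the decomposition time constant and methane yield fraction are illustrative assumptions, and only the roughly 12-year atmospheric methane lifetime comes from the abstract. The exact peak time and height depend entirely on the assumed constants.

```python
# Sketch: atmospheric methane stock M(t) from slowly decomposing buried biomass.
# dM/dt = generation(t) - M / tau_air, with generation decaying exponentially.
# Time constants and the methane yield fraction are illustrative assumptions.
import numpy as np

tau_decomp = 500.0   # years, e-folding time of biomass decomposition (assumed)
tau_air = 12.0       # years, atmospheric methane lifetime (per the abstract)
f_ch4 = 0.01         # fraction of buried carbon ultimately released as methane (assumed)

dt = 0.1
t = np.arange(0.0, 200.0, dt)
generation = (f_ch4 / tau_decomp) * np.exp(-t / tau_decomp)  # per unit buried carbon, per year

M = np.zeros_like(t)
for i in range(1, len(t)):
    # Explicit Euler step: slow source minus fast first-order removal by oxidation.
    dM = generation[i - 1] - M[i - 1] / tau_air
    M[i] = M[i - 1] + dM * dt

peak = int(np.argmax(M))
print(f"Under these assumed constants, the methane stock peaks near t = {t[peak]:.0f} years "
      f"at {M[peak]:.2e} of the buried carbon, then declines toward zero.")
```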
20 pages, 7127 KiB  
Article
Cross-Attention Adaptive Feature Pyramid Network with Uncertainty Boundary Modeling for Mass Detection in Digital Breast Tomosynthesis
by Xinyu Ma, Haotian Sun, Gang Yuan, Yufei Tang, Jie Liu, Shuangqing Chen and Jian Zheng
Bioengineering 2025, 12(2), 196; https://doi.org/10.3390/bioengineering12020196 - 17 Feb 2025
Abstract
Computer-aided detection (CADe) of masses in digital breast tomosynthesis (DBT) is crucial for early breast cancer diagnosis. However, the variability in the size and morphology of breast masses and their resemblance to surrounding tissues present significant challenges. Current CNN-based CADe methods, particularly those that use Feature Pyramid Networks (FPN), often fail to integrate multi-scale information effectively and struggle to handle dense glandular tissue with high-density or iso-density mass lesions due to the unidirectional integration and progressive attenuation of features, leading to high false positive rates. Additionally, the commonly indistinct boundaries of breast masses introduce uncertainty in boundary localization, which makes traditional Dirac boundary modeling insufficient for precise boundary regression. To address these issues, we propose the CU-Net network, which efficiently fuses multi-scale features and accurately models blurred boundaries. Specifically, the CU-Net introduces the Cross-Attention Adaptive Feature Pyramid Network (CA-FPN), which enhances the effectiveness and accuracy of feature interactions through a cross-attention mechanism to capture global correlations across multi-scale feature maps. Simultaneously, the Breast Density Perceptual Module (BDPM) incorporates breast density information to weight intermediate features, thereby improving the network’s focus on dense breast regions susceptible to false positives. For blurred mass boundaries, we introduce Uncertainty Boundary Modeling (UBM) to model the positional distribution function of predicted bounding boxes for masses with uncertain boundaries. In comparative experiments on an in-house clinical DBT dataset and the BCS-DBT dataset, the proposed method achieved sensitivities of 89.68% and 72.73% at 2 false positives per DBT volume (FPs/DBT), respectively, significantly outperforming existing state-of-the-art detection methods. This method offers clinicians rapid, accurate, and objective diagnostic assistance, demonstrating substantial potential for clinical application. Full article
(This article belongs to the Section Biosignal Processing)
Figures:
Figure 1: Illustration of mass edges that are blurred or obscured by dense glandular tissue; blurry edges are indicated by red dashed ellipses.
Figure 2: Overall architecture of the proposed method. The BDPM and CA-FPN further integrate the features extracted by the backbone; the UBM is placed in the regression branch of the detection head to predict more accurate 2D bounding boxes.
Figure 3: Architecture of CA-FPN. The module directly connects deep features with shallow features, preventing the gradual attenuation of feature transmission seen in traditional FPN.
Figure 4: Architecture of BDPM. The module weights the intermediate features of the network using breast density information to enhance the network's focus on dense breast regions.
Figure 5: Three-dimensional aggregation: 2D detection results are fused along the z-axis to yield the final 3D detection results.
Figure 6: FROC curves of the comparison methods [25,27,35,36,37,38,39] on the mass-detection task.
Figure 7: Detection result visualization of different models (green boxes: ground truth; aqua blue boxes: true positives; yellow boxes: false positives).
Figure 8: Heatmap visualization of the detection results using Grad-CAM (red: higher network attention; blue: lower attention; green boxes: ground truth). The proposed method effectively focuses on the ground truth regions, thereby achieving accurate detection results.
19 pages, 11226 KiB  
Article
Evaluation of Weed Infestations in Row Crops Using Aerial RGB Imaging and Deep Learning
by Plamena D. Nikolova, Boris I. Evstatiev, Atanas Z. Atanasov and Asparuh I. Atanasov
Agriculture 2025, 15(4), 418; https://doi.org/10.3390/agriculture15040418 - 16 Feb 2025
Abstract
One of the important factors negatively affecting the yield of row crops is weed infestations. Using non-contact detection methods allows for a rapid assessment of weed infestations’ extent and management decisions for practical weed control. This study aims to develop and demonstrate a methodology for early detection and evaluation of weed infestations in maize using UAV-based RGB imaging and pixel-based deep learning classification. An experimental study was conducted to determine the extent of weed infestations on two tillage technologies, plowing and subsoiling, tailored to the specific soil and climatic conditions of Southern Dobrudja. Based on an experimental study with the DeepLabV3 classification algorithm, it was found that the ResNet-34-backed model ensures the highest performance compared to different versions of ResNet, DenseNet, and VGG backbones. The achieved performance reached precision, recall, F1 score, and Kappa, respectively, 0.986, 0.986, 0.986, and 0.957. After applying the model in the field with the investigated tillage technologies, it was found that a higher level of weed infestation is observed in subsoil deepening areas, where 4.6% of the area is infested, compared to 0.97% with the plowing treatment. This work contributes novel insights into weed management during the critical early growth stages of maize, providing a robust framework for optimizing weed control strategies in this region. Full article
(This article belongs to the Section Digital Agriculture)
Figures:
Figure 1: Geographic location of the experimental field area: (a) approximate location on the map of Bulgaria; (b) the experimental field.
Figure 2: Overview of the applied methodology.
Figure 3: Sample training data creation: two weed-infested ROIs (the blue regions) and one uninfested area (the white region).
Figure 4: Types of weed infestations in the experimental field: (a) Chenopodium album, (b) Cirsium arvense, (c) Polygonum aviculare, (d) Sorghum halepense.
Figure 5: The merged training images and the marked regions of interest: weed-infested areas in blue and weed-free areas in white.
Figure 6: The generated orthomosaic of the investigated field, the camera locations (yellow squares), and the flight path (orange line).
Figure 7: Classification map based on the trained ResNet-34-based model: "weed" in green and "no-weed" in white for subsoil deepening; "weed" in red and "no-weed" in yellow for the plowing treatment.
Figure 8: Close-up image of the shadow dropped by the high-voltage electric pole; the area classified as "weed" by the ResNet-34 model is marked in green.
Figure 9: Examples from the orthomosaic map with the overlaid "weed" polygons: (a) subsoil deepening; (b) plowing.
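For the pixel-based DeepLabV3 classification described above, one commonly used way to pair DeepLabV3 with a ResNet-34 backbone is the segmentation_models_pytorch package; the abstract does not say which implementation the authors used, so the sketch below is only one plausible setup, with the two-class ("weed" / "no-weed") head, tile size, and loss as assumptions.

```python
# Sketch: DeepLabV3 with a ResNet-34 encoder for weed / no-weed pixel classification.
# Library choice, class count, and input size are assumptions for illustration.
import torch
import segmentation_models_pytorch as smp

model = smp.DeepLabV3(
    encoder_name="resnet34",        # the backbone reported to perform best
    encoder_weights="imagenet",     # transfer learning from ImageNet
    in_channels=3,                  # RGB orthomosaic tiles
    classes=2,                      # weed vs. no-weed
)

loss_fn = smp.losses.DiceLoss(mode="multiclass")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy tile batch.
tiles = torch.randn(4, 3, 512, 512)          # (batch, channels, H, W)
masks = torch.randint(0, 2, (4, 512, 512))   # per-pixel class indices
logits = model(tiles)                        # (4, 2, 512, 512)
loss = loss_fn(logits, masks)
loss.backward()
optimizer.step()
```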
14 pages, 10065 KiB  
Article
Automatic Evaluation of Bone Age Using Hand Radiographs and Pancorporal Radiographs in Adolescent Idiopathic Scoliosis
by Ifrah Andleeb, Bilal Zahid Hussain, Julie Joncas, Soraya Barchi, Marjolaine Roy-Beaudry, Stefan Parent, Guy Grimard, Hubert Labelle and Luc Duong
Diagnostics 2025, 15(4), 452; https://doi.org/10.3390/diagnostics15040452 - 13 Feb 2025
Abstract
Background/Objectives: Adolescent idiopathic scoliosis (AIS) is a complex, three-dimensional spinal deformity that requires monitoring of skeletal maturity for effective management. Accurate bone age assessment is important for evaluating developmental progress in AIS. Traditional methods rely on ossification center observations, but recent advances in deep learning (DL) might pave the way for automatic grading of bone age. Methods: The goal of this research is to propose a new deep neural network (DNN) and evaluate class activation maps for bone age assessment in AIS using hand radiographs. We developed a custom neural network based on DenseNet201 and trained it on the RSNA Bone Age dataset. Results: The model achieves an average mean absolute error (MAE) of 4.87 months on a clinical test dataset of more than 250 AIS patients. To enhance transparency and trust, we introduced Score-CAM, an explainability tool that reveals the regions of interest contributing to accurate bone age predictions. We compared our model with the BoneXpert system, demonstrating similar performance, which signifies the potential of our approach to reduce inter-rater variability and expedite clinical decision-making. Conclusions: This study outlines the role of deep learning in improving the precision and efficiency of bone age assessment, particularly for AIS patients. Future work involves the detection of other regions of interest and the integration of other ossification centers.
(This article belongs to the Section Medical Imaging and Theranostics)
Figures:
Figure 1: (a) EOS pancorporal imaging system; (b) radiograph of an AIS patient; (c) matching hand/wrist images [4].
Figure 2: Sample hand and wrist radiograph from the RSNA Bone Age dataset.
Figure 3: Distribution of bone age of the children in months.
Figure 4: Deep learning model architecture for bone age prediction.
Figure 5: DenseNet201 model architecture for bone age prediction.
Figure 6: Graphical comparison of VGG16, VGG19, Inception, and the proposed model trained on the same dataset.
Figure 7: Actual age versus predicted age results.
Figure 8: Loss and MAE (months) curve plots for the proposed model.
Figure 9: Score-CAM results of the proposed neural network, highlighting the image regions important for determining bone age (blue: most significant areas; red: least significant).
Figure 10: Comparison of the performance of the proposed model with BoneXpert.
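A minimal sketch of a DenseNet201-based regression head of the kind described above, trained with an MAE objective in months; the input size, head layout, and optimizer settings are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch: DenseNet201 backbone with a single-output regression head for bone age
# in months (head layout and hyperparameters are illustrative assumptions).
import tensorflow as tf

backbone = tf.keras.applications.DenseNet201(include_top=False, weights="imagenet",
                                              input_shape=(256, 256, 3))

inputs = tf.keras.Input(shape=(256, 256, 3))
x = tf.keras.applications.densenet.preprocess_input(inputs)
x = backbone(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dropout(0.3)(x)
bone_age_months = tf.keras.layers.Dense(1, activation="linear")(x)

model = tf.keras.Model(inputs, bone_age_months)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="mean_absolute_error",        # MAE in months, as reported
              metrics=["mean_absolute_error"])
# model.fit(train_ds, validation_data=val_ds, epochs=30)  # RSNA Bone Age data not shown
```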
22 pages, 11164 KiB  
Article
Acoustic Emission-Based Pipeline Leak Detection and Size Identification Using a Customized One-Dimensional DenseNet
by Faisal Saleem, Zahoor Ahmad, Muhammad Farooq Siddique, Muhammad Umar and Jong-Myon Kim
Sensors 2025, 25(4), 1112; https://doi.org/10.3390/s25041112 - 12 Feb 2025
Abstract
Effective leak detection and leak size identification are essential for maintaining the operational safety, integrity, and longevity of industrial pipelines. Traditional methods often suffer from high noise sensitivity, limited adaptability to non-stationary signals, and excessive computational costs, which limits their feasibility for real-time monitoring applications. This study presents a novel acoustic emission (AE)-based pipeline monitoring approach, integrating Empirical Wavelet Transform (EWT) for adaptive frequency decomposition with customized one-dimensional DenseNet architecture to achieve precise leak detection and size classification. The methodology begins with EWT-based signal segmentation, which isolates meaningful frequency bands to enhance leak-related feature extraction. To further improve signal quality, adaptive thresholding and denoising techniques are applied, filtering out low-amplitude noise while preserving critical diagnostic information. The denoised signals are processed using a DenseNet-based deep learning model, which combines convolutional layers and densely connected feature propagation to extract fine-grained temporal dependencies, ensuring the accurate classification of leak presence and severity. Experimental validation was conducted on real-world AE data collected under controlled leak and non-leak conditions at varying pressure levels. The proposed model achieved an exceptional leak detection accuracy of 99.76%, demonstrating its ability to reliably differentiate between normal operation and multiple leak severities. This method effectively reduces computational costs while maintaining robust performance across diverse operating environments. Full article
(This article belongs to the Special Issue Feature Papers in Fault Diagnosis & Sensors 2025)
Figures:
Figure 1: Graphical workflow of the proposed methodology.
Figure 2: Flowchart of the signal preprocessing steps.
Figure 3: Intrinsic mode functions for (a) a non-leak signal and (b) a leak signal.
Figure 4: One-dimensional CNN architecture.
Figure 5: DenseNet architecture.
Figure 6: Experimental setup for pipeline leak detection.
Figure 7: Pipeline architecture for the experiment.
Figure 8: AE signals at 13-bar pressure: (a) normal; (b) leak.
Figure 9: AE signals at 18-bar pressure: (a) normal; (b) leak.
Figure 10: Confusion matrices for leak detection of (a) the proposed method, (b) 1D CNN, (c) LSTM, and (d) XGBoost.
Figure 11: Confusion matrices for leak size identification of (a) the proposed method, (b) 1D CNN, (c) LSTM, and (d) XGBoost.
Figure 12: t-SNE plots for leak detection of (a) the proposed method, (b) 1D CNN, (c) LSTM, and (d) XGBoost.
Figure 13: t-SNE plots for leak size identification of (a) the proposed method, (b) 1D CNN, (c) LSTM, and (d) XGBoost.
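Since the detector above is a customized one-dimensional DenseNet, a compact sketch of a 1D dense block may help illustrate the idea; the growth rate, layer count, and classifier head below are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch: a one-dimensional dense block stack for AE signal classification.
# Growth rate, depth, and head size are illustrative assumptions.
import torch
import torch.nn as nn

class DenseLayer1D(nn.Module):
    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm1d(in_channels), nn.ReLU(inplace=True),
            nn.Conv1d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        # Dense connectivity: concatenate the new features onto all previous ones.
        return torch.cat([x, self.block(x)], dim=1)

class DenseNet1D(nn.Module):
    def __init__(self, num_classes: int = 4, growth_rate: int = 16, num_layers: int = 4):
        super().__init__()
        channels = 32
        layers = [nn.Conv1d(1, channels, kernel_size=7, stride=2, padding=3)]
        for _ in range(num_layers):
            layers.append(DenseLayer1D(channels, growth_rate))
            channels += growth_rate
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                  nn.Linear(channels, num_classes))

    def forward(self, x):            # x: (batch, 1, signal_length)
        return self.head(self.features(x))

model = DenseNet1D()
logits = model(torch.randn(8, 1, 4096))   # e.g., denoised AE segments
print(logits.shape)                       # torch.Size([8, 4])
```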
18 pages, 8926 KiB  
Article
Research on Damage Detection Methods for Concrete Beams Based on Ground Penetrating Radar and Convolutional Neural Networks
by Ning Liu, Ya Ge, Xin Bai, Zi Zhang, Yuhao Shangguan and Yan Li
Appl. Sci. 2025, 15(4), 1882; https://doi.org/10.3390/app15041882 - 12 Feb 2025
Abstract
Ground penetrating radar (GPR) is a mature and important research method in the field of structural non-destructive testing. However, when the detection target scale is small and the amount of data collected is limited, it poses a serious challenge for this research method. In order to verify the applicability of typical one-dimensional radar signals combined with convolutional neural networks (CNN) in the non-destructive testing of concrete structures, this study created concrete specimens with embedded defects (voids, non-dense solids, and cracks) commonly found in concrete structures in a laboratory setting. High-frequency GPR equipment is used for data acquisition, A-scan data corresponding to different defects is extracted as a training set, and appropriate labeling is carried out. The extracted original radar signals were taken as the input of the CNN model. At the same time, in order to improve the sensitivity of the CNN models to specific damage types, the spectrums of A-scan are also used as part of the training datasets of the CNN models. In this paper, two CNN models with different dimensions are used to train the datasets and evaluate the classification results; one is the traditional one-dimensional CNN model, and the other is the classical two-dimensional CNN architecture AlexNet. In addition, the finite difference time domain (FDTD) model of three-dimensional complex media is established by gprMax, and the propagation characteristics of GPR in concrete media are simulated. The results of applying this method to both simulated and experimental data show that combining the A-scan data of ground penetrating radar and their spectrums as input with the CNN model can effectively identify different types of damage and defects inside the concrete structure. Compared with the one-dimensional CNN model, AlexNet has obvious advantages in extracting complex signal features and processing high-dimensional data. The feasibility of this method in the research field of damage detection of concrete structures has been verified. Full article
(This article belongs to the Special Issue Ground Penetrating Radar: Data, Imaging, and Signal Analysis)
Figures:
Figure 1: Diagram of the GPR concrete-distress detection principle.
Figure 2: Process of the one-dimensional convolution layer.
Figure 3: Process of the two-dimensional convolution layer.
Figure 4: Process of the pooling layer.
Figure 5: Diagram of the one-dimensional CNN.
Figure 6: Diagram of the two-dimensional CNN.
Figure 7: (a) H_B0: reinforced concrete beam; (b) H_BF1: 40 mm diameter PVC pipe and 60 mm side length non-dense solid; (c) H_BF2: 25 mm diameter PVC pipe and 30 mm side length plastic foam.
Figure 8: A heterogeneous numerical model of concrete.
Figure 9: Models of concrete beams with defects: (a) H_B0; (b) H_BF1; (c) H_BF2.
Figure 10: Concrete surface line tracks.
Figure 11: The setting of prefabricated defects in concrete beams: (a) non-dense material and void; (b) cracks generated during the experiment.
Figure 12: (a) GSSI GPR equipment and (b) measurement process.
Figure 13: Radar profile features of different defects ((a-c) simulation results; (d-f) experiment results).
Figure 14: Comparison of (a) simulated A-scans and (b) spectra of simulated A-scans for three defect types; (c) experimental A-scans and (d) spectra of experimental A-scans for three defect types.
Figure 15: Training curves of the two convolutional networks: (a) the one-dimensional CNN; (b) the two-dimensional CNN.
Figure 16: Classification results of the one-dimensional CNN model for simulated data: (a) the A-scan; (b) the spectrum of the A-scan.
Figure 17: Classification results of the two-dimensional CNN model for simulated data: (a) the A-scan; (b) the spectrum of the A-scan.
Figure 18: Classification results of the one-dimensional CNN model for experimental data: (a) the A-scan; (b) the spectrum of the A-scan.
Figure 19: Classification results of the two-dimensional CNN model for experimental data: (a) the A-scan; (b) the spectrum of the A-scan.
Figure 20: Classification results of the two-dimensional CNN model for merged data: (a) the A-scan; (b) the spectrum of the A-scan.
Figure 21: Accuracy values for training on experimental and simulated data using (a) the one-dimensional CNN model and (b) the two-dimensional CNN model.
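The spectrum-augmented input described above amounts to pairing each A-scan with its magnitude spectrum before feeding the CNN; the sampling interval, normalization, and synthetic example below are illustrative assumptions.

```python
# Sketch: build (A-scan, spectrum) training inputs for the 1D CNN.
# Sampling step and normalization choices are illustrative assumptions.
import numpy as np

def ascan_with_spectrum(ascan: np.ndarray, dt_ns: float = 0.02):
    """Return the normalized A-scan, its amplitude spectrum, and the frequency axis.

    ascan: 1D GPR trace sampled every dt_ns nanoseconds.
    """
    trace = (ascan - ascan.mean()) / (ascan.std() + 1e-12)   # zero-mean, unit variance
    spectrum = np.abs(np.fft.rfft(trace))                     # one-sided amplitude spectrum
    freqs_ghz = np.fft.rfftfreq(trace.size, d=dt_ns)          # cycles per ns = GHz
    spectrum /= spectrum.max() + 1e-12
    return trace, spectrum, freqs_ghz

# Example with a synthetic wavelet standing in for a measured A-scan.
t = np.arange(512) * 0.02
synthetic = np.sin(2 * np.pi * 1.6 * t) * np.exp(-((t - 3.0) ** 2))  # ~1.6 GHz burst
trace, spectrum, freqs = ascan_with_spectrum(synthetic)
print(trace.shape, spectrum.shape)   # (512,) (257,)
```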
24 pages, 16681 KiB  
Article
A Deep Ensemble Learning Approach Based on a Vision Transformer and Neural Network for Multi-Label Image Classification
by Anas W. Abulfaraj and Faisal Binzagr
Big Data Cogn. Comput. 2025, 9(2), 39; https://doi.org/10.3390/bdcc9020039 - 11 Feb 2025
Abstract
Convolutional Neural Networks (CNNs) have proven to be very effective in image classification due to their powerful feature learning capability. Traditional approaches have considered the problem of multiclass classification, where the goal is to classify a set of objects at once. However, co-occurrence can make the discriminative features of the target less salient and may lead to overfitting of the model, resulting in lower performance. To address this, we propose a multi-label classification ensemble model including a Vision Transformer (ViT) and CNNs for directly detecting one or multiple objects in an image. First, we improve the MobileNetV2 and DenseNet201 models using extra convolutional layers to strengthen image classification. In detail, three convolution layers are applied in parallel at the end of both models. ViT can learn dependencies among distant positions as well as local detail, making it an effective tool for multi-label classification. Finally, an ensemble learning algorithm is used to combine the classification predictions of the ViT, the modified MobileNetV2, and the modified DenseNet201 branches for increased image classification accuracy using a voting system. The performance of the proposed model is examined on four benchmark datasets, achieving accuracies of 98.24%, 98.89%, 99.91%, and 96.69% on PASCAL VOC 2007, PASCAL VOC 2012, MS-COCO, and NUS-WIDE 318, respectively, showing that our framework can enhance current state-of-the-art methods.
Figures:
Figure 1: Graphical illustration of the proposed model.
Figure 2: Impact of preprocessing steps. Images are sourced from the publicly available PASCAL VOC 2007 [62], PASCAL VOC 2012 [63], MS-COCO [64], and NUS-WIDE [65] datasets.
Figure 3: Encoder-based transformer with multi-head attention and MLP attention blocks.
Figure 4: Proposed modified MobileNetV2 model blocks.
Figure 5: Proposed modified DenseNet201 model.
Figure 6: Sample images in each dataset (PASCAL VOC 2007 [62], PASCAL VOC 2012 [63], MS-COCO [64], and NUS-WIDE [65]).
Figure 7: Training and validation graph comparison for (a) Dataset A, (b) Dataset B, (c) subDataset C, and (d) subDataset D.
Figure 8: Comparison of mAP on all selected datasets using ViT, modified MobileNetV2, modified DenseNet201, and the proposed model.
Figure 9: Predicted labels on each dataset using the proposed model (PASCAL VOC 2007 [62], PASCAL VOC 2012 [63], MS-COCO [64], and NUS-WIDE [65]).
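The soft-voting step described above can be illustrated in a few lines: per-label probabilities from the three branches are averaged and thresholded. The threshold and the stand-in probability arrays are assumptions for illustration.

```python
# Sketch: soft-voting ensemble for multi-label prediction from three branches
# (ViT, modified MobileNetV2, modified DenseNet201). Threshold is an assumption.
import numpy as np

def ensemble_multilabel(prob_vit: np.ndarray, prob_mnv2: np.ndarray,
                        prob_dn201: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Average per-label sigmoid probabilities and threshold to 0/1 label vectors."""
    mean_probs = (prob_vit + prob_mnv2 + prob_dn201) / 3.0
    return (mean_probs >= threshold).astype(int)

# Stand-in branch outputs for a batch of 2 images over 20 PASCAL VOC classes.
rng = np.random.default_rng(0)
p1, p2, p3 = (rng.random((2, 20)) for _ in range(3))
print(ensemble_multilabel(p1, p2, p3))
```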
18 pages, 16173 KiB  
Article
Comparative Analysis of Deep Learning Architectures for Macular Hole Segmentation in OCT Images: A Performance Evaluation of U-Net Variants
by H. M. S. S. Herath, S. L. P. Yasakethu, Nuwan Madusanka, Myunggi Yi and Byeong-Il Lee
J. Imaging 2025, 11(2), 53; https://doi.org/10.3390/jimaging11020053 - 11 Feb 2025
Abstract
This study presents a comprehensive comparison of U-Net variants with different backbone architectures for Macular Hole (MH) segmentation in optical coherence tomography (OCT) images. We evaluated eleven architectures, including U-Net combined with InceptionNetV4, VGG16, VGG19, ResNet152, DenseNet121, EfficientNet-B7, MobileNetV2, Xception, and Transformer. Models were assessed using the Dice coefficient and HD95 metrics on the OIMHS dataset. While HD95 proved unreliable for small regions like MH, often returning ‘nan’ values, the Dice coefficient provided consistent performance evaluation. InceptionNetV4 + U-Net achieved the highest Dice coefficient (0.9672), demonstrating superior segmentation accuracy. Although considered state-of-the-art, Transformer + U-Net showed poor performance in MH and intraretinal cyst (IRC) segmentation. Analysis of computational resources revealed that MobileNetV2 + U-Net offered the most efficient performance with minimal parameters, while InceptionNetV4 + U-Net balanced accuracy with moderate computational demands. Our findings suggest that CNN-based backbones, particularly InceptionNetV4, are more effective than Transformer architectures for OCT image segmentation, with InceptionNetV4 + U-Net emerging as the most promising model for clinical applications. Full article
Figures:
Figure 1: (a) U-Net encoder replaced by a CNN encoder; (b) workflow of this study.
Figure 2: Dice coefficient trends.
Figure 3: Metrics across models.
Figure 4: Performance comparison.
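Because the comparison above leans on the Dice coefficient (with HD95 proving unreliable for very small structures such as the macular hole), a short sketch of the per-class Dice computation may be useful; the smoothing constant and toy masks are assumptions.

```python
# Sketch: Dice coefficient for a binary mask of one class (e.g., the macular hole).
# The smoothing term avoids division by zero on empty masks and is an assumption.
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, smooth: float = 1e-6) -> float:
    """pred and target are binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

pred = np.zeros((8, 8), dtype=int); pred[2:5, 2:5] = 1
target = np.zeros((8, 8), dtype=int); target[3:6, 3:6] = 1
print(round(dice_coefficient(pred, target), 3))  # 0.444: 4 overlapping pixels out of 9 + 9
```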
21 pages, 3599 KiB  
Article
Using Deep Learning to Identify Deepfakes Created Using Generative Adversarial Networks
by Jhanvi Jheelan and Sameerchand Pudaruth
Computers 2025, 14(2), 60; https://doi.org/10.3390/computers14020060 - 10 Feb 2025
Abstract
Generative adversarial networks (GANs) have revolutionised various fields by creating highly realistic images, videos, and audio, thus enhancing applications such as video game development and data augmentation. However, this technology has also given rise to deepfakes, which pose serious challenges due to their potential to create deceptive content. Thousands of media reports have informed us of such occurrences, highlighting the urgent need for reliable detection methods. This study addresses the issue by developing a deep learning (DL) model capable of distinguishing between real and fake face images generated by StyleGAN. Using a subset of the 140K real and fake face dataset, we explored five different models: a custom CNN, ResNet50, DenseNet121, MobileNet, and InceptionV3. We leveraged the pre-trained models to utilise their robust feature extraction and computational efficiency, which are essential for distinguishing between real and fake features. Through extensive experimentation with various dataset sizes, preprocessing techniques, and split ratios, we identified the optimal ones. The 20k_gan_8_1_1 dataset produced the best results, with MobileNet achieving a test accuracy of 98.5%, followed by InceptionV3 at 98.0%, DenseNet121 at 97.3%, ResNet50 at 96.1%, and the custom CNN at 86.2%. All of these models were trained on only 16,000 images and validated and tested on 2000 images each. The custom CNN model was built with a simpler architecture of two convolutional layers and, hence, lagged in accuracy due to its limited feature extraction capabilities compared with deeper networks. This research work also included the development of a user-friendly web interface that allows deepfake detection by uploading images. The web interface backend was developed using Flask, enabling real-time deepfake detection, allowing users to upload images for analysis and demonstrating a practical use for platforms in need of quick, user-friendly verification. This application demonstrates significant potential for practical applications, such as on social media platforms, where the model can help prevent the spread of fake content by flagging suspicious images for review. This study makes important contributions by comparing different deep learning models, including a custom CNN, to understand the balance between model complexity and accuracy in deepfake detection. It also identifies the best dataset setup that improves detection while keeping computational costs low. Additionally, it introduces a user-friendly web tool that allows real-time deepfake detection, making the research useful for social media moderation, security, and content verification. Nevertheless, identifying specific features of GAN-generated deepfakes remains challenging due to their high realism. Future works will aim to expand the dataset by using all 140,000 images, refine the custom CNN model to increase its accuracy, and incorporate more advanced techniques, such as Vision Transformers and diffusion models. The outcomes of this study contribute to the ongoing efforts to counteract the negative impacts of GAN-generated images. Full article
Figures:
Figure 1: Example of a GAN [12].
Figure 2: Fake images generated by StyleGAN from [10].
Figure 3: Examples of face images in the 140K real and fake face dataset [9].
Figure 4: Cropping operation on an image.
Figure 5: Detailed architecture of the system.
Figure 6: Website interacting with the server.
Figure 7: Web interface of the application showing a correct prediction.
Figure 8: Flowchart showing the image prediction process for the user.
Figure 9: Graph for the analysis of the results of gan_8_1_1.
Figure 10: Bar chart comparing the accuracy of all models.
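The Flask backend described above boils down to an upload endpoint that runs the classifier and returns a label; the sketch below assumes a saved MobileNet-based Keras model with a sigmoid output and a 0.5 decision threshold, all of which are illustrative.

```python
# Sketch: Flask endpoint for real/fake face prediction (model path, input size,
# and decision threshold are illustrative assumptions).
import io
import numpy as np
from PIL import Image
from flask import Flask, request, jsonify
import tensorflow as tf

app = Flask(__name__)
model = tf.keras.models.load_model("mobilenet_deepfake.h5")   # hypothetical file

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files.get("image")
    if file is None:
        return jsonify({"error": "no image uploaded"}), 400
    img = Image.open(io.BytesIO(file.read())).convert("RGB").resize((224, 224))
    x = np.asarray(img, dtype="float32")[None, ...] / 255.0
    prob_fake = float(model.predict(x, verbose=0)[0][0])       # sigmoid output assumed
    return jsonify({"label": "fake" if prob_fake >= 0.5 else "real",
                    "confidence": prob_fake})

if __name__ == "__main__":
    app.run(debug=False)
```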
17 pages, 1944 KiB  
Article
Pediatric Pneumonia Recognition Using an Improved DenseNet201 Model with Multi-Scale Convolutions and Mish Activation Function
by Petra Radočaj, Dorijan Radočaj and Goran Martinović
Algorithms 2025, 18(2), 98; https://doi.org/10.3390/a18020098 - 10 Feb 2025
Abstract
Pediatric pneumonia remains a significant global health issue, particularly in low- and middle-income countries, where it contributes substantially to mortality in children under five. This study introduces a deep learning model for pediatric pneumonia diagnosis from chest X-rays that surpasses the performance of state-of-the-art methods reported in the recent literature. Using a DenseNet201 architecture with a Mish activation function and multi-scale convolutions, the model was trained on a dataset of 5856 chest X-ray images, achieving high performance: 0.9642 accuracy, 0.9580 precision, 0.9506 sensitivity, 0.9542 F1 score, and 0.9507 specificity. These results demonstrate a significant advancement in diagnostic precision and efficiency within this domain. By achieving the highest accuracy and F1 score compared to other recent work using the same dataset, our approach offers a tangible improvement for resource-constrained environments where access to specialists and sophisticated equipment is limited. While the need for high-quality datasets and adequate computational resources remains a general consideration for deep learning applications, our model’s demonstrably superior performance establishes a new benchmark and offers the delivery of more timely and precise diagnoses, with the potential to significantly enhance patient outcomes. Full article
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (3rd Edition))
Figures:
Figure 1: Study workflow for pneumonia recognition using a DenseNet architecture, Mish activation function, and multi-scale convolutions: (1) input data preparation for the healthy and pneumonia image classes; (2) implementation of the proposed deep learning model; (3) evaluation of the model's pneumonia-recognition performance, including accuracy assessment.
Figure 2: Samples of the pneumonia and healthy chest X-ray images used in this study.
Figure 3: Confusion matrix for the proposed methodology (counts with class-specific percentages).
Figure 4: Grad-CAM heatmaps demonstrating model focus in pediatric pneumonia diagnosis.
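The multi-scale convolutions and Mish activation described above can be sketched roughly as a parallel-branch head on top of a DenseNet201 feature extractor; the kernel sizes, channel counts, and fusion scheme are assumptions, not the authors' exact design.

```python
# Sketch: multi-scale convolution block with Mish activation on top of
# DenseNet201 features (kernel sizes and channel counts are assumptions).
import torch
import torch.nn as nn
from torchvision import models

class MultiScaleMishHead(nn.Module):
    def __init__(self, in_channels: int = 1920, num_classes: int = 2):
        super().__init__()
        # Parallel branches with different receptive fields, each followed by Mish.
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(in_channels, 256, kernel_size=k, padding=k // 2),
                          nn.BatchNorm2d(256), nn.Mish())
            for k in (1, 3, 5)
        ])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(256 * 3, num_classes)

    def forward(self, feats):
        multi_scale = torch.cat([b(feats) for b in self.branches], dim=1)
        return self.classifier(self.pool(multi_scale).flatten(1))

backbone = models.densenet201(weights="IMAGENET1K_V1").features   # outputs 1920 channels
model = nn.Sequential(backbone, MultiScaleMishHead())
logits = model(torch.randn(2, 3, 224, 224))   # e.g., X-rays replicated to 3 channels
print(logits.shape)                           # torch.Size([2, 2])
```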