Search Results (2,388)

Search Parameters:
Keywords = deep context

16 pages, 3864 KiB  
Article
Structural and Functional Differences in the Bacterial Community of Chernozem Soil Under Conventional and Organic Farming Conditions
by Darya V. Poshvina, Alexander S. Balkin, Anastasia V. Teslya, Diana S. Dilbaryan, Artyom A. Stepanov, Sergey V. Kravchenko and Alexey S. Vasilchenko
Agriculture 2024, 14(12), 2127; https://doi.org/10.3390/agriculture14122127 (registering DOI) - 24 Nov 2024
Viewed by 135
Abstract
The conventional farming system, which predominates in most countries, is based on the use of agrochemicals, deep ploughing, and other specialised methods. However, intensive farming has several negative impacts, including soil and water pollution and reduced biodiversity. The microbial community plays a crucial role in maintaining the health of agricultural ecosystems. In this context, we need to study how different agricultural practices affect the structural and functional characteristics of agricultural ecosystems. This study assessed the diversity, structure, and functional characteristics of the soil bacterial community in two different cropping systems. The subjects of the study were samples of Chernozem soil, which had been cultivated using the organic method for 11 years and the conventional method for 20 years. The fields are located in the southern part of the Russian Federation. Our results indicated minimal differences in the microbial diversity and soil community composition between the two systems studied. The profiling of the soil bacterial community revealed differences in the abundances of Proteobacteria, Bacteroidota, and Cyanobacteria, which predominated in the conventional farming system (CFS), while Methylomirabilota and Fusobacteriota were more abundant in the organic farming system (OFS). Bacterial taxa and functional genes associated with nitrogen, phosphorus, and sulphur cycling were found to be more abundant in CFS soils than in OFS soils. The instrumental measurement of soil metabolic activity and microbial biomass content showed that CFS soils had higher microbiome activity than OFS soils. Overall, the study found that the agronomic practices used in conventional farming not only help to maintain the functional properties of the soil microbiome, but also significantly increase its microbiological activity and nutrient bioconversion, compared to organic farming practices. Full article
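As a side note to the diversity metrics named above, the following minimal Python sketch computes the Shannon index and the (bias-corrected) Chao1 richness estimator from a vector of per-taxon read counts. It is a generic illustration, not the authors' pipeline, and the example counts are invented.

```python
import numpy as np

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1))."""
    counts = np.asarray(counts)
    s_obs = int((counts > 0).sum())   # observed taxa
    f1 = int((counts == 1).sum())     # singletons
    f2 = int((counts == 2).sum())     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical OTU count vector for one soil sample
sample = [120, 43, 7, 1, 1, 2, 0, 15, 1, 60]
print(shannon_index(sample), chao1(sample))
```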
Show Figures

Figure 1: Richness and diversity of bacteria in various cropping systems. The average levels of soil bacterial richness and diversity, measured by the Chao1 and Shannon index, respectively, were compared between two studied groups (a). Principal coordinate analysis (PCoA) plots of beta diversity of the bacterial community structures between the two farming systems (b) (n = 8 per farming system).
Figure 2: The relative abundance of bacterial taxa at phylum (a) and genus (b) levels. Top 10 phyla and top 20 genera are shown, and the rest are merged into others.
Figure 3: The heat map shows the difference in abundance of bacterial genera between organic and conventional systems. The genera were clustered using Euclidean distance. The relative abundance of each taxon is shown using a color gradient from blue (indicating low abundance) to red (indicating high abundance) (p < 0.05). The samples were grouped using Euclidean distance and the complete linkage method.
Figure 4: The heatmap of selected KEGG genes predicted with PICRUSt. The normalized relative abundance of each gene is indicated by a color (blue: low abundance; red: high abundance).
15 pages, 767 KiB  
Article
A Model to Strengthen the Quality of Midwifery Education: A Grounded Theory Approach
by Waleola B. Ige and Winnie B. Ngcobo
Int. Med. Educ. 2024, 3(4), 473-487; https://doi.org/10.3390/ime3040036 (registering DOI) - 24 Nov 2024
Viewed by 170
Abstract
A well-educated midwifery workforce is critical to providing quality health services. However, the quality of midwifery education in Nigeria is identified as a factor contributing to the country’s poor maternal and neonatal health outcomes and inability to meet global development goals. This study analysed the process used to strengthen the quality of midwifery education in order to generate a middle-range model for preparing competent and confident midwifery graduates. The Strauss and Corbin version of the Grounded Theory approach, underpinned by the Social Constructivism Paradigm, was adopted for this qualitative study. Strengthening the quality of midwifery education (SQME) emerged as the model’s core phenomenon. Major concepts, including the midwifery education context, nature of the curriculum, SQME process, pillars, and outcomes, supported the core phenomenon. Strengthening the quality of midwifery education can be achieved over time, provided the pillars of SQME are deep-rooted enough to sustain the process. The model can be used to strengthen the quality of midwifery education and may be adapted to nursing/allied health programmes in Nigeria and other developing countries. Full article
Show Figures

Figure 1: Visual representation of the summary of findings in line with Strauss and Corbin’s paradigm.
Figure 2: A model to strengthen the quality of midwifery education.
14 pages, 1209 KiB  
Article
Investigation of Nonlinear Relations Among Flow Profiles Using Artificial Neural Networks
by Shiming Yuan, Caixia Chen, Yong Yang and Yonghua Yan
Fluids 2024, 9(12), 276; https://doi.org/10.3390/fluids9120276 (registering DOI) - 23 Nov 2024
Viewed by 139
Abstract
This study investigated the ability of artificial neural networks (ANNs) to resolve the nonlinear dynamics inherent in complex fluid flows, which often exhibit multifaceted characteristics that challenge traditional analytical or numerical methods. Using flow profile pairs generated through high-fidelity numerical simulations, encompassing both one-dimensional benchmark problems and a more intricate three-dimensional boundary layer transition problem, the research demonstrates that neural networks can effectively capture the discontinuities and the subtle small-scale wave characteristics that occur within complex fluid flows, showcasing their robustness in handling intricate fluid dynamics phenomena. Even for the challenging three-dimensional problem, the average velocity profiles could be predicted with a high degree of accuracy using a limited number of input profiles during training, underscoring the efficiency and efficacy of the model. These findings highlight the potential of artificial neural networks and deep learning methodologies to advance our comprehension of the fundamental physics governing complex fluid dynamics systems, demonstrate their applicability across a variety of flow scenarios, and yield insight into the nonlinear relationships among diverse flow parameters, paving the way for future research in this area. Full article
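The abstract describes mapping flow profiles to flow profiles with an ANN. The sketch below shows one plausible minimal setup in PyTorch: a small fully connected network trained with a mean-squared-error loss on profile pairs. The profile length, layer sizes, and training data are assumptions for illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

# Hypothetical profile length; the paper's actual resolutions are not reproduced here.
N_POINTS = 128

# A small fully connected network that maps one discretized flow profile
# (e.g., a velocity profile sampled at N_POINTS locations) to another profile.
model = nn.Sequential(
    nn.Linear(N_POINTS, 256),
    nn.Tanh(),
    nn.Linear(256, 256),
    nn.Tanh(),
    nn.Linear(256, N_POINTS),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Dummy training pairs standing in for simulation-generated profile pairs.
x = torch.randn(32, N_POINTS)   # input profiles
y = torch.randn(32, N_POINTS)   # target profiles

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```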
15 pages, 4989 KiB  
Article
End-to-End Latency Optimization for Resilient Distributed Convolutional Neural Network Inference in Resource-Constrained Unmanned Aerial Vehicle Swarms
by Jeongho Kim, Joonho Seon, Soohyun Kim, Seongwoo Lee, Jinwook Kim, Byungsun Hwang, Youngghyu Sun and Jinyoung Kim
Appl. Sci. 2024, 14(23), 10832; https://doi.org/10.3390/app142310832 - 22 Nov 2024
Viewed by 313
Abstract
An unmanned aerial vehicle (UAV) swarm has emerged as a powerful tool for mission execution in a variety of applications supported by deep neural networks (DNNs). In the context of UAV swarms, conventional methods for efficient data processing involve transmitting data to cloud and edge servers. However, these methods often face limitations in adapting to real-time applications due to the high latency of cloud-based approaches and the weak mobility of edge-based approaches. In this paper, a new system called deep reinforcement learning-based resilient layer distribution (DRL-RLD) for distributed inference is designed to minimize end-to-end latency in a UAV swarm, considering the resource constraints of UAVs. The proposed system dynamically allocates CNN layers based on UAV-to-UAV and UAV-to-ground communication links to minimize end-to-end latency. It can also enhance resilience to maintain mission continuity by reallocating layers when UAVs become inoperable. The performance of the proposed system was verified through simulations in terms of latency against baseline methods, and its robustness was demonstrated in the presence of inoperable UAVs. Full article
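To make the end-to-end latency objective concrete, here is a toy Python model of splitting a network's layers across a chain of UAVs, where each hand-off between UAVs adds link latency. It is not the DRL-RLD policy described in the paper; all compute times and link delays are invented.

```python
# A toy latency model for splitting a CNN's layers across a chain of UAVs.
# This is NOT the DRL-RLD policy from the paper; it only illustrates the
# end-to-end latency objective that such a policy would minimize.

compute_ms = [4.0, 6.0, 3.0, 8.0, 5.0]      # per-layer compute time (same on every UAV here)
link_ms = {(0, 1): 2.0, (1, 2): 5.0}        # UAV-to-UAV transfer delay per hand-off

def end_to_end_latency(assignment):
    """assignment[i] = index of the UAV that runs layer i (non-decreasing along the chain)."""
    total = sum(compute_ms)                  # compute cost is paid regardless of placement
    for i in range(1, len(assignment)):
        a, b = assignment[i - 1], assignment[i]
        if a != b:                           # a hand-off between UAVs adds link latency
            total += link_ms[(a, b)]
    return total

# Compare two placements of five layers over three UAVs.
print(end_to_end_latency([0, 0, 1, 1, 2]))   # two hand-offs
print(end_to_end_latency([0, 0, 0, 1, 1]))   # one hand-off
```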
(This article belongs to the Special Issue Novel Advances in Internet of Vehicles)
17 pages, 1492 KiB  
Article
Deep Learning-Based Infrared Image Segmentation for Aircraft Honeycomb Water Ingress Detection
by Hang Fei, Hongfu Zuo, Han Wang, Yan Liu, Zhenzhen Liu and Xin Li
Aerospace 2024, 11(12), 961; https://doi.org/10.3390/aerospace11120961 - 22 Nov 2024
Viewed by 319
Abstract
The presence of water accumulation on aircraft surfaces constitutes a considerable hazard to both performance and safety, necessitating vigilant inspection and maintenance protocols. In this study, we introduce an innovative semantic segmentation model, grounded in deep learning principles, for the precise identification and delineation of water accumulation areas within infrared images of aircraft exteriors. Our proposed model harnesses the robust features of ResNet, serving as the foundational architecture for U-Net, thereby augmenting the model’s capacity for comprehensive feature characterization. The incorporation of channel attention mechanisms, spatial attention mechanisms, and depthwise separable convolution further refines the network structure, contributing to enhanced segmentation performance. Through rigorous experimentation, our model surpasses existing benchmarks, yielding a commendable 22.44% reduction in computational effort and a substantial 38.89% reduction in parameter count. The model’s outstanding performance is particularly noteworthy, registering a 92.67% mean intersection over union and a 97.97% mean pixel accuracy. The hallmark of our innovation lies in the model’s efficacy in the precise detection and segmentation of water accumulation areas on aircraft skin. Beyond this, our approach holds promise for addressing analogous challenges in aviation and related domains. The enumeration of specific quantitative outcomes underscores the superior efficacy of our model, rendering it a compelling solution for precise detection and segmentation tasks. The demonstrated reductions in computational effort and parameter count underscore the model’s efficiency, fortifying its relevance in broader contexts. Full article
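One of the building blocks named above, depthwise separable convolution, can be sketched in a few lines of PyTorch. This is a generic implementation of the operation, not the paper's full ResNet/U-Net/CBAM network; channel counts and input sizes are arbitrary.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3 convolution
    followed by a 1x1 pointwise convolution that mixes channels. Compared with a
    standard 3x3 convolution it greatly reduces parameters and FLOPs."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 64, 128, 128)                   # a dummy feature map
print(DepthwiseSeparableConv(64, 128)(x).shape)    # torch.Size([1, 128, 128, 128])
```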
(This article belongs to the Section Aeronautics)
Show Figures

Figure 1: Structure of aircraft skin.
Figure 2: Schematic of aircraft skin water accumulation detection via deep learning.
Figure 3: Structure of the proposed network.
Figure 4: Structure of CBAM.
Figure 5: Processes in the channel attention module.
Figure 6: Processes in the spatial attention module.
Figure 7: Processes in depthwise separable convolution.
Figure 8: Diagram of research method.
Figure 9: The FLIR E8 infrared thermal camera.
Figure 10: Example of the dataset. (a) Original image. (b) Processed image. (c) Ground truth.
Figure 11: Diagram of the evaluation index.
Figure 12: Examples of segmentation results.
16 pages, 5582 KiB  
Article
Evaluating Brain Tumor Detection with Deep Learning Convolutional Neural Networks Across Multiple MRI Modalities
by Ioannis Stathopoulos, Luigi Serio, Efstratios Karavasilis, Maria Anthi Kouri, Georgios Velonakis, Nikolaos Kelekis and Efstathios Efstathopoulos
J. Imaging 2024, 10(12), 296; https://doi.org/10.3390/jimaging10120296 - 21 Nov 2024
Viewed by 298
Abstract
Central Nervous System (CNS) tumors represent a significant public health concern due to their high morbidity and mortality rates. Magnetic Resonance Imaging (MRI) has emerged as a critical non-invasive modality for the detection, diagnosis, and management of brain tumors, offering high-resolution visualization of anatomical structures. Recent advancements in deep learning, particularly convolutional neural networks (CNNs), have shown potential in augmenting MRI-based diagnostic accuracy for brain tumor detection. In this study, we evaluate the diagnostic performance of six fundamental MRI sequences in detecting tumor-involved brain slices using four distinct CNN architectures enhanced with transfer learning techniques. Our dataset comprises 1646 MRI slices from the examinations of 62 patients, encompassing both tumor-bearing and normal findings. With our approach, we achieved a classification accuracy of 98.6%, underscoring the high potential of CNN-based models in this context. Additionally, we assessed the performance of each MRI sequence across the different CNN models, identifying optimal combinations of MRI modalities and neural networks to meet radiologists’ screening requirements effectively. This study offers critical insights into the integration of deep learning with MRI for brain tumor detection, with implications for improving diagnostic workflows in clinical settings. Full article
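A minimal sketch of the transfer-learning recipe mentioned above (pretrained CNN backbone, new classification head for tumor vs. normal slices) might look as follows in PyTorch/torchvision. ResNet-18, the two-class head, and the training hyperparameters are illustrative assumptions; the paper evaluates four CNN architectures that are not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone; weights="DEFAULT" requires torchvision >= 0.13,
# adjust to your installed version if needed.
backbone = models.resnet18(weights="DEFAULT")

# Freeze the convolutional layers and replace the classifier with a
# two-class head (tumor-involved slice vs. normal slice).
for p in backbone.parameters():
    p.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 2)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One dummy training step on a batch of 3-channel 224x224 slices.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
loss = criterion(backbone(images), labels)
loss.backward()
optimizer.step()
```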
Show Figures

Figure 1: Six different MRI sequences of a normal brain examination. From left to right and top to bottom: T1, T2, FLAIR, T1+C, Diffusion, apparent diffusion coefficient (ADC) map.
Figure 2: Six different MRI sequences of a verified brain tumor examination. From left to right and top to bottom: T1, T2, FLAIR, T1+C, Diffusion, and ADC.
Figure 3: Image representation of the preprocessing steps.
Figure 4: One normal and two tumor examinations are shown for all six MRI sequences. In all images, the original image is displayed on the left, and the overlap with the heatmap produced from the last convolutional layer of the VGG16 model is displayed on the right. In the titles, N represents the Normal class and T represents the Tumor class, both followed by the prediction probability for the respective class. Misclassified cases are highlighted in red.
Scheme 1: ROCs for the FLAIR sequence.
Scheme 2: ROCs for the T1+C sequence.
Scheme 3: ROCs for the ADC sequence.
Scheme 4: ROCs for the T1 sequence.
Scheme 5: ROCs for the Diffusion sequence.
Scheme 6: ROCs for the T2 sequence.
Scheme 7: (Left) The evaluation metric results of the experiment on the whole dataset. (Right) The corresponding ROC curve.
22 pages, 4118 KiB  
Article
Empirical Evidence Regarding Few-Shot Learning for Scene Classification in Remote Sensing Images
by Valdivino Alexandre de Santiago Júnior
Appl. Sci. 2024, 14(23), 10776; https://doi.org/10.3390/app142310776 - 21 Nov 2024
Viewed by 294
Abstract
Few-shot learning (FSL) is a learning paradigm which aims to address the issue that machine/deep learning techniques traditionally need huge amounts of labelled data to work well. The remote sensing (RS) community has explored this paradigm with numerous published studies to date. Nevertheless, there is still a need for clear pieces of evidence on FSL-related issues in the RS context, such as which of the inference approaches is more suitable: inductive or transductive? Moreover, how does the number of epochs used during training, based on the meta-training (base) dataset, relate to the number of unseen classes during inference? This study aims to address these and other relevant questions in the context of FSL for scene classification in RS images. A comprehensive evaluation was conducted considering eight FSL approaches (three inductive and five transductive) and six scene classification databases. Some conclusions of this research are as follows: (1) transductive approaches are better than inductive ones. In particular, the transductive technique Transductive Information Maximisation (TIM) presented the best overall performance, ranking first in 20 cases; (2) a larger number of training epochs is more beneficial when there are more unseen classes during the inference phase. The most impressive gains occurred for the AID (6-way) and RESISC-45 (9-way) datasets. Notably, in the AID dataset, a remarkable 58.412% improvement was achieved in 1-shot tasks going from 10 to 200 epochs; (3) using five samples in the support set is statistically significantly better than using only one; and (4) a higher similarity between unseen classes (during inference) and some of the training classes does not lead to improved performance. These findings can guide RS researchers and practitioners in selecting optimal solutions/strategies for developing their applications demanding few labelled samples. Full article
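For readers unfamiliar with the N-way K-shot setting, the sketch below runs a single episode with a generic inductive baseline: class prototypes are the means of the support embeddings, and queries are assigned to the nearest prototype. This only illustrates the episode structure, not the TIM method or any of the eight approaches evaluated in the study; embeddings and dimensions are made up.

```python
import numpy as np

def nearest_prototype_episode(support, support_labels, queries, n_way):
    """Classify query embeddings by the nearest class prototype (mean of its
    support embeddings). A generic inductive N-way K-shot baseline, not the
    TIM method evaluated in the paper."""
    prototypes = np.stack([support[support_labels == c].mean(axis=0) for c in range(n_way)])
    dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

# A made-up 3-way 5-shot episode with 16-dimensional embeddings.
rng = np.random.default_rng(0)
support = rng.normal(size=(15, 16))
support_labels = np.repeat(np.arange(3), 5)
queries = rng.normal(size=(9, 16))
print(nearest_prototype_episode(support, support_labels, queries, n_way=3))
```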
(This article belongs to the Topic Computational Intelligence in Remote Sensing: 2nd Edition)
Show Figures

Figure 1: An arrangement for a 2-way 1-shot few-shot task.
Figure 2: The workflow of the method related to this study.
Figure 3: Samples from the EuroSAT (top row: (a)–(e)) and XAI4SAR (bottom row: (f)–(j)) datasets.
Figure 4: Samples from the UC Merced (top row: (a)–(e)) and WHU-RS19 (bottom row: (f)–(j)) datasets. Caption: resid. = residential.
Figure 5: Samples from the AID (top row: (a)–(e)) and RESISC-45 (bottom row: (f)–(j)) datasets.
Figure 6: Q-Q plots for the 1-shot and 5-shot sets. (a) 1-shot set. (b) 5-shot set.
Figure 7: Average accuracies: 5-shot and 1-shot sets.
27 pages, 443456 KiB  
Article
ImageOP: The Image Dataset with Religious Buildings in the World Heritage Town of Ouro Preto for Deep Learning Classification
by André Luiz Carvalho Ottoni and Lara Toledo Cordeiro Ottoni
Heritage 2024, 7(11), 6499-6525; https://doi.org/10.3390/heritage7110302 - 20 Nov 2024
Viewed by 298
Abstract
Artificial intelligence has significant applications in computer vision studies for cultural heritage. In this research field, visual inspection of historical buildings and the digitization of heritage using machine learning models stand out. However, the literature still lacks datasets for the classification and identification of Brazilian religious buildings using deep learning, particularly with images from the historic town of Ouro Preto. It is noteworthy that Ouro Preto was the first Brazilian World Heritage Site recognized by UNESCO in 1980. In this context, this paper aims to address this gap by proposing a new image dataset, termed ImageOP: The Image Dataset with Religious Buildings in the World Heritage Town of Ouro Preto for Deep Learning Classification. This new dataset comprises 1613 images of facades from 32 religious monuments in the historic town of Ouro Preto, categorized into five classes: fronton (pediment), door, window, tower, and church. The experiments to validate the ImageOP dataset were conducted in two stages: simulations and computer vision using smartphones. Furthermore, two deep learning architectures (MobileNet V2 and EfficientNet B0) were evaluated using Edge Impulse software. MobileNet V2 and EfficientNet B0 are convolutional neural network architectures designed for computer vision applications aiming at low computational cost and real-time classification on mobile devices. The results indicated that the models utilizing EfficientNet achieved the best outcomes in the simulations, with accuracy = 94.5%, precision = 96.0%, recall = 96.0%, and F-score = 96.0%. Additionally, superior accuracy values were obtained in detecting the five classes: fronton (96.4%), church (97.1%), window (89.2%), door (94.7%), and tower (95.4%). The results from the experiments with computer vision and smartphones reinforced the effectiveness of the proposed dataset, showing an average accuracy of 88.0% in detecting building elements across nine religious monuments tested for real-time mobile device application. The dataset is available in the Mendeley Data repository. Full article
Show Figures

Figure 1: Methodology for the development of the ImageOP dataset.
Figure 2: Historic Town of Ouro Preto. (a) Chapel of the Governors Palace and Museum of Inconfidence. (b) Church of Saint Francis of Assisi and Pico do Itacolomi. (c) Church of Saint Efigenia and historic houses. (d) Mountains and historic buildings.
Figure 3: Regions of the historic town of Ouro Preto visited for the development of the ImageOP dataset. Source: modified from Google Maps.
Figure 4: Religious monuments of Ouro Preto (Part I): (a) Chapel of Lord of Bonfim; (b) Chapel of the Dry Bridge Pass; (c) Chapel of the Governors Palace; (d) Chapel of Saint Anthony; (e) Chapel of the Saint Kings; (f) Chapel of Saint Joseph; (g) Chapel of Our Lady of Piety; (h) Chapel of Our Lady of Conception; (i) Church of Our Lady of Mercy; (j) Chapel of Our Lady of Good Dispatch; (k) Chapel of Saint Luzia; (l) Church of Our Lady of Piety.
Figure 5: Religious monuments of Ouro Preto (Part II): (a) Church of Our Lady of Nazareth; (b) Church of Our Lady of Sorrows; (c) Basilica of Our Lady of Pilar; (d) Church of Saint Francis of Assisi; (e) Church of Our Lady of Mercy and Pardons; (f) Church of Saint Francis of Paula; (g) Church of Our Lady of Mount Carmel; (h) Sanctuary of Our Lady of Conception; (i) Church of Saint Anthony of Leite; (j) Church of Saint Anthony of Casa Branca; (k) Church of Good Jesus of Matosinhos and Saint Michael and Souls; (l) Church of Saint Efigenia; (m) Church of Our Lady of Mercy and Compassion; (n) Church of Saint Gonçalo; (o) Church of Our Lady of the Rosary; (p) Church of Saint Bartholomew; (q) Church of Our Lady of Mercy (Cachoeira do Campo); (r) Church of Our Lady of Sorrows of Mount Calvary; (s) Church of Saint Joseph; (t) Church of Our Lady of Mercy (São Bartolomeu).
Figure 6: Image collection process in the historic town of Ouro Preto. (a) Data collection of small church. (b) Data collection of church.
Figure 7: Kodak® PIXPRO AZ255 digital camera. (a) Front view of the camera. (b) Camera display.
Figure 8: (a) Building components in a church: (1) fronton; (2) door; (3) window; (4) tower. (b) Building components in a small church (chapel): (1) fronton; (2) door.
Figure 9: Examples of images from the fronton class. (a)–(l) Photographs of church pediments in the Historic Town of Ouro Preto.
Figure 10: Examples of images from the church class. (a)–(l) Photographs of churches in the Historic Town of Ouro Preto.
Figure 11: Examples of images from the window class. (a)–(l) Photographs of church windows in the Historic Town of Ouro Preto.
Figure 12: Examples of images from the door class. (a)–(l) Photographs of church doors in the Historic Town of Ouro Preto.
Figure 13: Examples of images from the tower class. (a)–(l) Photographs of church towers in the Historic Town of Ouro Preto.
Figure 14: Method of dataset benchmarking for deep learning classification.
Figure 15: Graph of the train and validation history for the MobileNet architecture: (a) accuracy; (b) loss.
Figure 16: Confusion matrix for the MobileNet architecture.
Figure 17: Graph of the train and validation history for the EfficientNet architecture: (a) accuracy; (b) loss.
Figure 18: Confusion matrix for the EfficientNet architecture.
Figure 19: Proposed procedure for applying computer vision using smartphones to recognize elements of religious buildings.
Figure 20: Process of computer vision using a mobile device in the historic town of São João del-Rei. (a) Church door detection. (b) Church detection.
Figure 21: Example of deep learning classification using computer vision with a mobile device and Edge Impulse software. Detected class: window (janela in Portuguese).
Figure 22: Examples of real-time classification from screenshots of the Edge Impulse graphical interface accessed on the mobile device.
13 pages, 860 KiB  
Article
Multi-Scale 3D Cephalometric Landmark Detection Based on Direct Regression with 3D CNN Architectures
by Chanho Song, Yoosoo Jeong, Hyungkyu Huh, Jee-Woong Park, Jun-Young Paeng, Jaemyung Ahn, Jaebum Son and Euisung Jung
Diagnostics 2024, 14(22), 2605; https://doi.org/10.3390/diagnostics14222605 - 20 Nov 2024
Viewed by 321
Abstract
Background: Cephalometric analysis is important in diagnosing and planning treatments for patients, traditionally relying on 2D cephalometric radiographs. With advancements in 3D imaging, automated landmark detection using deep learning has gained prominence. However, 3D imaging introduces challenges due to increased network complexity and computational demands. This study proposes a multi-scale 3D CNN-based approach utilizing direct regression to improve the accuracy of maxillofacial landmark detection. Methods: The method employs a coarse-to-fine framework, first identifying landmarks in a global context and then refining their positions using localized 3D patches. A clinical dataset of 150 CT scans from maxillofacial surgery patients, annotated with 30 anatomical landmarks, was used for training and evaluation. Results: The proposed method achieved an average RMSE of 2.238 mm, outperforming conventional 3D CNN architectures. The approach demonstrated consistent detection without failure cases. Conclusions: Our multi-scale-based 3D CNN framework provides a reliable method for automated landmark detection in maxillofacial CT images, showing potential for other clinical applications. Full article
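The coarse-to-fine idea described in the abstract can be illustrated with a small NumPy sketch: a coarse estimate selects a local patch from the CT volume, a (hypothetical) fine stage refines the coordinate inside that patch, and RMSE is computed over landmarks. Patch size, volume size, and the refinement offset are invented for the example, and the RMSE definition shown is one common convention.

```python
import numpy as np

def crop_patch(volume, center, size=32):
    """Extract a cubic patch around a coarse landmark estimate, clamped to the volume."""
    half = size // 2
    lo = np.clip(np.round(center).astype(int) - half, 0, np.array(volume.shape) - size)
    z, y, x = lo
    return volume[z:z + size, y:y + size, x:x + size], lo

def rmse(pred, truth):
    """Root of the mean squared Euclidean distance over landmarks (voxel or mm units)."""
    return float(np.sqrt(((pred - truth) ** 2).sum(axis=1).mean()))

# Toy volume and one landmark: a coarse network would output `coarse`;
# a second network would then predict an offset inside the cropped patch.
volume = np.random.rand(96, 96, 96)
truth = np.array([[48.0, 40.0, 52.0]])
coarse = np.array([[50.0, 38.0, 55.0]])
patch, origin = crop_patch(volume, coarse[0])
refined = origin + np.array([14.2, 17.6, 13.3])   # stand-in for the fine stage's output
print(rmse(coarse, truth), rmse(refined[None, :], truth))
```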
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
Show Figures

Figure 1: Diagram of the multi-scale cephalometric landmark detection architecture. Stage 1 involves coarse detection to identify coordinates and classes from the entire input volume, followed by generating local volumes through 3D region of interest (ROI) processing. Stage 2 focuses on fine localization using these local volumes and classes.
Figure 2: Steps for data preprocessing to convert volumetric CT data into normalized voxel data.
Figure 3: Convolutional Neural Network (CNN) architectures including ResNet, DenseNet, Inception, and InceptionResNet.
Figure 4: DenseNet169-based multi-output learning model for coarse detection.
Figure 5: DenseNet169-based multi-input learning model for fine localization.
18 pages, 4942 KiB  
Article
Unsupervised Anomaly Detection and Explanation in Network Traffic with Transformers
by André Kummerow, Esrom Abrha, Markus Eisenbach and Dennis Rösch
Electronics 2024, 13(22), 4570; https://doi.org/10.3390/electronics13224570 - 20 Nov 2024
Viewed by 338
Abstract
Deep learning-based autoencoders represent a promising technology for use in network-based attack detection systems. They offer significant benefits in managing unknown network traces or novel attack signatures. Specifically, in the context of critical infrastructures, such as power supply systems, AI-based intrusion detection systems must meet stringent requirements concerning model accuracy and trustworthiness. For the intrusion response, the activation of suitable countermeasures can greatly benefit from additional transparency information (e.g., attack causes). Transformers represent the state of the art for learning from sequential data and provide important model insights through the widespread use of attention mechanisms. This paper introduces a two-stage transformer-based autoencoder for learning meaningful information from network traffic at the packet and sequence level. Based on this, we present a sequential attention weight perturbation method to explain benign and malicious network packets. We evaluate our method against benchmark models and expert-based explanations using the CIC-IDS-2017 benchmark dataset. The results are promising in terms of detecting and explaining FTP and SSH brute-force attacks, substantially outperforming the benchmark model. Full article
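Autoencoder-based detection ultimately reduces to thresholding a reconstruction error. The sketch below shows a generic way to derive such a threshold from benign traffic and flag test sequences, assuming a simple quantile rule; the actual thresholding and error definition used in the paper may differ, and the error values here are synthetic.

```python
import numpy as np

def anomaly_flags(reconstruction_errors, quantile=0.99):
    """Flag samples whose autoencoder reconstruction error exceeds a threshold
    derived from errors on benign (training) traffic. The quantile rule is a
    common generic choice, not necessarily the thresholding used in the paper."""
    threshold = np.quantile(reconstruction_errors["benign_train"], quantile)
    return reconstruction_errors["test"] > threshold, threshold

# Made-up per-sequence reconstruction errors.
errors = {
    "benign_train": np.random.gamma(2.0, 0.05, size=5000),
    "test": np.array([0.08, 0.12, 0.95, 0.10, 1.40]),
}
flags, thr = anomaly_flags(errors)
print(thr, flags)   # the two large errors should be flagged as anomalous
```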
(This article belongs to the Special Issue Applications of Deep Learning in Cyber Threat Detection)
Show Figures

Figure 1: The encoder structure of the transformer-based network traffic autoencoder.
Figure 2: The decoder structure of the transformer-based network traffic autoencoder.
Figure 3: The structure of a multi-head self-attention block (K: keys; Q: queries; V: values).
Figure 4: Explanation model architecture with attention perturbation.
Figure 5: Influence values for exemplary positional losses L*_{PIF,D} using a Sigmoid function with different bandwidth parameters (L_{PIF,D} = 0.5).
Figure 6: Explanation procedure with perturbation of attention weights.
Figure 7: Training results of a T-NAE model (left: loss values; right: reconstruction values).
Figure 8: Reconstruction results of using a T-NAE model on the test dataset.
Figure 9: Influence values (top left: discrete inputs; top right: continuous inputs; bottom: packet sequence) for an exemplary benign network packet sequence.
Figure 10: Influence values (top left: discrete inputs; top right: continuous inputs; bottom: packet sequence) for an exemplary malicious network packet sequence.
Figure A1: Histogram-based threshold calculation for exemplary Gaussian errors.
18 pages, 9613 KiB  
Article
Toward Versatile Small Object Detection with Temporal-YOLOv8
by Martin C. van Leeuwen, Ella P. Fokkinga, Wyke Huizinga, Jan Baan and Friso G. Heslinga
Sensors 2024, 24(22), 7387; https://doi.org/10.3390/s24227387 - 20 Nov 2024
Viewed by 291
Abstract
Deep learning has become the preferred method for automated object detection, but the accurate detection of small objects remains a challenge due to the lack of distinctive appearance features. Most deep learning-based detectors do not exploit the temporal information that is available in video, even though this context is often essential when the signal-to-noise ratio is low. In addition, model development choices, such as the loss function, are typically designed around medium-sized objects. Moreover, most datasets that are acquired for the development of small object detectors are task-specific and lack diversity, and the smallest objects are often not well annotated. In this study, we address the aforementioned challenges and create a deep learning-based pipeline for versatile small object detection. With an in-house dataset consisting of civilian and military objects, we achieve a substantial improvement in YOLOv8 (baseline mAP = 0.465) by leveraging the temporal context in video and data augmentations specifically tailored to small objects (mAP = 0.839). We also show the benefit of having a carefully curated dataset in comparison with public datasets and find that a model trained on a diverse dataset outperforms environment-specific models. Our findings indicate that small objects can be detected accurately in a wide range of environments while leveraging the speed of the YOLO architecture. Full article
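The core temporal trick, replacing the RGB channels with grayscale frames sampled around the current time step, is easy to sketch in NumPy. Frame spacing and clip dimensions below are assumptions loosely based on the ~15-frame offsets mentioned for 30 FPS video.

```python
import numpy as np

def temporal_stack(frames, t, offset=15):
    """Build a 3-channel input for a temporal detector by stacking grayscale
    frames at t-offset, t, and t+offset (clamped to the clip boundaries).
    `frames` has shape (num_frames, H, W)."""
    n = frames.shape[0]
    idx = [max(0, t - offset), t, min(n - 1, t + offset)]
    return np.stack([frames[i] for i in idx], axis=-1)   # (H, W, 3)

# A dummy 60-frame grayscale clip.
clip = np.random.randint(0, 256, size=(60, 480, 640), dtype=np.uint8)
stacked = temporal_stack(clip, t=30)
print(stacked.shape)   # (480, 640, 3)
```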
(This article belongs to the Section Sensing and Imaging)
Show Figures

Figure 1: An illustration of the T-YOLO concept. Instead of using a single video frame as input, multiple frames are stacked from different time steps. In this example, the RGB channels are replaced with three gray frames. However, by slightly altering the input layer of the YOLO model, the number of frames stacked can be extended, enabling Color-T-YOLO and Manyframe-YOLO. This combined 3-channel image is provided to the model, allowing temporal context to be exploited. Assuming a 30-frames-per-second (FPS) source video, around 15 frames are sampled before and after the current frame.
Figure 2: An overview of dataset preprocessing. The images are first blurred with a box blur to prevent aliasing effects. The kernel size of this blur is determined by the downscaling factor, for which the mapping is listed in Table 1. The downscaling factor itself is determined based on the size of the annotated small objects, i.e., the bounding box sizes, and the target resolution for the mosaicking composition. After blurring and downscaling, the images are padded and cropped to fully fit in the balanced mosaicking composition, for which an example is shown in Figure 3b.
Figure 3: A comparison of mosaicking techniques on the Airborne Object Tracking (AOT) dataset [46]. The red annotations indicate the presence of a small object. (a) The built-in mosaicking in YOLOv8 with default settings. (b) Balanced mosaicking, where crops with varying sizes are used.
Figure 4: Our metric considers these examples correct, while typical object detection metrics will flag these as either a false positive or a false negative. Left: a single detection covering multiple annotations does not result in a false negative. Right: multiple detections within a single annotation do not produce a false positive.
Figure 5: Frequency of bounding box areas found in the dataset after the processing shown in Figure 2 is applied.
Figure 6: Image enhancement during annotation allows the discovery of additional tiny objects in the dataset that could otherwise easily be missed. Left: the original image, where the small objects are barely visible and thus very difficult to annotate. Right: the colored overlay based on frame differences, highlighting the small objects and facilitating more accurate annotation.
Figure 7: The precision–recall curves computed for each YOLOv8 variant in the ablation study based on the complete test set. For a description of each experiment, refer to Table 3.
Figure 8: The precision–recall curves computed for each dataloader variation in the ablation study based on the complete test set. For a description of each experiment, refer to Table 3.
Figure 9: mAP scores from the specificity study. Each group of bars represents the results on a subset of the test set, while the color indicates which model was used for evaluation. The Diverse model was trained on the complete training dataset, while the Specific model was trained only on training data from the test domain.
Figure 10: mAP scores from the public dataset study. Each bar represents the result on the Nano-VID test set based on the training set given on the x-axis.
Figure 11: Example detections from the proposed Temporal-YOLOv8 model. These crops have been resized to four times their original size. The value drawn in each bounding box refers to the confidence of the prediction.
23 pages, 4260 KiB  
Article
Application of Machine Learning and Deep Neural Visual Features for Predicting Adult Obesity Prevalence in Missouri
by Butros M. Dahu, Carlos I. Martinez-Villar, Imad Eddine Toubal, Mariam Alshehri, Anes Ouadou, Solaiman Khan, Lincoln R. Sheets and Grant J. Scott
Int. J. Environ. Res. Public Health 2024, 21(11), 1534; https://doi.org/10.3390/ijerph21111534 - 19 Nov 2024
Viewed by 349
Abstract
This research study investigates and predicts the obesity prevalence in Missouri, utilizing deep neural visual features extracted from medium-resolution satellite imagery (Sentinel-2). By applying a deep convolutional neural network (DCNN), the study aims to predict the obesity rate of census tracts based on visual features in the satellite imagery that covers each tract. The study utilizes Sentinel-2 satellite images, processed using the ResNet-50 DCNN, to extract deep neural visual features (DNVF). Obesity prevalence data, sourced from the CDC’s 2022 estimates, is analyzed at the census tract level. The datasets were integrated to apply a machine learning model to predict the obesity rates in 1052 different census tracts in Missouri. The analysis reveals significant associations between DNVF and obesity prevalence. The predictive models show moderate success in estimating and predicting obesity rates in various census tracts within Missouri. The study emphasizes the potential of using satellite imagery and advanced machine learning in public health research. It points to environmental factors as significant determinants of obesity, suggesting the need for targeted health interventions. Employing DNVF to explore and predict obesity rates offers valuable insights for public health strategies and calls for expanded research in diverse geographical contexts. Full article
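A rough sketch of the DNVF-plus-regressor pipeline could look like the following: a ResNet-50 with its classifier removed yields a 2048-dimensional feature per image chip, chip features are aggregated per census tract (averaging is an assumption made here), and a regressor maps tract features to obesity rates. The random forest choice, the toy data, and the aggregation step are illustrative, not the paper's exact configuration.

```python
import numpy as np
import torch
import torch.nn as nn
from torchvision import models
from sklearn.ensemble import RandomForestRegressor

# ResNet-50 with the classifier removed, so each 224x224 chip yields a
# 2048-dimensional feature vector. Use weights="DEFAULT" (ImageNet) in
# practice; weights=None keeps this toy example offline.
backbone = models.resnet50(weights=None)
backbone.fc = nn.Identity()
backbone.eval()

def tract_features(chips):
    """Average the per-chip feature vectors belonging to one census tract
    (averaging is an assumed aggregation step, not taken from the paper)."""
    with torch.no_grad():
        feats = backbone(chips)          # (num_chips, 2048)
    return feats.mean(dim=0).numpy()

# Dummy data: 5 tracts, each covered by 4 random image chips, with made-up rates.
X = np.stack([tract_features(torch.randn(4, 3, 224, 224)) for _ in range(5)])
y = np.array([31.2, 35.8, 28.4, 40.1, 33.0])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(X[:2]))
```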
Show Figures

Figure 1: Flowchart illustrating the estimation of obesity rates from satellite imagery using a combination of deep learning with the ResNet-50 architecture and machine learning regression analysis.
Figure 2: Data processing workflow for Sentinel-2 satellite imagery within Missouri in 2022. (A) Displays the geographic coverage of 33 Sentinel-2 images across Missouri, with county boundaries. The central diagram outlines the normalization process and the cropping of images into 224 × 224 pixel chips. (B) Illustrates the distribution of 82,500 resultant image chips. The red box represents the Mid-Missouri area and Boone County.
Figure 3: Choropleth map displaying the distribution of obesity rate percentages for individuals across Missouri census tracts in 2022. The variations in the color intensity reflect the range of obesity prevalence, with darker red indicating higher obesity rates. The color scale to the right quantifies the obesity rates corresponding to each color shade.
Figure 4: Multiscale analysis of satellite image chips and census tracts in Missouri. (A) exhibits a statewide view with image chips overlaying 1052 census tracts, indicating extensive data coverage. (B) zooms into the Boone County area, detailing the alignment of image chips to local geography. (C) details individual image chip boundaries, illustrating their overlap with seven distinct census tracts (numbered for reference). (D) further narrows down to Census Tract 0608, demonstrating the intersection with 150 specific image chips for granular analysis. The figure highlights the granularity and density of data distribution within the geographic study area.
Figure 5: Scatter plots show the relationship between actual and predicted obesity rates using a GLM machine learning model across 10 distinct cross-validation folds. The red dashed line represents perfect prediction accuracy.
Figure 6: Scatter plot displaying the relationship between actual and predicted obesity rates using Generalized Linear Regression (GLM), illustrating a moderate degree of correlation with an R² value of 0.44 and an adjusted R² of 0.43. The close fit is further evidenced by a moderate Mean Squared Error (MSE) of 18.64. These metrics are provided to assess the accuracy of the model predictions.
Figure 7: Scatter plot of the relationship between actual and predicted obesity rates using random forest, illustrating a moderate correlation with an R² value of 0.48 and an adjusted R² of 0.47. The close fit is further evidenced by a moderate Mean Squared Error (MSE) of 17.35.
Figure 8: The left map shows the spatial distribution of Feature 1112 across Missouri, with red circles highlighting areas of high values (urban areas). The right map depicts actual obesity rates (%) across the state, with blue circles indicating regions with lower obesity prevalence. Notable discrepancies between feature values and obesity rates can be observed in several regions.
Figure 9: The spatial distribution of the 2nd and 3rd most important features across Missouri, (left) Feature 0095 and (right) Feature 1314. Urban areas, particularly around Kansas City for Feature 0095 and St. Louis for Feature 1314, show significant concentrations of these features.
Figure 10: Spatial distribution of the 4th and 5th most important features across Missouri, (left) Feature 0767 and (right) Feature 0239.
Figure 11: Geospatial analysis of obesity rates and prediction accuracy in Missouri. (A) displays the actual obesity rates, while (B) shows the predicted rates, both using a color gradient to represent percentages. (C) highlights areas with significant predictive errors by filtering out RMSE values below 4, focusing on regions where the model’s accuracy is lower. (D) refines this analysis by presenting a broader error distribution, including RMSE values of 2.5 and above, using the same color gradient for consistency. (E) Curve illustrating the signed error distribution of predicted obesity rates across census tracts. Negative signed errors indicate underpredictions (shown in red) and positive errors indicate overpredictions (shown in green). The census tracts are ranked by the magnitude of error, highlighting the asymmetry in predictive accuracy and potential systematic bias in the model.
Figure A1: Bar chart displaying the top 10 visual features ranked by their importance as used in the obesity rate prediction model. The x-axis represents the feature numbers, which are specific identifiers for each visual feature. The y-axis indicates the importance of each feature in percentage terms.
21 pages, 12271 KiB  
Article
Detection of Marine Oil Spill from PlanetScope Images Using CNN and Transformer Models
by Jonggu Kang, Chansu Yang, Jonghyuk Yi and Yangwon Lee
J. Mar. Sci. Eng. 2024, 12(11), 2095; https://doi.org/10.3390/jmse12112095 - 19 Nov 2024
Viewed by 391
Abstract
The contamination of marine ecosystems by oil spills poses a significant threat to the marine environment, necessitating the prompt and effective implementation of measures to mitigate the associated damage. Satellites offer a spatial and temporal advantage over aircraft and unmanned aerial vehicles (UAVs) in oil spill detection due to their wide-area monitoring capabilities. While oil spill detection has traditionally relied on synthetic aperture radar (SAR) images, the combined use of optical satellite sensors alongside SAR can significantly enhance monitoring capabilities, providing improved spatial and temporal coverage. The advent of deep learning methodologies, particularly convolutional neural networks (CNNs) and Transformer models, has generated considerable interest in their potential for oil spill detection. In this study, we conducted a comprehensive and objective comparison to evaluate the suitability of CNN and Transformer models for marine oil spill detection. High-resolution optical satellite images were used to optimize DeepLabV3+, a widely utilized CNN model; Swin-UPerNet, a representative Transformer model; and Mask2Former, which employs a Transformer-based architecture for both encoding and decoding. The results of cross-validation demonstrate a mean Intersection over Union (mIoU) of 0.740, 0.840, and 0.804 for the three models, respectively, indicating their potential for detecting oil spills in the ocean. Additionally, we performed a histogram analysis on the predicted oil spill pixels, which allowed us to classify the types of oil. These findings highlight the considerable promise of the Swin Transformer models for oil spill detection in the context of future marine disaster monitoring. Full article
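For reference, the mIoU metric reported above can be computed for a binary oil/background mask as sketched below. Averaging the oil-class and background-class IoU is one common convention and is assumed here; the toy masks are invented.

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union for one binary mask."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

def mean_iou(pred, truth):
    """mIoU for binary segmentation: average of the oil-class IoU and the
    background-class IoU (one common convention; papers differ in details)."""
    return 0.5 * (iou(pred == 1, truth == 1) + iou(pred == 0, truth == 0))

# Toy 4x4 prediction vs. ground truth (1 = oil, 0 = background).
truth = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 0], [0, 0, 0, 0]])
pred  = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]])
print(round(mean_iou(pred, truth), 3))
```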
(This article belongs to the Special Issue Remote Sensing Applications in Marine Environmental Monitoring)
Show Figures

Figure 1: Examples of image processing steps: (a) original satellite images, (b) images after gamma correction and histogram adjustment, and (c) labeled images.
Figure 2: Flowchart of this study, illustrating the processes of labeling, modeling, optimization, and evaluation using the DeepLabV3+, Swin-UPerNet, and Mask2Former models [23,24,25].
Figure 3: Concept of the 5-fold cross-validation in this study.
Figure 4: Examples of image data augmentation using the Albumentations library. The example images include random 90-degree rotation, horizontal flip, vertical flip, optical distortion, grid distortion, RGB shift, and random brightness/contrast adjustment.
Figure 5: Randomly selected examples from fold 1, including PlanetScope RGB images, segmentation labels, and predictions from DeepLabV3+ (DL), Swin-UPerNet (Swin), and Mask2Former (M2F).
Figure 6: Randomly selected examples from fold 2, including PlanetScope RGB images, segmentation labels, and predictions from DeepLabV3+ (DL), Swin-UPerNet (Swin), and Mask2Former (M2F).
Figure 7: Randomly selected examples from fold 3, including PlanetScope RGB images, segmentation labels, and predictions from DeepLabV3+ (DL), Swin-UPerNet (Swin), and Mask2Former (M2F).
Figure 8: Randomly selected examples from fold 4, including PlanetScope RGB images, segmentation labels, and predictions from DeepLabV3+ (DL), Swin-UPerNet (Swin), and Mask2Former (M2F).
Figure 9: Randomly selected examples from fold 5, including PlanetScope RGB images, segmentation labels, and predictions from DeepLabV3+ (DL), Swin-UPerNet (Swin), and Mask2Former (M2F).
Figure 10: Thick oil layers with a dark black tone: histogram distribution graph and box plot of oil spill pixels extracted from the labels, DeepLabV3+, Swin-UPerNet, and Mask2Former. The x-axis values represent the digital numbers (DNs) from PlanetScope images. (a) Oil mask, (b) histogram, and (c) box plot.
Figure 11: Thin oil layers with a bright silver tone: histogram distribution graph and box plot of oil spill pixels extracted from the labels, DeepLabV3+, Swin-UPerNet, and Mask2Former. The x-axis values represent the digital numbers (DNs) from PlanetScope images. (a) Oil mask, (b) histogram, and (c) box plot.
Figure 12: Thin oil layers with a bright rainbow tone: histogram distribution graph and box plot of oil spill pixels extracted from the labels, DeepLabV3+, Swin-UPerNet, and Mask2Former. The x-axis values represent the digital numbers (DNs) from PlanetScope images. (a) Oil mask, (b) histogram, and (c) box plot.
22 pages, 4472 KiB  
Article
Enhancing Data Privacy Protection and Feature Extraction in Secure Computing Using a Hash Tree and Skip Attention Mechanism
by Zizhe Zhou, Yaqi Wang, Lin Cong, Yujing Song, Tianyue Li, Meishu Li, Keyi Xu and Chunli Lv
Appl. Sci. 2024, 14(22), 10687; https://doi.org/10.3390/app142210687 - 19 Nov 2024
Viewed by 345
Abstract
This paper addresses the critical challenge of secure computing in the context of deep learning, focusing on the pressing need for effective data privacy protection during transmission and storage, particularly in sensitive fields such as finance and healthcare. To tackle this issue, we propose a novel deep learning model that integrates a hash tree structure with a skip attention mechanism. The hash tree is employed to ensure data integrity and security, enabling the rapid verification of data changes, while the skip attention mechanism enhances computational efficiency by allowing the model to selectively focus on important features, thus minimizing unnecessary processing. The primary objective of our research is to develop a secure computing model that not only safeguards data privacy but also optimizes feature extraction capabilities. Our experimental results on the CIFAR-10 dataset demonstrate significant improvements over traditional models, achieving a precision of 0.94, a recall of 0.89, an accuracy of 0.92, and an F1-score of 0.91, notably outperforming standard self-attention and CBAM. Additionally, the visualization of results confirms that our approach effectively balances efficient feature extraction with robust data privacy protection. This research contributes a new framework for secure computing, addressing both the security and efficiency concerns prevalent in current methodologies. Full article
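The integrity property attributed to the hash tree, namely that any change to stored data propagates to the root hash and is therefore detectable, can be sketched with a Merkle-style tree over data chunks using Python's hashlib. This is a generic illustration, not the paper's specific hash tree or its integration with the attention model.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root hash of a Merkle-style hash tree over a list of byte strings.
    Any change to a leaf propagates upward and changes the root, which is the
    integrity property the hash tree component relies on."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last node on odd levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

chunks = [b"batch-0", b"batch-1", b"batch-2", b"batch-3"]
root = merkle_root(chunks)
tampered = merkle_root([b"batch-0", b"batch-X", b"batch-2", b"batch-3"])
print(root.hex() != tampered.hex())             # True: the modification is detected
```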
(This article belongs to the Special Issue Cloud Computing: Privacy Protection and Data Security)
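The abstract above uses a hash tree so that tampering with stored or transmitted data can be detected quickly before it reaches the model. A minimal, self-contained sketch of that integrity idea, building a Merkle-style root over data blocks with Python's hashlib (the block contents and helper names are illustrative, not the paper's implementation):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list) -> bytes:
    """Fold leaf hashes pairwise until a single root hash remains."""
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Toy example: any change to a block changes the root, so a stored root
# lets the receiver verify integrity before feeding data to the model.
blocks = [b"batch-0", b"batch-1", b"batch-2", b"batch-3"]
trusted_root = merkle_root(blocks)
blocks[2] = b"tampered"
assert merkle_root(blocks) != trusted_root
```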
Show Figures
Figure 1. Visualization of hash algorithms.
Figure 2. Visualization of attention mechanisms.
Figure 3. Image dataset augmentation methods: (A) CutOut; (B) CutMix; (C) Mosaic.
Figure 4. Hash-tree-based transformer model.
Figure 5. Hash tree node-insertion structure diagram: after a new node is inserted, the hash values of the affected subtrees are updated sequentially, ultimately reflected in the root node's hash value. Any change to a node is therefore mirrored throughout the hash tree, guaranteeing data security and verifiability.
Figure 6. Hash tree node-deletion structure diagram: when a node is removed, the hash values of the associated subtrees are likewise updated up to the root node, keeping the hash values consistent across the entire tree and preserving data integrity and verifiability after the deletion. A minimal sketch of this root-ward hash update is given after this figure list.
Figure 7. Skip attention structure diagram: the proposed skip attention mechanism introduces skip connections to optimize information flow and computational efficiency. The structure combines multi-head attention and feedforward network modules, and the skip connections enable direct information transfer between feature layers, reducing redundant computation while still extracting and using key information. A sketch of such a block also follows this figure list.
Figure 8. ROC curves of different models.
Figure 9. ROC curves of different models.
Figure 10. Encryption effect visualization.
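As referenced in the Figure 5 and Figure 6 captions, inserting or deleting a node only requires rehashing the path from that node to the root. A minimal sketch of that update over a binary hash tree stored in an array (the array layout, class name, and use of an empty hash for deleted slots are illustrative choices, not the authors' implementation):

```python
import hashlib
from typing import Optional

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class ArrayHashTree:
    """Complete binary hash tree over a fixed number of leaf slots.
    Leaves sit at indices [n, 2n); internal node i has children 2i and 2i+1."""

    def __init__(self, num_leaves: int):
        self.n = num_leaves
        self.nodes = [h(b"")] * (2 * num_leaves)
        for i in range(num_leaves - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])

    def set_leaf(self, leaf: int, data: Optional[bytes]) -> None:
        """Insert (data) or delete (None) a leaf, then rehash only its ancestors."""
        i = self.n + leaf
        self.nodes[i] = h(data) if data is not None else h(b"")
        i //= 2
        while i >= 1:                       # only the root-ward path is rehashed
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
            i //= 2

    @property
    def root(self) -> bytes:
        return self.nodes[1]

tree = ArrayHashTree(num_leaves=4)
tree.set_leaf(2, b"feature block")
root_before = tree.root
tree.set_leaf(2, None)                      # deletion changes the root again
assert tree.root != root_before
```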
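The Figure 7 caption describes attention and feedforward modules linked by skip connections that let features bypass redundant computation. The exact module is not specified in this listing, so the following PyTorch block is only an assumed illustration: a standard multi-head attention plus feedforward pair with an extra, learned-gate skip connection from the block input to its output.

```python
import torch
import torch.nn as nn

class SkipAttentionBlock(nn.Module):
    """Illustrative only: attention + FFN with an extra gated long skip."""

    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # Learned gate deciding how much of the block input skips straight through.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x                                   # long skip from the block input
        h = self.norm1(x)
        h = x + self.attn(h, h, h, need_weights=False)[0]
        h = h + self.ffn(self.norm2(h))
        g = torch.sigmoid(self.gate)
        return g * residual + (1 - g) * h              # gated skip connection

tokens = torch.randn(2, 196, 256)                      # (batch, sequence, dim)
print(SkipAttentionBlock()(tokens).shape)              # torch.Size([2, 196, 256])
```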
19 pages, 4245 KiB  
Article
Lightweight UAV Small Target Detection and Perception Based on Improved YOLOv8-E
by Yongjuan Zhao, Lijin Wang, Guannan Lei, Chaozhe Guo and Qiang Ma
Drones 2024, 8(11), 681; https://doi.org/10.3390/drones8110681 - 19 Nov 2024
Viewed by 363
Abstract
Traditional unmanned aerial vehicle (UAV) detection methods struggle with multi-scale variations during flight, complex backgrounds, and low accuracy, whereas existing deep learning detection methods offer high accuracy but depend heavily on equipment, making it difficult to detect small UAV targets efficiently. To address these challenges, this paper proposes an improved lightweight high-precision model, YOLOv8-E (Enhanced YOLOv8), for the fast and accurate detection and identification of small UAVs in complex environments. First, a Sobel filter is introduced to enhance the C2f module, forming the C2f-ESCFFM (Edge-Sensitive Cross-Stage Feature Fusion Module), which achieves higher computational efficiency and feature representation capacity while preserving detection accuracy as much as possible by fusing a SobelConv branch for edge extraction with a convolution branch for spatial information. Second, the neck network is based on the HSFPN (High-level Screening-feature Pyramid Network) architecture, and the CAA (Context Anchor Attention) mechanism is introduced to enhance the semantic parsing of low-level features, forming the new CAHS-FPN (Context-Augmented Hierarchical Scale Feature Pyramid Network) and enabling the fusion of deep and shallow features. This improves the feature representation capability of the model, allowing it to detect targets of different sizes efficiently. Finally, the optimized detail-enhanced convolution (DEConv) technique is introduced into the head network, forming the LSCOD (Lightweight Shared Convolutional Object Detector Head) module, which enhances the generalization ability of the model by integrating a priori information and adopting a shared-convolution strategy. This ensures that the model improves its localization and classification performance without increasing parameters or computational costs, thus effectively improving the detection of small UAV targets. The experimental results show that, compared with the baseline model, the YOLOv8-E model achieved an mAP@0.5 (mean average precision at IoU = 0.5) improvement of 6.3%, reaching 98.4%, while the model parameter scale was reduced by more than 50%. Overall, YOLOv8-E significantly reduces the demand for computational resources while ensuring high-precision detection. Full article
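The abstract describes C2f-ESCFFM as fusing a SobelConv branch for edges with an ordinary convolution branch for spatial features. The module's exact definition is not given in this listing, so the following PyTorch snippet is only an assumed sketch of that fusion idea; the class name, depthwise application of fixed Sobel kernels, and the 1x1 fusion convolution are my own choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SobelConvFusion(nn.Module):
    """Sketch of an edge-aware fusion block: fixed Sobel depthwise conv + learned conv."""

    def __init__(self, channels: int):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        gy = gx.t()
        # One Gx and one Gy kernel per channel, applied depthwise (not learned).
        kernels = torch.stack([gx, gy]).repeat(channels, 1, 1).unsqueeze(1)
        self.register_buffer("sobel", kernels)          # shape: (2*channels, 1, 3, 3)
        self.spatial = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        edges = F.conv2d(x, self.sobel, padding=1, groups=x.shape[1])
        spatial = self.spatial(x)
        return self.fuse(torch.cat([edges, spatial], dim=1))

feat = torch.randn(1, 64, 80, 80)
print(SobelConvFusion(64)(feat).shape)                   # torch.Size([1, 64, 80, 80])
```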
Show Figures
Figure 1. YOLOv8-E network architecture. ⊗ denotes that the weight information generated by CAA attention is multiplied with the feature map of the corresponding scale to produce a filtered feature map.
Figure 2. C2f-ESCFFM structure.
Figure 3. CAHS-FPN structure.
Figure 4. LSCOD structure.
Figure 5. DEConv structure. VC, ADC, CDC, VDC, and HDC denote the five convolution layers deployed in parallel within DEConv: standard convolution, angle differential convolution, center differential convolution, vertical differential convolution, and horizontal differential convolution, respectively. A simplified sketch of this parallel layout is given after this figure list.
Figure 6. Partial dataset. The dataset contains images of UAVs of different sizes against various backgrounds (sky, buildings, trees, occlusions, strong lighting, etc.).
Figure 7. Experimental results: (a) ablation experiment results; (b) parallel experiment results.
Figure 8. Heatmap of error type contribution. The figure shows the error contribution of different algorithms analysed with the TIDE toolbox, where a-f are YOLOv8, YOLOv8 + C2f-ESCFFM, YOLOv8 + CAHS-FPN, YOLOv8 + LSCOD, YOLOv8 + C2f-ESCFFM + CAHS-FPN, and YOLOv8 + C2f-ESCFFM + CAHS-FPN + LSCOD (ours). ECls: correctly localized but incorrectly categorized. ELoc: correctly categorized but incorrectly localized. EBoth: incorrectly categorized and incorrectly localized. EDupe: duplicate detection error. EBkg: background detected as foreground. EMiss: missed ground-truth (GT) error.
Figure 9. Gain plots. The figure shows the gains of the YOLOv8 model and our YOLOv8-E model on six datasets, where A-F are the DUT-Anti-UAV, TIB-Net, Drone Dataset, USC Drone Dataset, Drone Dataset (UAV), and Drone-vs-bird datasets, respectively.
Figure 10. Visualized heatmaps of different networks: (a) original image; (b) YOLOv8; (c) YOLOv8 + C2f-ESCFFM; (d) YOLOv8 + C2f-ESCFFM + CAHS-FPN; (e) YOLOv8 + C2f-ESCFFM + CAHS-FPN + LSCOD.
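The Figure 5 caption lists five parallel convolution branches inside DEConv. The snippet below is only a simplified PyTorch sketch of that layout, assuming the branch outputs are summed before normalization and activation; the difference-based kernel constraints of the real ADC/CDC/VDC/HDC branches are omitted here.

```python
import torch
import torch.nn as nn

class DEConvSketch(nn.Module):
    """Simplified stand-in for detail-enhanced convolution (DEConv):
    five parallel 3x3 branches (VC, ADC, CDC, VDC, HDC in the paper) summed.
    The difference-based kernel constraints of the real branches are omitted."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False)
            for _ in range(5)
        )
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum the five parallel branch outputs, then normalize and activate.
        y = sum(branch(x) for branch in self.branches)
        return self.act(self.bn(y))

x = torch.randn(1, 64, 40, 40)
print(DEConvSketch(64, 64)(x).shape)   # torch.Size([1, 64, 40, 40])
```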