Search Results (431)

Search Parameters:
Keywords = fine-grained classification

20 pages, 2207 KiB  
Article
A Novel TLS-Based Fingerprinting Approach That Combines Feature Expansion and Similarity Mapping
by Amanda Thomson, Leandros Maglaras and Naghmeh Moradpoor
Future Internet 2025, 17(3), 120; https://doi.org/10.3390/fi17030120 - 7 Mar 2025
Abstract
Malicious domains are part of the landscape of the internet but are becoming more prevalent and more dangerous both to companies and to individuals. They can be hosted on various technologies and serve an array of content, including malware, command and control and complex phishing sites that are designed to deceive and expose. Tracking, blocking and detecting such domains is difficult, and it often involves complex allowlist or denylist management or SIEM integration with open-source TLS fingerprinting techniques. Many fingerprinting techniques, such as JARM and JA3, are used by threat hunters to determine domain classification, but with the increase in TLS similarity, particularly across CDNs, they are becoming less useful. The aim of this paper was to adapt and evolve open-source TLS fingerprinting techniques with increased features to enhance granularity and to produce a similarity-mapping system that would enable the tracking and detection of previously unknown malicious domains. This was achieved by enriching TLS fingerprints with HTTP header data and producing a fine-grained similarity visualisation that represented the high-dimensional data using MinHash and Locality-Sensitive Hashing. Influence was taken from the chemistry domain, where the problem of high-dimensional similarity in chemical fingerprints is often encountered. An enriched fingerprint was produced and then visualised across three separate datasets. The results were analysed and evaluated, with 67 previously unknown malicious domains being detected based solely on their similarity to known malicious domains. The resulting similarity-mapping technique shows definite promise for the early detection of malware and phishing domains.
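The abstract above rests on MinHash and Locality-Sensitive Hashing for comparing high-dimensional binary fingerprints. A minimal, stdlib-only sketch of the MinHash half of that idea — the feature names, permutation count and seed below are illustrative, not taken from the paper:

```python
import hashlib
import random

random.seed(7)

NUM_PERM = 64  # number of hash permutations per MinHash signature
PRIME = (1 << 61) - 1
PERMS = [(random.randrange(1, PRIME), random.randrange(0, PRIME))
         for _ in range(NUM_PERM)]

def _h(token: str) -> int:
    # Stable 64-bit hash of a feature name.
    return int.from_bytes(hashlib.sha1(token.encode()).digest()[:8], "big")

def minhash(features: set) -> list:
    """MinHash signature of a set of binary feature names."""
    hashes = [_h(t) for t in features]
    return [min((a * h + b) % PRIME for h in hashes) for a, b in PERMS]

def estimated_jaccard(sig1, sig2) -> float:
    # Fraction of matching signature slots estimates Jaccard similarity.
    return sum(x == y for x, y in zip(sig1, sig2)) / len(sig1)

# Two hypothetical enriched fingerprints (TLS extensions + HTTP header tokens).
fp_a = {"tls13", "alpn:h2", "grease", "hdr:server=cloudflare", "hdr:hsts"}
fp_b = {"tls13", "alpn:h2", "grease", "hdr:server=cloudflare", "hdr:alt-svc"}

sim = estimated_jaccard(minhash(fp_a), minhash(fp_b))
```

An LSH forest such as the one visualised in the paper would then band these signatures to find near neighbours without pairwise comparison.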
Figures:
Figure 1: A flow diagram of the end-to-end fingerprint processing pipeline.
Figure 2: The raw fingerprint produced from the active scan.
Figure 3: A screenshot of the HEAD request being made, as seen within Wireshark. The HTTP protocol is highlighted in green.
Figure 4: A screenshot of a typical set of HTTP headers received in response to a HEAD request during the header enrichment process. The HEAD request is shown in red, the HTTP response in blue.
Figure 5: Graph displaying TLS features enriched with HTTP header data. The resulting feature matrix M ∈ {0,1}^(n×d) has dimensions n = 16,254 (fingerprints) and d = 2124 (features), representing the complete binary feature space of the TLS and HTTP characteristics. Known good domains are coloured green; known bad domains, red; unknown domains, orange.
Figure 6: The Mixed Host dataset displays a diverse range of distance metrics and a broader distribution of similarity scores across the sample space. Each line represents a different domain, with a range of colours to aid differentiation.
Figure 7: The Cloudflare CDN dataset displays less diversity in similarity: all k-nearest neighbours maintain distances below 0.30, showing closer similarity between domains. Each line represents a different domain, with a range of colours to aid differentiation.
Figure 8: A typical domain with strong indicators of malicious intent. The domain was sourced from the unknown category and registered within 30 days of the scan. At the time of evaluation, 12 security vendors had flagged the domain as malicious, including Sophos, Fortinet, ESET and Bitdefender.
Figure 9: An example of a domain on the threshold for further investigation. Three vendors (BitDefender, CRDF and G-Data) confirmed the domain as malicious, with a further suspicious flag from vendor Trustwave. The left-hand side shows the heuristic scan performed by URLQuery, indicating that the ClearFake malicious JavaScript library was detected.
Figure 10: The LSH forest of dataset A visualised using Faerun. Known bad domains are coloured red, known good blue and unknown domains orange.
Figure 11: The LSH forest of dataset B (Cloudflare CDN domains) visualised using Faerun. The TLS fingerprints have been enriched with HTTP header data. Known bad domains are coloured red, known good blue and unknown domains orange.
Figure 12: The LSH forest of dataset B (Cloudflare CDN domains) visualised using Faerun. The TLS fingerprints are not enriched and contain only TLS features. Known bad domains are coloured red, known good blue and unknown domains orange.
Figure 13: The LSH visualisation of dataset C (known malicious domains). Clear similarity patterns can be seen forming by capability: GoPhish domains are shown in yellow, Cert PL orange, Metasploit pink, Tactical RMM purple and Burp Collaborator blue.
25 pages, 152810 KiB  
Article
QEDetr: DETR with Query Enhancement for Fine-Grained Object Detection
by Chenguang Dong, Shan Jiang, Haijiang Sun, Jiang Li, Zhenglei Yu, Jiasong Wang and Jiacheng Wang
Remote Sens. 2025, 17(5), 893; https://doi.org/10.3390/rs17050893 - 3 Mar 2025
Viewed by 248
Abstract
Fine-grained object detection aims to accurately localize the object bounding box while identifying the specific model of the object, which is more challenging than conventional remote sensing object detection. A transformer-based object detector (DETR) can capture long-range inter-feature dependencies by using attention, which makes it suitable for fine-grained object detection tasks. However, most existing DETR-like object detectors are not specifically optimized for remote sensing detection tasks. Therefore, we propose an oriented fine-grained object detection method based on transformers. First, we combine denoising training and angle coding to propose a baseline DETR-like object detector for oriented object detection. Next, we propose a new attention mechanism for extracting finer-grained features by constraining the angle of sampling points during the attention process, ensuring that the sampling points are more evenly distributed across the object features. Then, we propose a multiscale fusion method based on bilinear pooling to obtain an enhanced query and initialize a more accurate object bounding box. Finally, we combine the localization accuracy of each query with its classification accuracy and propose a new classification loss to further enhance the high-quality queries. Evaluation results on the FAIR1M dataset show that our method achieves an average accuracy of 48.5856 mAP and a highest accuracy of 49.7352 mAP in object detection, outperforming other methods.
(This article belongs to the Section AI Remote Sensing)
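The angle-constrained sampling described in the QEDetr abstract can be illustrated with a tiny geometric sketch: rotating learned sampling offsets by the reference box angle so the sampling points follow the oriented object. Everything below (offsets, box centre, angle) is hypothetical, not the paper's actual parameters:

```python
import math

def rotate_offsets(offsets, angle_rad):
    """Rotate 2-D sampling offsets by the reference box angle so sampling
    points align with the oriented object (the core idea behind
    angle-constrained deformable attention; simplified sketch)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * dx - s * dy, s * dx + c * dy) for dx, dy in offsets]

def sampling_points(ref_center, offsets, angle_rad):
    # Absolute sampling locations = box centre + rotated offsets.
    cx, cy = ref_center
    return [(cx + dx, cy + dy) for dx, dy in rotate_offsets(offsets, angle_rad)]

# Hypothetical: 4 learned offsets around a box centred at (10, 10), rotated 90°.
pts = sampling_points((10.0, 10.0), [(1, 0), (0, 1), (-1, 0), (0, -1)],
                      math.pi / 2)
```

In the full model these rotated points would index into multiscale feature maps; here they are only coordinates.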
Figures:
Figure 1: Overall structure of QEDetr: (a) extraction of input image features and input to the encoder, using the same process as Deformable DETR; (b) multiscale fusion of feature maps and screening of top-k queries; (c) the RADA module used in the decoder generates the categories layer by layer, iteratively refining the reference box and angle; (d) QEDetr uses the regression and IoU losses for the reference box and angle, along with our proposed AIL for the categorization loss.
Figure 2: Decoder of QEDetr, showing the whole decoder process and its iterative layer-by-layer refinement of the reference box and angle coding.
Figure 3: Demonstration of the alignment process for each type of attention: (a) deformable attention, which does not include sample-point rotations; (b) RDA, which includes sample-point rotations but does not take into account the shape distribution of the sample points themselves; (c) RADA, which we present here.
Figure 4: In multiscale bilinear fusion, a shared MLP is used to compute the foreground scores for each scale's feature map; the high-level feature map and foreground scores are then upsampled and fused with the low-level information, and the output features and foregrounds from each layer are finally spliced into the outputs.
Figure 5: Number of instances per category in the FAIR1M2.0 dataset after multiscale cropping.
Figure 6: Loss curves of QEDetr.
Figure 7: Results on the FAIR1M dataset.
Figure 8: Comparison visualizing the results of different object detection algorithms on FAIR1M: (a) baseline (DN+PSC), (b) ARSDetr, and (c) QEDetr.
Figure 9: Heatmap visualization of the backbone.
Figure 10: Sample-point visualization results: (a) deformable attention and (b) RADA. The colors of the sampled points represent their weights in the attention process.
19 pages, 3572 KiB  
Article
MOSSNet: A Lightweight Dual-Branch Multiscale Attention Neural Network for Bryophyte Identification
by Haixia Luo, Xiangfen Zhang, Feiniu Yuan, Jing Yu, Hao Ding, Haoyu Xu and Shitao Hong
Symmetry 2025, 17(3), 347; https://doi.org/10.3390/sym17030347 - 25 Feb 2025
Viewed by 150
Abstract
Bryophytes, including liverworts, mosses, and hornworts, play an irreplaceable role in soil moisture retention, erosion prevention, and pollution monitoring. The precise identification of bryophyte species enhances our understanding and utilization of their ecological functions. However, their complex morphology and structural symmetry make identification difficult. Although deep learning improves classification efficiency, challenges remain due to limited datasets and the inadequate adaptation of existing methods to multi-scale features, causing poor performance in fine-grained multi-classification. Thus, we propose MOSSNet, a lightweight neural network for bryophyte feature detection. It has a four-stage architecture that efficiently extracts multi-scale features using a modular design with symmetry consideration in feature representation. At the input stage, the Convolutional Patch Embedding (CPE) module captures representative features through a two-layer convolutional structure. In each subsequent stage, Dual-Branch Multi-scale (DBMS) modules are employed, with one branch utilizing convolutional operations and the other utilizing the Dilated Convolution Enhanced Attention (DCEA) module for multi-scale feature fusion. The DBMS module extracts fine-grained and coarse-grained features by a weighted fusion of the outputs from the two branches. Evaluating MOSSNet on the self-constructed dataset BryophyteFine reveals a Top-1 accuracy of 99.02% in classifying 26 bryophyte species, 7.13% higher than the best existing model, while using only 1.58 M parameters and 0.07 GFLOPs.
(This article belongs to the Section Computer)
Figures:
Figure 1: Demonstration of interclass similarity and intraclass variability.
Figure 2: The overall MOSSNet framework.
Figure 3: DBMS module detailed structure.
Figure 4: Image types in the BryophyteFine dataset.
Figure 5: Heat map of the model classification confusion matrix.
Figure 6: Distribution of model parameters and Mean Average Precision.
16 pages, 3967 KiB  
Article
Potato Disease and Pest Question Classification Based on Prompt Engineering and Gated Convolution
by Wentao Tang and Zelin Hu
Agriculture 2025, 15(5), 493; https://doi.org/10.3390/agriculture15050493 - 25 Feb 2025
Viewed by 154
Abstract
Currently, there is no publicly available dataset for the classification of potato pest and disease-related queries. Moreover, traditional query classification models generally adopt a single maximum-pooling strategy when performing down-sampling operations. This mechanism only extracts the extreme value responses within the local receptive field, which leads to the degradation of fine-grained feature representation and significantly amplifies text noise. To address these issues, a dataset construction method based on prompt engineering is proposed, along with a question classification method utilizing a gated fusion–convolutional neural network (GF-CNN). By interacting with large language models, prompt words are used to generate potato disease and pest question templates and efficiently construct the Potato Pest and Disease Question Classification Dataset (PDPQCD) by batch importing named entities. The GF-CNN combines outputs from convolutional kernels of varying sizes, and after processing with max-pooling and average-pooling, a gating mechanism is employed to regulate the flow of information, thereby optimizing the text feature extraction process. Experiments using GF-CNN on the PDPQCD, Subj, and THUCNews datasets show F1 scores of 100.00%, 96.70%, and 93.55%, respectively, outperforming other models. The prompt engineering-based method provides a new paradigm for constructing question classification datasets, and the GF-CNN can also be extended for application in other domains.
(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)
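The gating mechanism described in the GF-CNN abstract — regulating the mix of max-pooling and average-pooling outputs — can be sketched with a scalar sigmoid gate. The gate parameters below are hypothetical stand-ins for learned weights:

```python
import math

def gated_pool_fusion(features, w=0.5, b=0.0):
    """Fuse max-pooling and average-pooling of a 1-D feature sequence with a
    sigmoid gate: out = g * max + (1 - g) * avg. Sketch of the gated-fusion
    idea; (w, b) stand in for learned gate parameters."""
    mx = max(features)
    avg = sum(features) / len(features)
    g = 1.0 / (1.0 + math.exp(-(w * (mx - avg) + b)))  # gate in (0, 1)
    return g * mx + (1.0 - g) * avg

out = gated_pool_fusion([0.1, 0.9, 0.2, 0.4])
```

Because the gate is a convex weight, the fused value always lies between the average and the maximum, preserving fine-grained responses that pure max-pooling would discard.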
Figures:
Figure 1: Prompt and model replies.
Figure 2: Entity import algorithm logic.
Figure 3: Data distribution.
Figure 4: Model structure.
Figure 5: GF-CNN model structure.
Figure 6: Gating fusion unit.
Figure 7: Confusion matrix.
Figure 8: Performance comparison of different feature fusion methods.
21 pages, 4398 KiB  
Article
Local Diversity-Guided Weakly Supervised Fine-Grained Image Classification Method
by Yuebo Meng, Xianglong Luo, Hua Zhan, Bo Wang, Shilong Su and Guanghui Liu
Appl. Sci. 2025, 15(5), 2437; https://doi.org/10.3390/app15052437 - 25 Feb 2025
Viewed by 374
Abstract
For fine-grained recognition, capturing distinguishable features and effectively utilizing local information play a key role, since the objects of recognition exhibit subtle differences in different subcategories. Finding subtle differences between subclasses is not straightforward. To address this problem, we propose a weakly supervised fine-grained classification network model with Local Diversity Guidance (LDGNet). We designed a Multi-Attention Semantic Fusion Module (MASF) to build multi-layer attention maps and channel–spatial interaction, which can effectively enhance the semantic representation of the attention maps. We also introduce a random selection strategy (RSS) that forces the network to learn more comprehensive and detailed information and more local features from the attention map by designing three feature extraction operations. Finally, both the attention map obtained by RSS and the feature map are employed for prediction through a fully connected layer. At the same time, a dataset of ancient towers is established, and our method is applied to ancient building recognition for practical applications of fine-grained image classification tasks in natural scenes. Extensive experiments conducted on four fine-grained datasets and explainable visualization demonstrate that the LDGNet can effectively enhance discriminative region localization and detailed feature acquisition for fine-grained objects, achieving competitive performance over other state-of-the-art algorithms.
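The random selection strategy described above — forcing the network to learn more comprehensive features through three feature extraction operations — can be sketched framework-free as a per-part random choice between suppressing the most significant activation, enhancing all activations, and leaving them unchanged. The operation names follow the paper's figure captions; the numeric values and the enhancement factor are illustrative:

```python
import random

def random_selection(parts, seed=None):
    """Apply one of three operations to each local attention part:
    'suppress' zeroes the peak activation (forcing attention elsewhere),
    'enhance' scales all activations up, 'keep' is a no-op.
    Simplified sketch of the RSS idea."""
    rng = random.Random(seed)
    out = []
    for p in parts:
        op = rng.choice(["suppress", "enhance", "keep"])
        if op == "suppress":
            peak = max(range(len(p)), key=p.__getitem__)
            q = list(p)
            q[peak] = 0.0  # most significant suppression
            out.append(q)
        elif op == "enhance":
            out.append([v * 1.5 for v in p])  # feature enhancement
        else:
            out.append(list(p))  # no operation
    return out

parts = [[0.2, 0.9, 0.1], [0.5, 0.4, 0.6]]
mixed = random_selection(parts, seed=3)
```

During training the randomness acts as a regularizer; at inference the strategy would simply be disabled.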
Figures:
Figure 1: Different views of the three towers. (a–c) Hangzhou Leifeng Tower; (d–f) ancient towers of Fogong Temple of Shanxi Yingxian.
Figure 2: Local Diversity-Guided Classification Network (LDGNet) structure diagram. MASF: Multi-Attention Semantic Fusion Module. RSS: random selection strategy. BP: bilinear pooling. Given an image, image features are first extracted through the designed ConvNeXt backbone. Then, feature maps from the third and fourth layers are extracted, and the MASF and RSS modules are applied to obtain reorganized feature maps, which are fused with the original features. Finally, the high- and low-level features are concatenated and sent to the classification head, which outputs the specific category.
Figure 3: Multi-Attention Semantic Fusion Module structure diagram. First, the module takes the feature maps from the third and fourth layers of the backbone as inputs. Next, channel interaction and spatial interaction are applied separately to the two feature maps; both include additional weight and cross-reorganization operations. Finally, the features output by the channel and spatial interactions are summed correspondingly to obtain the final two feature outputs, which are sent to the next stage.
Figure 4: Random selection strategy structure. Given the features obtained from the previous stage as inputs, the features are split into refined data and passed through branches with different weights: "most significant suppression", "feature enhancement", and "no operation". Finally, the outputs from these branches are concatenated and reorganized into new features for output.
Figure 5: The ACC-epochs curves of LDGNet during (a) the training phase and (b) the testing phase.
Figure 6: The real-world environment of the ancient towers dataset.
Figure 7: The structure of the models: (a) original backbone, (b) backbone + MASF, (c) backbone + RSS, (d) backbone + LDGNet.
Figure 8: Visualization of heat maps generated from different models: (a) original input image, (b) ConvNeXt backbone, (c) backbone + MASF, (d) backbone + RSS, (e) backbone + LDGNet.
Figure 9: Experimental results of threshold δ on three fine-grained datasets: accuracy on (a) CUB-200-2011, (b) NABirds, (c) Ancient Towers.
Figure 10: Ablation on the values of the hyperparameter δ on the datasets.
29 pages, 17294 KiB  
Article
Detail and Deep Feature Multi-Branch Fusion Network for High-Resolution Farmland Remote-Sensing Segmentation
by Zhankui Tang, Xin Pan, Xiangfei She, Jing Ma and Jian Zhao
Remote Sens. 2025, 17(5), 789; https://doi.org/10.3390/rs17050789 - 24 Feb 2025
Viewed by 125
Abstract
Currently, the demand for refined crop monitoring through remote sensing is increasing rapidly. Due to the similar spectral and morphological characteristics of different crops and vegetation, traditional methods often rely on deeper neural networks to extract meaningful features. However, deeper networks face a key challenge: while extracting deep features, they often lose some boundary details and small-plot characteristics, leading to inaccurate farmland boundary classifications. To address this issue, we propose the Detail and Deep Feature Multi-Branch Fusion Network for High-Resolution Farmland Remote-Sensing Segmentation (DFBNet). DFBNet introduces a new three-branch structure based on the traditional UNet. This structure enhances the detail of ground objects, deep features across multiple scales, and boundary features. As a result, DFBNet effectively preserves the overall characteristics of farmland plots while retaining fine-grained ground object details and ensuring boundary continuity. In our experiments, DFBNet was compared with five traditional methods and demonstrated significant improvements in overall accuracy and boundary segmentation. On the Hi-CNA dataset, DFBNet achieved 88.34% accuracy, 89.41% pixel accuracy, and an IoU of 78.75%. On the Netherlands Agricultural Land Dataset, it achieved 90.63% accuracy, 91.6% pixel accuracy, and an IoU of 83.67%. These results highlight DFBNet's ability to accurately delineate farmland boundaries, offering robust support for agricultural yield estimation and precision farming decision-making.
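One ingredient of DFBNet's branch fusion is a bilinear fusion (BF) module that computes second-order statistics between the features of two branches. A minimal sketch of that idea as the flattened outer product of a detail-branch and a deep-branch feature vector; the vectors themselves are illustrative:

```python
def bilinear_fusion(f_detail, f_deep):
    """Bilinear fusion of two branch feature vectors: the outer product
    captures pairwise (second-order) interactions between every detail
    feature and every deep feature, flattened into one fused descriptor.
    Minimal sketch of the BF idea."""
    return [a * b for a in f_detail for b in f_deep]

# Hypothetical 2-D features from the detail and deep branches.
fused = bilinear_fusion([1.0, 2.0], [0.5, 3.0])
```

In a real network the fused descriptor would typically be normalized and projected back down before further convolutions, since the outer product grows quadratically in dimension.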
Figures:
Figure 1: The input of DFBNet is a multi-band remote-sensing image block I_image, and the segmentation result I_result is obtained after DFBNet's processing.
Figure 2: When the model input is FM_preprocessing, this branch's main function is to extract detail features through a series of convolutions and fusions and generate the output FM_DBR.
Figure 3: When the model input is FM_preprocessing, this branch's main function is to mine deep features through pooling operations and generate the output FM_FBR.
Figure 4: When the model input is FM_preprocessing, this branch's main function is to enhance the boundary features through the attention mechanism and generate the output FM_BBR.
Figure 5: One path is mainly used to capture detail information (from the M_Detail-Branch branch), and the other is used to mine deep features (from the M_DeepFeature-Branch branch). A feedback mechanism between the two paths enables information to be communicated and enhanced across levels.
Figure 6: The bilinear fusion (BF) module achieves fusion by calculating the second-order statistical information between the features of the two branches and can effectively capture the correlations among the features.
Figure 7: Training loss curves for different models on the two datasets: (a) the Hi-CNA dataset; (b) the Netherlands Agricultural Land Remote-Sensing Image Dataset.
Figure 8: Results comparison: (a) input image, (b) ground truth, (c) UNet, (d) DeepLabV3+, (e) SegFormer, (f) Mask2Former, (g) OVSeg, (h) DFBNet.
Figure 9: Bar chart comparison of OS and US results: (a) Test 1, (b) Test 2, (c) Test 3, (d) Test 4.
Figure 10: Results comparison: (a) input image, (b) ground truth, (c) UNet, (d) DeepLabV3+, (e) SegFormer, (f) Mask2Former, (g) OVSeg, (h) DFBNet.
Figure 11: Bar chart comparison of OS and US results: (a) Test 1, (b) Test 2, (c) Test 3, (d) Test 4.
Figure 12: The Hi-CNA dataset results comparison: (a) input image, (b) ground truth, (c) DFBNet, (d) DFBNet-DPF, (e) DFBNet-BF.
Figure 13: Bar chart comparison of OS and US results for the Hi-CNA dataset: (a) Test 1, (b) Test 2, (c) Test 3, (d) Test 4.
Figure 14: The Netherlands Agricultural Land Remote-Sensing Image Dataset results comparison: (a) input image, (b) ground truth, (c) DFBNet, (d) DFBNet-DPF, (e) DFBNet-BF.
Figure 15: Bar chart comparison of OS and US results for the Netherlands Agricultural Land Remote-Sensing Image Dataset: (a) Test 1, (b) Test 2, (c) Test 3, (d) Test 4.
23 pages, 1421 KiB  
Article
EmoBERTa-X: Advanced Emotion Classifier with Multi-Head Attention and DES for Multilabel Emotion Classification
by Farah Hassan Labib, Mazen Elagamy and Sherine Nagy Saleh
Big Data Cogn. Comput. 2025, 9(2), 48; https://doi.org/10.3390/bdcc9020048 - 19 Feb 2025
Viewed by 233
Abstract
The rising prevalence of social media turns them into huge, rich repositories of human emotions. Understanding and categorizing human emotion from social media content is of fundamental importance for many reasons, such as improvement of user experience, monitoring of public sentiment, support for mental health, and enhancement of focused marketing strategies. However, social media text is often unstructured and ambiguous; hence, extracting meaningful emotional information is difficult. Thus, effective emotion classification needs advanced techniques. This article proposes a novel model, EmoBERTa-X, to enhance performance in multilabel emotion classification, particularly in informal and ambiguous social media texts. Attention mechanisms combined with ensemble learning, supported by preprocessing steps, help in avoiding issues such as class imbalance of the dataset, ambiguity in short texts, and the inherent complexities of multilabel classification. The experimental results on the GoEmotions dataset indicate that EmoBERTa-X has outperformed state-of-the-art models on fine-grained emotion-detection tasks in social media expressions with an accuracy increase of 4.32% over some popular approaches.
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
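The dynamic ensemble selection (DES) step in the EmoBERTa-X workflow can be sketched as ranking classifier instances by a competence score and averaging the per-label predictions of the top-k. All scores and probabilities below are made up for illustration:

```python
def dynamic_ensemble_select(predictions, competence, k=2):
    """Dynamic ensemble selection sketch: keep the k classifier instances
    with the highest competence scores and average their per-label
    probabilities. Scores here are hypothetical."""
    ranked = sorted(range(len(competence)), key=competence.__getitem__,
                    reverse=True)[:k]
    n_labels = len(predictions[0])
    return [sum(predictions[i][j] for i in ranked) / k
            for j in range(n_labels)]

# Three model instances, two emotion labels (illustrative probabilities).
preds = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
scores = [0.95, 0.60, 0.90]
pooled = dynamic_ensemble_select(preds, scores, k=2)
```

In a full DES framework the competence score would itself be computed per test instance (e.g. local validation accuracy) rather than fixed as here.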
Figures:
Figure 1: EmoBERTa-X model architecture. The diagram illustrates the sequential workflow of the EmoBERTa-X model, beginning with data loading and preprocessing, followed by model training and dynamic ensemble selection, and concluding with model evaluation.
Figure 2: EmoBERTa-X model with integrated multi-head attention mechanism. The model starts with embeddings and an encoder, followed by the multi-head attention module, which involves attention-output average pooling, a dense layer with dropout, and final classification layers leading to the output layer for multilabel emotion classification. SDP is the Scaled Dot-Product.
Figure 3: EmoBERTa-X training and dynamic ensemble selection process. Several instances of EmoBERTa-X are trained, each computing a competence score; the DES framework selects the top-performing instances based on the competence scores, pools their predictions, and then moves on to model evaluation.
Figure 4: Distribution of emotions to be classified by EmoBERTa-X across different categories.
Figure 5: Trend of micro and macro F1-scores across experiments. The line chart shows the progress of the micro and macro F1-scores of the EmoBERTa-X model across different sets of experiments.
Figure 6: Performance comparison of EmoBERTa-X with state-of-the-art models: the accuracy, micro F1-score, and macro F1-score of EmoBERTa-X compared to existing graph-based, transformer-based, and hybrid approaches.
19 pages, 262 KiB  
Article
Fine-Grained Encrypted Traffic Classification Using Dual Embedding and Graph Neural Networks
by Zhengyang Liu, Qiang Wei, Qisong Song and Chaoyuan Duan
Electronics 2025, 14(4), 778; https://doi.org/10.3390/electronics14040778 - 17 Feb 2025
Viewed by 391
Abstract
Encrypted traffic classification poses significant challenges in network security due to the growing use of encryption protocols, which obscure packet payloads. This paper introduces a novel framework that leverages dual embedding mechanisms and Graph Neural Networks (GNNs) to model both temporal and spatial dependencies in traffic flows. By utilizing metadata features such as packet size, inter-arrival times, and protocol attributes, the framework achieves robust classification without relying on payload content. The proposed framework demonstrates an average classification accuracy of 96.7%, F1-score of 96.0%, and AUC-ROC of 97.9% across benchmark datasets, including ISCX VPN-nonVPN, QUIC, and USTC-TFC2016. These results mark an improvement of up to 8% in F1-score and 10% in AUC-ROC compared to state-of-the-art baselines. Extensive experiments validate the framework's scalability and robustness, confirming its potential for real-world applications like intrusion detection and network monitoring. The integration of dual embedding mechanisms and GNNs allows for accurate fine-grained classification of encrypted traffic flows, addressing critical challenges in modern network security.
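The abstract describes classifying flows from payload-free metadata such as packet sizes and inter-arrival times. A minimal sketch of the graph-construction step such a GNN pipeline would start from, with each packet as a node and consecutive packets in a flow linked by edges; the packet values are invented:

```python
def build_flow_graph(packets):
    """Build a simple temporal graph from payload-free packet metadata:
    each packet becomes a node with (size, inter-arrival time) features,
    and consecutive packets in the flow are connected by an edge.
    Hypothetical sketch of the kind of graph a GNN-based traffic
    classifier consumes."""
    nodes, edges = [], []
    prev_t = None
    for i, (size, t) in enumerate(packets):
        iat = 0.0 if prev_t is None else t - prev_t  # inter-arrival time
        nodes.append((size, iat))
        if i > 0:
            edges.append((i - 1, i))
        prev_t = t
    return nodes, edges

# (packet size in bytes, timestamp in seconds) for one hypothetical flow.
nodes, edges = build_flow_graph([(517, 0.00), (1400, 0.02), (93, 0.05)])
```

A message-passing GNN would then aggregate neighbour features over these edges to produce a flow-level embedding for classification.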
Show Figures

Figure 1
<p>ROC curves for all methods across three datasets with slightly different trends.</p>
Full article ">Figure 2
<p>Performance metrics of different model variants for ISCX VPN-nonVPN dataset with error bars.</p>
Full article ">
18 pages, 5593 KiB  
Article
Decoding Analyses Show Dynamic Waxing and Waning of Event-Related Potentials in Coma Patients
by Adianes Herrera-Diaz, Rober Boshra, Richard Kolesar, Netri Pajankar, Paniz Tavakoli, Chia-Yu Lin, Alison Fox-Robichaud and John F. Connolly
Brain Sci. 2025, 15(2), 189; https://doi.org/10.3390/brainsci15020189 - 13 Feb 2025
Viewed by 472
Abstract
Background/Objectives: Coma prognosis is challenging, as patient presentation can be misleading or uninformative when using behavioral assessments only. Event-related potentials have been shown to provide valuable information about a patient’s chance of survival and emergence from coma. Our prior work revealed that the mismatch negativity (MMN) in particular waxes and wanes across 24 h in some coma patients. This “cycling” aspect of the presence/absence of neurophysiological responses may require fine-grained tools to increase the chances of detecting levels of neural processing in coma. This study implements multivariate pattern analysis (MVPA) to automatically quantify patterns of neural discrimination between duration-deviant and standard tones over time at the single-subject level in seventeen healthy controls and in three comatose patients. Methods: One EEG recording, containing up to five blocks of an auditory oddball paradigm, was performed in controls over a 12 h period. For patients, two EEG sessions were conducted 3 days apart for up to 24 h, denoted as day 0 and day 3, respectively. MVPA was performed using a support-vector machine classifier. Results: Healthy controls exhibited reliable discrimination or classification performance during the latency intervals associated with MMN and P3a components. Two patients showed some intervals with significant discrimination around the second half of day 0, and all had significant results on day 3. Conclusions: These findings suggest that decoding analyses can accurately classify neural responses at a single-subject level in healthy controls and provide evidence of small but significant changes in auditory discrimination over time in coma patients. Further research is needed to confirm whether this approach represents an improved technology for assessing cognitive processing in coma. Full article
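The per-timepoint decoding idea can be illustrated with a toy classifier. The study itself uses a support-vector machine with AUC scoring, so the leave-one-out nearest-mean decoder and the data below are only a hypothetical sketch:

```python
def decode_accuracy(trials, t):
    """Leave-one-out decoding accuracy at a single timepoint t.

    trials: list of (condition_label, [value_t0, value_t1, ...]).
    Each held-out trial is assigned to the condition whose mean value
    at timepoint t (over the remaining trials) is closest.
    """
    correct = 0
    for i, (label, series) in enumerate(trials):
        means = {}
        for j, (lab, other) in enumerate(trials):
            if j != i:
                means.setdefault(lab, []).append(other[t])
        pred = min(
            means,
            key=lambda lab: abs(series[t] - sum(means[lab]) / len(means[lab])),
        )
        correct += pred == label
    return correct / len(trials)

# Toy data: conditions separate at timepoint 1 but not at timepoint 0,
# mimicking decoding that is significant only in some latency intervals.
trials = [
    ("std", [0.0, 1.0]),
    ("std", [0.1, 1.1]),
    ("dev", [0.05, -1.0]),
    ("dev", [0.0, -1.1]),
]
```

Scanning t across the epoch yields the decoding-over-time curves shown in the figures; significance against chance would additionally need permutation testing.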
Show Figures

Figure 1
<p>Multivariate decoding results of a representative control subject for duration-deviant vs. standard comparison. (<b>A</b>) Classification performance across time. The shaded area is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than the chance level. (<b>B</b>) Temporal generalization plot of decoding performance. Color bar indicates AUC scores.</p>
Full article ">Figure 2
<p>The correlation analysis between individual classification performance and ERP amplitude was significant for both the MMN and P3a components.</p>
Full article ">Figure 3
<p>Effect of a reduced number of electrodes on classification performance and searchlight analysis across control subjects. (<b>A</b>) The paired-sample <span class="html-italic">t</span>-test revealed no significant differences in classification performance using 64 electrodes in comparison to 11 electrodes. (<b>B</b>) The searchlight MVPA computed over the baseline and 50 ms time intervals after stimulus onset showed the electrodes that better discriminated between conditions.</p>
Full article ">Figure 4
<p>Multivariate decoding results of Patient 1 on day 0 and day 3. The shaded area (first and third columns) is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than the chance level. Color bars in the temporal generalization matrices (second and fourth columns) indicate AUC scores.</p>
Full article ">Figure 5
<p>Multivariate decoding results of Patient 2 on day 0 and day 3. The shaded area (first and third columns) is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than chance level. The color bar in the temporal generalization matrices (second and fourth columns) indicate AUC scores.</p>
Full article ">Figure 6
<p>Multivariate decoding results of Patient 3 on day 0 and day 3. The shaded area (first and third columns) is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than the chance level. The color bar in the temporal generalization matrices (second and fourth columns) indicates AUC scores.</p>
Full article ">Figure 7
<p>Classification performance of Patient 1 at each single block on day 0 and day 3. The shaded area is the standard deviation across trials. The thick line indicates the time points whereat decoding is significantly higher than chance level. Black arrows indicate the blocks with reliable classification performance.</p>
Full article ">Figure 8
<p>Classification performance of Patient 2 at each single block on day 0 and day 3. The shaded area is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than chance level. Black arrows indicate the blocks with reliable classification performance.</p>
Full article ">Figure 9
<p>Classification performance of Patient 3 at each single block on day 0 and day 3. The shaded area is the standard deviation across trials. The thick line indicates the time points where decoding is significantly higher than chance level. Black arrows indicate the blocks with reliable classification performance.</p>
Full article ">
21 pages, 8936 KiB  
Article
A Minority Sample Enhanced Sampler for Crop Classification in Unmanned Aerial Vehicle Remote Sensing Images with Class Imbalance
by Jiapei Cheng, Liang Huang, Bohui Tang, Qiang Wu, Meiqi Wang and Zixuan Zhang
Agriculture 2025, 15(4), 388; https://doi.org/10.3390/agriculture15040388 - 12 Feb 2025
Viewed by 356
Abstract
Deep learning techniques have become the mainstream approach for fine-grained crop classification in unmanned aerial vehicle (UAV) remote sensing imagery. However, a significant challenge lies in the long-tailed distribution of crop samples. This imbalance causes neural networks to focus disproportionately on majority-class features during training, leading to biased decision boundaries and weakening model performance. We designed a minority sample enhanced sampling (MES) method to address the performance limitations caused by class imbalance in many crop classification models. The main principle of MES is to relate the re-sampling probability of each class to its sample pixel frequency, thereby achieving intensive re-sampling of minority classes and balancing the training sample distribution. Meanwhile, during re-sampling, data augmentation is performed on the sampled images to improve generalization. MES is simple to implement, is highly adaptable, and can serve as a general-purpose sampler for semantic segmentation tasks, functioning as a plug-and-play component within network models. To validate the applicability of MES, experiments were conducted on four classic semantic segmentation networks. The results showed that MES achieved mIoU improvements of +1.54%, +4.14%, +2.44%, and +7.08% on the Dali dataset and +2.36%, +0.86%, +4.26%, and +2.75% on the Barley Remote Sensing Dataset compared with the respective benchmark models. Additionally, our hyperparameter sensitivity analysis confirmed the stability and reliability of the method. MES mitigates the impact of class imbalance on network performance, which facilitates the practical application of deep learning in fine-grained crop classification. Full article
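The core MES principle, relating each class's re-sampling probability to its pixel frequency, can be sketched as inverse-frequency weighting with a smoothing exponent. The exact formula and the class names below are assumptions, since the abstract does not specify them:

```python
def resample_weights(pixel_counts, t=0.5):
    """Per-class sampling probabilities inversely related to pixel frequency.

    pixel_counts: {class_name: total_pixels_in_training_set}.
    t is a smoothing exponent (hypothetical; t=1 is pure inverse frequency,
    t=0 collapses to uniform-by-frequency sampling).
    """
    total = sum(pixel_counts.values())
    raw = {c: (total / n) ** t for c, n in pixel_counts.items()}
    z = sum(raw.values())
    return {c: w / z for c, w in raw.items()}  # probabilities sum to 1

# Long-tailed toy distribution: the minority class gets sampled far more often.
counts = {"rice": 900_000, "maize": 90_000, "brassica": 10_000}
weights = resample_weights(counts)
```

A sampler would then draw training patches according to `weights` (applying augmentation to each draw), rather than uniformly over the image grid.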
(This article belongs to the Special Issue Applications of Remote Sensing in Agricultural Soil and Crop Mapping)
Show Figures

Figure 1
<p>The class distribution of a long-tailed dataset. The head class feature space that is learned on these samples is often larger than the tail classes, while the decision boundary is usually biased towards dominant classes.</p>
Full article ">Figure 2
<p>Dali dataset: (<b>a</b>) geographical location; (<b>b</b>) diagram illustrating UAV orthophoto image.</p>
Full article ">Figure 3
<p>Dali dataset: (<b>a</b>) manual annotation labels; (<b>b</b>) pixel statistics of training sample classes.</p>
Full article ">Figure 4
<p>Barley Remote Sensing Dataset: (<b>a</b>) manual annotation labels; (<b>b</b>) pixel statistics of training sample classes.</p>
Full article ">Figure 5
<p>Baseline network architecture: (<b>a</b>) Deeplabv3+ [<a href="#B26-agriculture-15-00388" class="html-bibr">26</a>]; (<b>b</b>) SegNeXt [<a href="#B27-agriculture-15-00388" class="html-bibr">27</a>]; (<b>c</b>) Segformer [<a href="#B28-agriculture-15-00388" class="html-bibr">28</a>]; (<b>d</b>) Swin Transformer [<a href="#B29-agriculture-15-00388" class="html-bibr">29</a>].</p>
Full article ">Figure 6
<p>Visualization of experimental results when using MES with benchmark networks on the Dali dataset.</p>
Full article ">Figure 7
<p>Training loss curves of each method with iteration periods.</p>
Full article ">Figure 8
<p>Variation curves of training IoU for Brassica chinensis (minority class) with iteration period.</p>
Full article ">Figure 9
<p>Examples of experimental results of different methods based on Swin Transformer baseline network on Dali dataset.</p>
Full article ">Figure 10
<p>Visualization of the experimental results of using MES on benchmark networks in the Barley Remote Sensing Dataset.</p>
Full article ">Figure 11
<p>Examples of experimental results for different methods based on the Swin Transformer baseline network and the Barley Remote Sensing Dataset, the details of how each method was addressed are highlighted in the red circles.</p>
Full article ">Figure 12
<p>Re-sampling frequencies for different <span class="html-italic">t</span> values in the Dali dataset.</p>
Full article ">Figure 13
<p>Number of sampled pixels for each crop class in Dali dataset: (<b>a</b>) number of sampled pixels corresponding to different <span class="html-italic">T</span> values (<span class="html-italic">α</span> = 1); (<b>b</b>) number of sampled pixels corresponding to different <span class="html-italic">α</span> values (<span class="html-italic">T</span> = 0.05).</p>
Full article ">
22 pages, 11164 KiB  
Article
Acoustic Emission-Based Pipeline Leak Detection and Size Identification Using a Customized One-Dimensional DenseNet
by Faisal Saleem, Zahoor Ahmad, Muhammad Farooq Siddique, Muhammad Umar and Jong-Myon Kim
Sensors 2025, 25(4), 1112; https://doi.org/10.3390/s25041112 - 12 Feb 2025
Viewed by 397
Abstract
Effective leak detection and leak size identification are essential for maintaining the operational safety, integrity, and longevity of industrial pipelines. Traditional methods often suffer from high noise sensitivity, limited adaptability to non-stationary signals, and excessive computational costs, which limit their feasibility for real-time monitoring applications. This study presents a novel acoustic emission (AE)-based pipeline monitoring approach, integrating Empirical Wavelet Transform (EWT) for adaptive frequency decomposition with a customized one-dimensional DenseNet architecture to achieve precise leak detection and size classification. The methodology begins with EWT-based signal segmentation, which isolates meaningful frequency bands to enhance leak-related feature extraction. To further improve signal quality, adaptive thresholding and denoising techniques are applied, filtering out low-amplitude noise while preserving critical diagnostic information. The denoised signals are processed using a DenseNet-based deep learning model, which combines convolutional layers and densely connected feature propagation to extract fine-grained temporal dependencies, ensuring the accurate classification of leak presence and severity. Experimental validation was conducted on real-world AE data collected under controlled leak and non-leak conditions at varying pressure levels. The proposed model achieved an exceptional leak detection accuracy of 99.76%, demonstrating its ability to reliably differentiate between normal operation and multiple leak severities. This method effectively reduces computational costs while maintaining robust performance across diverse operating environments. Full article
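The adaptive-thresholding step that filters low-amplitude noise can be illustrated with a common soft-thresholding rule. The universal threshold used below is an assumption, as the paper's exact threshold is not given in the abstract:

```python
import math

def soft_threshold(segment):
    """Soft-threshold one AE signal segment.

    Noise scale sigma is estimated from the median absolute amplitude
    (middle element of the sorted values is a simplified median), and the
    threshold is the classical universal threshold sigma * sqrt(2 * ln N).
    Small-amplitude samples are zeroed; larger ones are shrunk toward zero.
    """
    mad = sorted(abs(v) for v in segment)[len(segment) // 2]
    sigma = mad / 0.6745  # MAD-to-sigma factor for Gaussian noise
    thr = sigma * math.sqrt(2.0 * math.log(len(segment)))
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in segment]

noisy = [0.01, -0.02, 0.015, 5.0, -0.01, 0.02, 0.0, -0.015]
clean = soft_threshold(noisy)  # the burst survives, the low-level noise does not
```

In the full pipeline this would be applied per EWT sub-band before the denoised bands are passed to the DenseNet classifier.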
(This article belongs to the Special Issue Feature Papers in Fault Diagnosis & Sensors 2025)
Show Figures

Figure 1
<p>Graphical workflow of the proposed methodology.</p>
Full article ">Figure 2
<p>Flowchart of the signal preprocessing steps.</p>
Full article ">Figure 3
<p>Intrinsic mode functions for (<b>a</b>) non-leak signal and (<b>b</b>) leak signal.</p>
Full article ">Figure 4
<p>One-dimensional CNN architecture.</p>
Full article ">Figure 5
<p>DenseNet architecture.</p>
Full article ">Figure 6
<p>Experimental setup for pipeline leak detection.</p>
Full article ">Figure 7
<p>Pipeline architecture for the experiment.</p>
Full article ">Figure 8
<p>AE signals at 13-bar pressure: (<b>a</b>) normal; (<b>b</b>) leak.</p>
Full article ">Figure 9
<p>AE signals at 18-bar pressure: (<b>a</b>) normal; (<b>b</b>) leak.</p>
Full article ">Figure 10
<p>Confusion matrices for leak detection of (<b>a</b>) proposed method; (<b>b</b>) 1D CNN; (<b>c</b>) LSTM; and (<b>d</b>) XGBoost.</p>
Figure 10">
Full article ">Figure 11
<p>Confusion matrices for leak size identification of (<b>a</b>) proposed method; (<b>b</b>) 1D CNN; (<b>c</b>) LSTM; and (<b>d</b>) XGBoost.</p>
Full article ">Figure 12
<p>t-SNE plots for leak detection of (<b>a</b>) proposed method; (<b>b</b>) 1D CNN; (<b>c</b>) LSTM; and (<b>d</b>) XGBoost.</p>
Full article ">Figure 13
<p>t-SNE plots for leak size identification of (<b>a</b>) proposed method; (<b>b</b>) 1D CNN; (<b>c</b>) LSTM; and (<b>d</b>) XGBoost.</p>
Figure 13">
Full article ">
25 pages, 7982 KiB  
Article
Aerial Imagery Redefined: Next-Generation Approach to Object Classification
by Eran Dahan, Itzhak Aviv and Tzvi Diskin
Information 2025, 16(2), 134; https://doi.org/10.3390/info16020134 - 11 Feb 2025
Viewed by 495
Abstract
Identifying and classifying objects in aerial images are two significant and complex issues in computer vision. The fine-grained classification of objects in overhead images has become widespread in various real-world applications, due to recent advancements in high-resolution satellite and airborne imaging systems. The task is challenging, particularly in low-resource cases, due to the minor differences between classes and the significant differences within each class caused by the fine-grained nature. We introduce Classification of Objects for Fine-Grained Analysis (COFGA), a recently developed dataset for accurately categorizing objects in high-resolution aerial images. The COFGA dataset comprises 2104 images and 14,256 annotated objects across 37 distinct labels. This dataset offers superior spatial information compared to other publicly available datasets. The MAFAT Challenge is a task that utilizes COFGA to improve fine-grained classification methods. The baseline model achieved a mAP of 0.60, whereas the best-performing model achieved a score of 0.6271 by utilizing state-of-the-art ensemble techniques and specific preprocessing techniques. We offer solutions to address the difficulties in analyzing aerial images, particularly when annotated and imbalanced class data are scarce. The findings provide valuable insights into the detailed categorization of objects and have practical applications in urban planning, environmental assessment, and agricultural management. We discuss the constraints and potential future endeavors, specifically emphasizing the integration of supplementary modalities and contextual information into aerial imagery analysis. Full article
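The mAP figures quoted above rest on per-label average precision over a score-ranked list; mAP is the mean of AP across labels. The sketch below uses the standard ranked-retrieval definition, which may differ in detail from the MAFAT Challenge's exact metric:

```python
def average_precision(scores, labels):
    """AP for one label: mean of precision@k over the ranks of positives.

    scores: predicted confidence per object; labels: 1 if the object truly
    carries this label, else 0.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

# Positives retrieved at ranks 1 and 3: AP = (1/1 + 2/3) / 2.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Averaging this quantity over all 37 COFGA labels would give the challenge-style mAP.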
(This article belongs to the Special Issue Online Registration and Anomaly Detection of Cyber Security Events)
Show Figures

Figure 1
<p>Visualization of different annotation methods: (<b>a</b>) image patch; (<b>b</b>) horizontal, axis-aligned, BB; (<b>c</b>) oriented BB; (<b>d</b>) polygon segmentation; (<b>e</b>) CPM.</p>
Full article ">Figure 2
<p>A sample of COFGA’s fine-grained classification labels, including subclasses, unique features, and perceived color.</p>
Full article ">Figure 3
<p>Log distribution of number of items in each class.</p>
Full article ">Figure 4
<p>Heat map of the inter- and intra-subclass correlation.</p>
Full article ">Figure 5
<p>Distribution of the area, in pixels, of objects from different subclasses: (<b>a</b>) two subclasses of the ‘large vehicle’ class, (<b>b</b>) two subclasses of the ‘small vehicle’ class.</p>
Full article ">Figure 6
<p>Architectures used in the baseline: based on MobileNet and ResNet50.</p>
Full article ">Figure 7
<p>Padding, cropping, and rotation.</p>
Full article ">Figure 8
<p>Squaring and color augmentation: obtained by permuting the three channels of the RGB image.</p>
Full article ">Figure 9
<p>Diagram of the ensemble pipeline used by SeffiCo-Team MMM.</p>
Full article ">Figure 10
<p>Yonatan Wischnitzer—preprocessing.</p>
Full article ">Figure 11
<p>Yonatan Wischnitzer’s model architecture, exploiting the hierarchical nature of the COFGA dataset’s tagging taxonomy.</p>
Full article ">
20 pages, 26727 KiB  
Article
A Supervised Approach for Land Use Identification in Trento Using Mobile Phone Data as an Alternative to Unsupervised Clustering Techniques
by Manuel Mendoza-Hurtado, Gonzalo Cerruela-García and Domingo Ortiz-Boyer
Appl. Sci. 2025, 15(4), 1753; https://doi.org/10.3390/app15041753 - 9 Feb 2025
Viewed by 530
Abstract
This study explores land use classification in Trento using supervised learning techniques combined with call detail records (CDRs) as a proxy for human activity. Located in an alpine environment, Trento presents unique geographic challenges, including varied terrain and sparse network coverage, making it an ideal case for testing the robustness of supervised learning approaches. By analyzing spatiotemporal patterns in CDRs, we trained and evaluated several classification algorithms, including k-nearest neighbors (kNN), support vector machines (SVM), and random forests (RF), to map land use categories, such as home, work, and forest. A comparative analysis highlights the performance of each method, emphasizing the strengths of RF in capturing complex patterns, its good generalization ability, and the use of kNN with different distance measures. Our supervised machine-learning approach outperforms unsupervised clustering techniques by capturing complex patterns and achieving higher accuracy. Results demonstrate the potential of CDRs for urban planning, offering a cost-effective approach for fine-grained land use monitoring suited to the particularities of Trento, whose landscape combines urban areas, agricultural fields, and forested regions reflecting its alpine setting, in contrast with other metropolitan regions. Full article
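A minimal illustration of labelling grid cells from activity profiles with nearest-neighbour classification follows; the toy 24-hour profiles and labels are invented for the sketch, and the study's actual CDR features and distance measures are richer:

```python
def knn_label(profile, train):
    """1-nearest-neighbour label for a cell's hourly activity profile."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: sq_dist(item[0], profile))[1]

# Toy labelled cells: activity fraction per hour (00:00-23:00).
train = [
    ([0.1] * 8 + [0.9] * 10 + [0.1] * 6, "work"),   # daytime peak
    ([0.8] * 8 + [0.2] * 10 + [0.8] * 6, "home"),   # morning/evening peak
    ([0.05] * 24, "forest"),                        # almost no activity
]

cell = [0.15] * 8 + [0.85] * 10 + [0.15] * 6  # unlabelled cell to classify
```

With k > 1 neighbours, alternative distance measures (as explored in the paper), or an RF in place of kNN, the same cell-profile representation applies unchanged.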
(This article belongs to the Special Issue Artificial Intelligence and the Future of Smart Cities)
Show Figures

Figure 1
<p>Grid system of Trento.</p>
Full article ">Figure 2
<p>Manual labeling for (<b>a</b>) the 15 × 15 subgrid and (<b>b</b>) the 4 × 4 forest as ground truth values. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/true_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 3
<p>(<b>a</b>) Ground truth and (<b>b</b>) classification predictions for the 15 × 15 subgrid classified with SVM (polynomial kernel) classification for smsout. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predicted_smsout_svmpoly_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 4
<p>(<b>a</b>) Ground truth and (<b>b</b>) classification predictions for the 15 × 15 subgrid classified with kNN (10 neighbors and CI distance) classification for smsin. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predicted_smsin_knn10cid_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 5
<p>(<b>a</b>) Ground truth and (<b>b</b>) classification predictions for the 15 × 15 subgrid classified with RF classification (500 estimators) for callin. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predicted_callin_rf500_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 6
<p>(<b>a</b>) Ground truth and (<b>b</b>) classification predictions for the 15 × 15 subgrid classified with DL for smsout. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predicted_smsout_dl_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 7
<p>(<b>a</b>) Ground truth and (<b>b</b>) classification predictions for the 15 × 15 subgrid classified with kMeans clustering for internet. Yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predictions_kmeans3_internet_19x19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 8
<p>RF classification (500 estimators) for the full grid. Green: no data present, yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/pred_fullgrid_trento_callin_rf500_19.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">Figure 9
<p>kMeans clustering for the full grid. Green: no data present, yellow: work, blue: home, red: forest. <a href="http://geojson.io/#data=data:text/x-url,https://raw.githubusercontent.com/mendozamanu/trento_mobility/main/geojsons/predictions_kmeans3_int_fg.geojson" target="_blank">Geojson.io visualization</a>.</p>
Full article ">
22 pages, 2866 KiB  
Article
Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism
by Zengzheng Chen, Jianxin Wang and Yeru Wang
Foods 2025, 14(3), 461; https://doi.org/10.3390/foods14030461 - 31 Jan 2025
Viewed by 564
Abstract
As a pivotal area of research in the field of computer vision, the technology for food identification has become indispensable across diverse domains including dietary nutrition monitoring, intelligent service provision in restaurants, and ensuring quality control within the food industry. However, recognizing food images falls within the domain of Fine-Grained Visual Classification (FGVC), which presents challenges such as inter-class similarity, intra-class variability, and the complexity of capturing intricate local features. Researchers have primarily focused on deep information in deep convolutional neural networks for fine-grained visual classification, often neglecting shallow and detailed information. Taking these factors into account, we propose a Multi-level Attention Feature Fusion Network (MAF-Net). Specifically, we use feature maps generated by the Convolutional Neural Networks (CNNs) backbone network at different stages as inputs. We apply a self-attention mechanism to identify local features on these feature maps and then stack them together. The feature vectors obtained through the attention mechanism are then integrated with the original input to enhance data augmentation. Simultaneously, to capture as many local features as possible, we encourage multi-scale features to concentrate on distinct local regions at each stage by maximizing the Kullback-Leibler Divergence (KL-divergence) between the different stages. Additionally, we present a novel approach called subclass center loss (SCloss) to implement label smoothing, minimize intra-class feature distribution differences, and enhance the model’s generalization capability. Experiments conducted on three food image datasets—CETH Food-101, Vireo Food-172, and UEC Food-100—demonstrated the superiority of the proposed model. The model achieved Top-1 accuracies of 90.22%, 89.86%, and 90.61% on CETH Food-101, Vireo Food-172, and UEC Food-100, respectively. 
Notably, our method not only outperformed other methods in Top-5 accuracy on Vireo Food-172 but also achieved the highest Top-1 accuracy on UEC Food-100. Full article
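The stage-diversity objective, maximizing KL-divergence between stages so that each attends to distinct local regions, reduces to a KL divergence between normalized attention maps. The exact MAF-Net formulation is not given in the abstract, so the discrete form below is an assumption:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) between two flattened, unnormalized attention maps.

    Both maps are normalized to probability distributions first; eps guards
    against log(0). Maximizing this across stage pairs pushes the stages'
    attention toward different regions.
    """
    sp, sq = sum(p), sum(q)
    return sum(
        (pi / sp) * math.log((pi / sp + eps) / (qi / sq + eps))
        for pi, qi in zip(p, q)
    )

stage3_attn = [0.9, 0.1]  # toy 2-location attention maps
stage4_attn = [0.1, 0.9]
```

Identical maps give a divergence of zero, so the training signal rewards stages whose attention distributions disagree.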
Show Figures

Figure 1
<p>Visualization results from guided backpropagation (GB) [<a href="#B22-foods-14-00461" class="html-bibr">22</a>], implemented using AlexNet [<a href="#B23-foods-14-00461" class="html-bibr">23</a>], trained on Vireo Food172 [<a href="#B24-foods-14-00461" class="html-bibr">24</a>]. The deeper CNN layers concentrate on semantically significant regions while abstracting low-level information acquired by shallow layers. However, this depth in CNN layers may result in the loss of certain details.</p>
Full article ">Figure 2
<p>The framework of MAF-Net. It includes a multi-stage feature fusion module and a self-attention mechanism module.</p>
Full article ">Figure 3
<p>Comparisons of the impact of different stages in selecting the backbone network on experimental results show that the features from the last three stages perform the best.</p>
Full article ">Figure 4
<p>Comparisons of the effects of using different balancing parameters on experimental results indicate that the performance is poor when using only one or two of the loss functions <math display="inline"><semantics> <mrow> <msub> <mi>L</mi> <mrow> <mi>C</mi> <mi>o</mi> <mi>n</mi> </mrow> </msub> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>L</mi> <mrow> <mi>S</mi> <mi>C</mi> </mrow> </msub> </mrow> </semantics></math>, or <math display="inline"><semantics> <mrow> <msub> <mi>L</mi> <mrow> <mi>K</mi> <mi>L</mi> </mrow> </msub> </mrow> </semantics></math>, while the highest recognition accuracy is achieved with the balancing parameters set to (0.5, 0.25, 0.25).</p>
Full article ">Figure 5
<p>Visualization results of MAF-Net on some samples from CETH Food-101 (<b>left</b>) and Vireo Food-172 (<b>right</b>). From left to right are the input images, Stage-3, Stage-4, and Stage-5, with rows 1, 3, and 5 showing the original network, and rows 2, 4, and 6 showing MAF-Net.</p>
Full article ">
16 pages, 5020 KiB  
Article
Blind Channel Estimation Method Using CNN-Based Resource Grouping
by Gayeon Kim, Yumin Kim, Daegun Jang, Byeong-Gwon Kang and Taehyoung Kim
Mathematics 2025, 13(3), 481; https://doi.org/10.3390/math13030481 - 31 Jan 2025
Viewed by 421
Abstract
This paper proposes a novel blind channel estimation method using convolutional neural network (CNN)-based resource grouping. The traditional K-means-based blind channel estimation scheme suffers from limitations in reflecting fine-grained channel variations in both the time and frequency domains. To address these limitations, we propose dynamic resource grouping based on a CNN architecture utilizing a two-step learning process that adapts to various channel conditions. The first step of the proposed method identifies the optimal number of subcarriers for each channel condition, providing a foundation for the second step. The second step adjusts the number of orthogonal frequency division multiplexing (OFDM) symbols, the parameter that determines the proposed pattern in the time domain, to adapt to dynamic channel variations. Simulation results demonstrate that the proposed CNN-based blind channel estimation method achieves high channel estimation accuracy across various signal-to-noise ratio (SNR) levels, attaining the highest accuracy of 82.5% at an SNR of 10 dB. Even when classification accuracy is relatively low, the CNN effectively mitigates signal distortion, delivering superior performance compared to conventional methods in terms of mean squared error (MSE) across diverse channel conditions. Notably, the proposed method maintains robust performance under high-mobility scenarios and severe channel variations. Full article
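The K-means-style blind estimation that the proposed method builds on can be sketched for QPSK: received symbols are clustered around constellation points, and the channel gain is recovered from the symbol-to-centroid ratio. Fixing the centroids at the ideal points (a single assignment step rather than full iterative K-means) and assuming a noiseless flat channel are simplifications for the sketch:

```python
import cmath

# Unit-energy QPSK constellation at odd multiples of pi/4.
IDEAL = [cmath.exp(1j * cmath.pi * (2 * k + 1) / 4) for k in range(4)]

def estimate_channel(received):
    """Blind estimate of a flat channel h from received QPSK symbols y = h*s."""
    ratios = []
    for y in received:
        s_hat = min(IDEAL, key=lambda s: abs(y - s))  # nearest-centroid decision
        ratios.append(y / s_hat)                      # each ratio estimates h
    return sum(ratios) / len(ratios)

h_true = 0.8 * cmath.exp(0.2j)        # gain 0.8, phase rotation 0.2 rad
rx = [h_true * s for s in IDEAL * 2]  # noiseless received resource group
h_est = estimate_channel(rx)
```

This only resolves the channel when the phase rotation stays below pi/4; larger rotations hit the usual QPSK phase ambiguity, which is one reason grouping resources with similar channel conditions (as the proposed CNN does) matters.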
Show Figures

Figure 1
<p>Four classified clusters based on K-means clustering of QPSK signal.</p>
Full article ">Figure 2
<p>Resource grouping pattern for <math display="inline"><semantics> <mrow> <mi>t</mi> <mo>=</mo> <mn>7</mn> <mo>,</mo> <mo> </mo> <mi>f</mi> <mo>=</mo> <mn>60</mn> </mrow> </semantics></math>.</p>
Full article ">Figure 3
<p>Time domain pattern distribution according to velocity conditions.</p>
Full article ">Figure 4
<p>Frequency domain pattern distribution according to delay spread conditions.</p>
Full article ">Figure 5
<p>Number of data distribution according to SNR (−6, −4, −2, 0 dB).</p>
Full article ">Figure 6
<p>Number of data distribution according to SNR (2, 4, 6, 8, 10 dB).</p>
Full article ">Figure 7
<p>Proposed CNN structure.</p>
Full article ">Figure 8
<p>MSE performance for each pattern with delay spared according to SNR: (<b>a</b>) 50 ns delay spread, (<b>b</b>) 300 ns delay spread.</p>
Full article ">Figure 9
<p>MSE performance for each pattern with velocity according to SNR: (<b>a</b>) 30 km/h velocity, (<b>b</b>) 150 km/h velocity.</p>
Full article ">
Back to TopTop