Search Results (17,206)

Search Parameters:
Keywords = convolutional neural networks

20 pages, 3034 KiB  
Article
HDCTfusion: Hybrid Dual-Branch Network Based on CNN and Transformer for Infrared and Visible Image Fusion
by Wenqing Wang, Lingzhou Li, Yifei Yang, Han Liu and Runyuan Guo
Sensors 2024, 24(23), 7729; https://doi.org/10.3390/s24237729 (registering DOI) - 3 Dec 2024
Abstract
The purpose of infrared and visible image fusion is to combine the advantages of both and generate a fused image that contains target information and has rich detail and contrast. However, existing fusion algorithms often overlook the importance of incorporating both local and global feature extraction, leading to missing key information in the fused image. To address these challenges, this paper proposes a dual-branch fusion network combining a convolutional neural network (CNN) and a Transformer, which enhances feature extraction capability and encourages the fused image to retain more information. First, a local feature extraction module with a CNN at its core is constructed. Specifically, a residual gradient module enhances the network's ability to extract texture information, and skip connections together with coordinate attention relate shallow features to deeper ones. In addition, a global feature extraction module based on the Transformer is constructed; its ability to capture global context allows global features to be fully extracted. The effectiveness of the proposed method is verified on several experimental datasets, where it outperforms most current advanced fusion algorithms. Full article
Show Figures

Figure 1. General framework of the proposed network.
Figure 2. Local feature extraction module.
Figure 3. Coordinate attention module.
Figure 4. Global feature extraction module.
Figure 5. Subjective comparison of three pairs of images on the MSRS dataset. (a) Infrared, (b) Visible, (c) DeepFuse, (d) DenseFuse, (e) RFN-nest, (f) SeAFusion, (g) SwinFuse, (h) U2, (i) ITFuse, (j) Ours.
Figure 6. Objective comparison of eight indicators on ten image pairs from the MSRS dataset.
Figure 7. Subjective comparison of three pairs of images on the TNO dataset. (a) Infrared, (b) Visible, (c) DeepFuse, (d) DenseFuse, (e) RFN-nest, (f) SeAFusion, (g) SwinFuse, (h) U2, (i) ITFuse, (j) Ours.
Figure 8. Objective comparison of eight indicators on ten image pairs from the TNO dataset.
Figure 9. Subjective comparison of three pairs of images on the RoadScene dataset. (a) Infrared, (b) Visible, (c) DeepFuse, (d) DenseFuse, (e) RFN-nest, (f) SeAFusion, (g) SwinFuse, (h) U2, (i) ITFuse, (j) Ours.
Figure 10. Objective comparison of eight indicators on ten image pairs from the RoadScene dataset.
Figure 11. Network framework for the LFEM ablation experiments. (a) The network with only convolutional layers, (b) the network with the residual gradient module, (c) the network with the residual gradient module and CA.
Figure 12. Network framework for the GFEM ablation experiments. (a) The proposed network without the GFEM, (b) the proposed network.
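For readers unfamiliar with the dual-branch pattern the HDCTfusion abstract describes, the following is a minimal PyTorch sketch, not the authors' implementation: a small CNN branch extracts local features, a Transformer encoder branch over patch tokens captures global context, and the two feature maps are concatenated and reconstructed into a fused image. All layer sizes, the patch size, and the module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Toy dual-branch extractor: CNN for local features, Transformer for global context."""
    def __init__(self, channels=32, patch=8):
        super().__init__()
        # Local branch: plain convolutional stack (stand-in for a residual gradient module).
        self.local = nn.Sequential(
            nn.Conv2d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Global branch: patch embedding followed by a small Transformer encoder.
        self.embed = nn.Conv2d(2, channels, kernel_size=patch, stride=patch)
        enc_layer = nn.TransformerEncoderLayer(d_model=channels, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Reconstruction: fuse both branches back into a single-channel image.
        self.fuse = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def forward(self, infrared, visible):
        x = torch.cat([infrared, visible], dim=1)          # stack the two modalities
        local_feat = self.local(x)                         # (B, C, H, W)
        tokens = self.embed(x)                             # (B, C, H/p, W/p)
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)            # (B, N, C) token sequence
        glob = self.encoder(seq).transpose(1, 2).reshape(b, c, h, w)
        glob = nn.functional.interpolate(glob, size=local_feat.shape[-2:],
                                         mode="bilinear", align_corners=False)
        return torch.sigmoid(self.fuse(torch.cat([local_feat, glob], dim=1)))

ir = torch.rand(1, 1, 128, 128)
vis = torch.rand(1, 1, 128, 128)
print(DualBranchFusion()(ir, vis).shape)   # torch.Size([1, 1, 128, 128])
```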
15 pages, 23802 KiB  
Article
Vision-Based Prediction of Flashover Using Transformers and Convolutional Long Short-Term Memory Model
by M. Hamed Mozaffari, Yuchuan Li, Niloofar Hooshyaripour and Yoon Ko
Electronics 2024, 13(23), 4776; https://doi.org/10.3390/electronics13234776 - 3 Dec 2024
Abstract
The prediction of fire growth is crucial for effective firefighting and rescue operations. Recent advances in vision-based techniques using RGB vision and infrared (IR) thermal imaging data, coupled with artificial intelligence and deep learning, have shown promise for detecting fire and predicting its behavior. This study introduces Convolutional Long Short-Term Memory (ConvLSTM) network models for predicting room fire growth by analyzing spatiotemporal IR thermal imaging data acquired from full-scale room fire tests. Our findings revealed that SwinLSTM, an enhanced version of ConvLSTM that combines it with Transformers (a deep learning architecture built on multi-head attention) for computer vision, can be used to predict the occurrence of room fire flashover. Notably, Transformer-based ConvLSTM deep learning models such as SwinLSTM demonstrate superior prediction capability, suggesting a new vision-based smart solution for future fire growth prediction tasks. The main focus of this work is a feasibility study on using a purely vision-based deep learning model to analyze video data and anticipate fire growth behavior in room fire incidents. Full article
(This article belongs to the Special Issue Deep Learning for Computer Vision Application)
Show Figures

Figure 1. The architecture of the ConvLSTM network used in this study, key components of the network, and their functions.
Figure 2. (a) The architecture of the SwinLSTM recurrent cell; STB and LP denote Swin Transformer blocks and Linear Projection. (b) The architecture of the SwinLSTM-B model, which contains a single SwinLSTM cell. (c) The architecture of the deeper version of SwinLSTM with multiple SwinLSTM cells, named SwinLSTM-D.
Figure 3. Two randomly selected data samples from the entire dataset. For the sake of illustration, every 5 s is shown. Red and yellow colors are the lower and higher temperature ranges, respectively.
Figure 4. Training and validation of SwinLSTM for room fire IR data.
Figure 5. Results of SwinLSTM applied on the test dataset.
Figure 6. Selected frames from the test dataset for better illustration of prediction details by the ConvLSTM model. The temperature colormap is also provided for better comparison between ground truth images and the predicted images.
Figure 7. Quantitative comparison analysis of the ground truth image and the prediction result image of SwinLSTM applied on the IR video data shown in Figure 5.
Figure 8. Performance of SwinLSTM in prediction of one frame in the future for 20 randomly selected room fire test videos.
Figure A1. Schematic of (A) a simple standard recurrent layer, (B) an unrolled version of a simple recurrent layer, (C) detailed components of a recurrent neuron or cell.
Figure A2. Schematic of the original LSTM architecture with forget gate.
Figure A3. Details of the recurrent structure of a ConvLSTM module.
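As background to the ConvLSTM models discussed above, here is a compact PyTorch sketch of a generic ConvLSTM cell, in which the matrix multiplications of a standard LSTM are replaced by 2D convolutions so the hidden state keeps its spatial layout. This is the textbook formulation only, not the SwinLSTM code evaluated in the paper; channel counts and frame sizes are arbitrary.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell: LSTM gates computed with 2D convolutions."""
    def __init__(self, in_ch, hidden_ch, kernel=3):
        super().__init__()
        pad = kernel // 2
        # One convolution produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch, kernel, padding=pad)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c_next = f * c + i * g
        h_next = o * torch.tanh(c_next)
        return h_next, c_next

# Roll the cell over a short sequence of thermal frames (batch, time, channel, H, W).
cell = ConvLSTMCell(in_ch=1, hidden_ch=16)
frames = torch.rand(2, 5, 1, 64, 64)
h = torch.zeros(2, 16, 64, 64)
c = torch.zeros(2, 16, 64, 64)
for t in range(frames.size(1)):
    h, c = cell(frames[:, t], (h, c))
print(h.shape)  # torch.Size([2, 16, 64, 64])
```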
11 pages, 5505 KiB  
Proceeding Paper
Combining Deep Learning and Street View Images for Urban Building Color Research
by Wenjing Li, Qian Ma and Zhiyong Lin
Proceedings 2024, 110(1), 7; https://doi.org/10.3390/proceedings2024110007 - 3 Dec 2024
Abstract
The color of a cityscape plays a significant role in its atmosphere; however, traditional city color analysis methods cover a wide range but are not precise enough, requiring field sampling and extensive manual comparison while lacking quantitative analysis of color. With the development of artificial intelligence, deep learning and computer vision technology show great potential in urban environment research. In this paper, we focus on building color and present a deep learning-based framework that combines geospatial big data with AI technology to extract and analyze urban color information. The framework is composed of two phases: deep learning and quantitative analysis. In the deep learning phase, a deep convolutional neural network (DCNN)-based color extraction model is designed to automatically learn building color information from street view images; in the quantitative analysis phase, building color is quantitatively analyzed at the overall and local levels, and a color clustering model is designed to quantitatively display the color relationships, giving a comprehensive picture of the current status of urban building color. The method and results of this paper offer an effective way to combine geospatial big data with GeoAI, supporting the collection and analysis of urban color and providing direction for the construction of urban color information management. Full article
Show Figures

Figure 1. Distribution of buildings in the study area.
Figure 2. The process of street view image acquisition, processing, and training dataset construction.
Figure 3. Framework for research on architectural colors.
Figure 4. (a) Scatter diagram of building hue-lightness distribution in Jiangan district; (b) scatter diagram of building hue-saturation distribution in Jiangan district. The horizontal axis indicates the hue value, the vertical axis indicates the lightness or saturation value, and the size of each scatter point indicates the number of samples with the corresponding value. When the hue is N, the color is an achromatic gray: its hue value is 0 and it has no saturation value.
Figure 5. The color tone analysis diagram of the buildings in the Jiangan district.
Figure 6. The main colors of various buildings and their proportions.
Figure 7. The relative intensity of the dominant colors of various buildings.
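The quantitative analysis phase above relies on clustering extracted building colors. As a hedged illustration of that kind of step (not the paper's clustering model), the sketch below runs k-means over synthetic facade pixels and reports dominant colors and their proportions; the pixel data and cluster count are stand-ins.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for pixels of a segmented building facade (RGB in [0, 255]).
pixels = np.vstack([
    rng.normal([200, 180, 150], 10, size=(500, 3)),   # beige-ish wall
    rng.normal([90, 90, 95], 8, size=(300, 3)),       # gray concrete
    rng.normal([150, 60, 50], 12, size=(200, 3)),     # brick red
]).clip(0, 255)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
counts = np.bincount(km.labels_, minlength=3)
for center, count in zip(km.cluster_centers_, counts):
    print("dominant color (RGB):", center.round(1),
          "proportion:", round(count / len(pixels), 2))
```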
25 pages, 44855 KiB  
Article
Burned Olive Trees Identification with a Deep Learning Approach in Unmanned Aerial Vehicle Images
by Christos Vasilakos and Vassilios S. Verykios
Remote Sens. 2024, 16(23), 4531; https://doi.org/10.3390/rs16234531 - 3 Dec 2024
Viewed by 82
Abstract
Olive tree orchards are suffering from wildfires in many Mediterranean countries. Following a wildfire event, identifying damaged olive trees is crucial for developing effective management and restoration strategies, while rapid damage assessment can support potential compensation for producers. Moreover, the implementation of real-time health monitoring in olive groves allows producers to carry out targeted interventions, reducing production losses and preserving crop health. This research examines the use of deep learning methodologies on true-color images from Unmanned Aerial Vehicles (UAVs) to detect damaged trees, including withering and desiccation of branches and leaf scorching. More specifically, the object detection and image classification computer vision techniques are applied and compared. In the object detection approach, the algorithm aims to localize and identify burned/dry and unburned/healthy olive trees, while in the image classification approach, the classifier categorizes an image showing a tree as burned/dry or unburned/healthy. Training data included true-color UAV images of olive trees damaged by fire, obtained with multiple cameras at multiple flight heights and therefore at various resolutions. For object detection, a Residual Neural Network was used as the backbone of a Single-Shot Detector (SSD). In the image classification application, two approaches were evaluated: in the first, a new shallow network was developed, while in the second, transfer learning from pre-trained networks was applied. According to the results, the object detection approach identified healthy trees with an average accuracy of 74%, while for trees with drying the average accuracy was 69%. However, the optimal network identified olive trees (healthy or unhealthy) that the user did not detect during data collection. In the image classification approach, convolutional neural networks achieved significantly better results, with an F1-score above 0.94, both when training the new network and when applying transfer learning. In conclusion, the use of computer vision techniques on UAV images identified damaged olive trees, and the image classification approach performed significantly better than object detection. Full article
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)
Show Figures

Figure 1. Burn severity map of the study area.
Figure 2. Aerial image of unburned, partial, and fully burned olive trees.
Figure 3. Masking and labeling unburned (healthy) and burned (dry) trees to be used as training data in the object detection approach.
Figure 4. Flowchart of the methodology.
Figure 5. Architecture of a shallow CNN developed for image classification of burned olive trees.
Figure 6. Schematic workflow of transfer learning and fine-tuning.
Figure 7. Average precision and Log Average Miss Rate of the testing dataset for three anchors in the SSD model.
Figure 8. Average precision and Log Average Miss Rate of the testing dataset for four anchors in the SSD model.
Figure 9. Average precision and Log Average Miss Rate of the testing dataset for five anchors in the SSD model.
Figure 10. Average precision and Log Average Miss Rate of the testing dataset for six anchors in the SSD model.
Figure 11. Average precision and Log Average Miss Rate of the testing dataset for seven anchors in the SSD model.
Figure 12. Ground truth data (left) and object detection (right).
Figure 13. Ground truth data (left) and object detection (right).
Figure 14. Ground truth data (left) and object detection (right).
Figure 15. Ground truth data (left) and object detection (right).
Figure 16. Confusion matrices of the testing dataset for the seven trained models in the image classification approach.
Figure 17. Actual class and predicted class with the corresponding score for a subset of images from the testing dataset.
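The abstract mentions a newly developed shallow network for the burned/unburned image classification task. The sketch below shows what such a shallow CNN could look like in PyTorch; the layer widths, input size, and two-class head are illustrative assumptions, not the architecture in Figure 5.

```python
import torch
import torch.nn as nn

class ShallowBurnClassifier(nn.Module):
    """Small CNN for binary burned/dry vs. unburned/healthy tree image classification."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = ShallowBurnClassifier()
print(model(torch.rand(4, 3, 224, 224)).shape)  # torch.Size([4, 2])
```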
27 pages, 21616 KiB  
Article
Integrating Convolutional Attention and Encoder–Decoder Long Short-Term Memory for Enhanced Soil Moisture Prediction
by Jingfeng Han, Jian Hong, Xiao Chen, Jing Wang, Jinlong Zhu, Xiaoning Li, Yuguang Yan and Qingliang Li
Water 2024, 16(23), 3481; https://doi.org/10.3390/w16233481 - 3 Dec 2024
Viewed by 102
Abstract
Soil moisture is recognized as a crucial variable in land–atmosphere interactions. This study introduces the Convolutional Attention Encoder–Decoder Long Short-Term Memory (CAEDLSTM) model to address the uncertainties and limitations inherent in traditional soil moisture prediction methods, especially in capturing complex temporal dynamics across diverse environmental conditions. Unlike existing approaches, this model integrates convolutional layers, an encoder–decoder framework, and multi-head attention mechanisms for the first time in soil moisture prediction. The convolutional layers capture local spatial features, while the encoder–decoder architecture effectively manages temporal dependencies. Additionally, the multi-head attention mechanism enhances the model’s ability to simultaneously focus on multiple key influencing factors, ensuring a comprehensive understanding of complex environmental variables. This synergistic combination significantly improves predictive performance, particularly in challenging climatic conditions. The model was validated using the LandBench1.0 dataset, which includes multiple high-resolution datasets, such as ERA5-land, ERA5 atmospheric variables, and SoilGrids, covering various climatic regions, including high latitudes, temperate zones, and tropical areas. The superior performance of the CAEDLSTM model is evidenced by comparisons with advanced models such as AEDLSTM, CNNLSTM, EDLSTM, and AttLSTM. Relative to the traditional LSTM model, CAEDLSTM achieved an average increase of 5.01% in R2, a 12.89% reduction in RMSE, a 16.67% decrease in bias, and a 4.35% increase in KGE. Moreover, it effectively addresses the limitations of traditional deep learning methods in challenging climates, including tropical Africa, the Tibetan Plateau, and Southeast Asia, resulting in significant enhancements in predictive accuracy within these regions, with R2 values improving by as much as 20%. These results underscore the capabilities of CAEDLSTM in capturing complex soil moisture dynamics, demonstrating its considerable potential for applications in agriculture and water resource monitoring across diverse climates. Full article
(This article belongs to the Special Issue Methods and Tools for Sustainable Agricultural Water Management)
Show Figures

Figure 1. The correlation of the input feature for predicting soil moisture: the volume of the soil water layer (0–7 cm).
Figure 2. The structure of the LSTM model.
Figure 3. The structure of the CAEDLSTM model.
Figure 4. Box plots of the predictive performance of LSTM, CNNLSTM, EDLSTM, AttLSTM, AEDLSTM, and CAEDLSTM models for predicting soil moisture measured by R², KGE, bias, and RMSE (m³/m³).
Figure 5. CDF plot analysis of all models for R² (a,b), KGE (c,d), RMSE (m³/m³) (e,f), and bias (g,h). Images (b,d,f,h) are magnified views of the red boxes in the corresponding left figures (a,c,e,g), respectively.
Figure 6. Global soil moisture predictions for six models (LSTM, CNNLSTM, EDLSTM, AttLSTM, AEDLSTM, and the proposed CAEDLSTM) at a 1-degree spatial resolution and 1-day lead time.
Figure 7. The global distribution map of the CAEDLSTM model highlights improvements across three key metrics, R², RMSE (m³/m³), and bias, compared to five other models.
Figure 8. Time series of soil moisture predictions from the CAEDLSTM, AEDLSTM, EDLSTM, AttLSTM, CNNLSTM, and LSTM models across five global locations marked on the world map (a). Subgraphs (b–f) show predictions for the Andes region, Argentina (b); Texas, USA (c); Nigeria (d); Ukraine (e); and Northeast China (f), representing different climatic zones.
Figure 9. Spectral comparison of CAEDLSTM and LSTM models for soil moisture prediction.
Figure 10. R² score comparison between proposed deep learning models and the Random Forest model for global soil moisture prediction.
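A rough PyTorch sketch of the kind of pipeline the CAEDLSTM abstract outlines: a 1D convolution over the input time series, an LSTM encoder-decoder, and multi-head attention over the encoder outputs, ending in a single soil-moisture value. The dimensions, the one-step decoder, and the attention wiring are simplifying assumptions rather than the published model.

```python
import torch
import torch.nn as nn

class ConvAttnEncDecLSTM(nn.Module):
    """Conv front-end + LSTM encoder-decoder + multi-head attention for sequence regression."""
    def __init__(self, n_features, hidden=64, heads=4):
        super().__init__()
        self.conv = nn.Conv1d(n_features, hidden, kernel_size=3, padding=1)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # predicted soil moisture for the next step

    def forward(self, x):                  # x: (batch, time, features)
        z = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # local temporal features
        enc_out, (h, c) = self.encoder(z)
        dec_out, _ = self.decoder(enc_out[:, -1:, :], (h, c))         # one-step decoder
        ctx, _ = self.attn(dec_out, enc_out, enc_out)                 # attend over encoder states
        return self.head(ctx.squeeze(1))

model = ConvAttnEncDecLSTM(n_features=8)
print(model(torch.rand(16, 30, 8)).shape)  # torch.Size([16, 1])
```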
16 pages, 7431 KiB  
Article
Deep Learning-Based Model for Effective Classification of Ziziphus jujuba Using RGB Images
by Yu-Jin Jeon, So Jin Park, Hyein Lee, Ho-Youn Kim and Dae-Hyun Jung
AgriEngineering 2024, 6(4), 4604-4619; https://doi.org/10.3390/agriengineering6040263 (registering DOI) - 3 Dec 2024
Viewed by 91
Abstract
Ensuring the quality of medicinal herbs in the herbal market is crucial. However, the genetic and physical similarities among medicinal materials have led to issues of mixing and counterfeit distribution, posing significant challenges to quality assurance. Recent advancements in deep learning technology, widely applied in the field of computer vision, have demonstrated the potential to classify images quickly and accurately, even those that can only be distinguished by experts. This study aimed to develop a classification model based on deep learning technology to distinguish RGB images of seeds from Ziziphus jujuba Mill. var. spinosa, Ziziphus mauritiana Lam., and Hovenia dulcis Thunb. Using three advanced convolutional neural network (CNN) architectures—ResNet-50, Inception-v3, and DenseNet-121—all models demonstrated a classification performance above 98% on the test set, with classification times as low as 23 ms. These results validate that the models and methods developed in this study can effectively distinguish Z. jujuba seeds from morphologically similar species. Furthermore, the strong performance and speed of these models make them suitable for practical use in quality inspection settings. Full article
Show Figures

Figure 1. Sample images of three species: (a) Ziziphus jujuba Mill., (b) Ziziphus mauritiana Lam., and (c) Hovenia dulcis Thunb.
Figure 2. Examples of data from three species: (a) central cropped, (b) background-removed, and (c) images rotated at six different angles.
Figure 3. Comparative architectural diagrams of (a) ResNet, (b) Inception, and (c) DenseNet models.
Figure 4. Transfer learning framework using pre-trained convolutional neural networks for distinguishing Z. jujuba Mill., Z. mauritiana Lam., and H. dulcis Thunb.
Figure 5. Training accuracy and loss curves for three deep learning models: (a) ResNet-50, (b) Inception-v3, and (c) DenseNet-121.
Figure 6. Confusion matrix for three deep learning models: (a) ResNet-50, (b) Inception-v3, and (c) DenseNet-121.
Figure 7. ROC curve with AUC for different species from three deep learning models.
Figure 8. Three-dimensional t-SNE visualization of training data features and features from three models: (a) training data, (b) ResNet-50, (c) Inception-v3, and (d) DenseNet-121. In the plots, the colors represent different species: orange for Ziziphus jujuba Mill., green for Ziziphus mauritiana Lam., and blue for Hovenia dulcis Thunb.
Figure 9. Future work: development of a portable medicinal materials classifier device.
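As an illustration of the transfer-learning setup sketched in Figure 4, the snippet below loads an ImageNet-pretrained ResNet-50 from torchvision, freezes the backbone, and replaces the final layer with a three-class head (one class per species). The optimizer, learning rate, and dummy data are placeholders; the paper's training details may differ.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load ImageNet weights and freeze the convolutional backbone.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():
    p.requires_grad = False

# Replace the classifier head with a 3-class output (Z. jujuba, Z. mauritiana, H. dulcis).
model.fc = nn.Linear(model.fc.in_features, 3)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One dummy training step on random tensors, standing in for seed RGB images.
images, labels = torch.rand(8, 3, 224, 224), torch.randint(0, 3, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```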
14 pages, 1185 KiB  
Article
Monitoring Substance Use with Fitbit Biosignals: A Case Study on Training Deep Learning Models Using Ecological Momentary Assessments and Passive Sensing
by Shizhe Li, Chunzhi Fan, Ali Kargarandehkordi, Yinan Sun, Christopher Slade, Aditi Jaiswal, Roberto M. Benzo, Kristina T. Phillips and Peter Washington
AI 2024, 5(4), 2725-2738; https://doi.org/10.3390/ai5040131 (registering DOI) - 3 Dec 2024
Viewed by 171
Abstract
Substance use disorders affect 17.3% of Americans. Digital health solutions that use machine learning to detect substance use from wearable biosignal data can eventually pave the way for real-time digital interventions. However, difficulties in addressing severe between-subject data heterogeneity have hampered the adaptation of machine learning approaches for substance use detection, necessitating more robust technological solutions. We tested the utility of personalized machine learning using participant-specific convolutional neural networks (CNNs) enhanced with self-supervised learning (SSL) to detect drug use. In a pilot feasibility study, we collected data from 9 participants using Fitbit Charge 5 devices, supplemented by ecological momentary assessments to collect real-time labels of substance use. We implemented a baseline 1D-CNN model with traditional supervised learning and an experimental SSL-enhanced model to improve individualized feature extraction under limited label conditions. Results: Among the 9 participants, we achieved an average area under the receiver operating characteristic curve score across participants of 0.695 for the supervised CNNs and 0.729 for the SSL models. Strategic selection of an optimal threshold enabled us to optimize either sensitivity or specificity while maintaining reasonable performance for the other metric. Conclusion: These findings suggest that Fitbit data have the potential to enhance substance use monitoring systems. However, the small sample size in this study limits its generalizability to diverse populations, so we call for future research that explores SSL-powered personalization at a larger scale. Full article
(This article belongs to the Section Medical & Healthcare AI)
Show Figures

Figure 1. Study overview. We recruited participants and equipped them with Fitbits collecting various biosensor data, including HR, steps taken, BR, sleep patterns, and SpO2. Concurrently, participants completed EMAs via a custom mobile app, recording each substance use event over the monitoring period. We then analyzed these data using personalized deep learning models to detect substance use based on biosensor data from the Fitbit. To protect patient privacy and to avoid asking participants to self-report illegal activity, we gave participants the option to record fruit code names rather than substance names, and the participants eligible for our analysis chose this option.
Figure 2. Distribution of each feature's utilization. Features are ranked by their selection count using Gini impurity.
Figure 3. An SSL-enhanced transfer learning framework for drug use classification, utilizing selected biometric features from each participant. A CNN pre-trained with SSL, outlined with a dotted line around the 1D convolutional, pooling, and flatten layers, is fine-tuned with new dense layers to predict drug use from biometric features. The dotted line indicates the layers transferred for the task-specific model.
Figure 4. Mean bootstrapped sensitivity and specificity at different decision threshold cutoffs across 9 participants, each denoted by distinct colors.
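A hedged sketch of a small 1D-CNN over fixed windows of wearable features (heart rate, steps, breathing rate, sleep stage, SpO2), loosely in the spirit of the supervised baseline described above. The window length, channel count, and layer sizes are assumptions, and the SSL pre-training stage is omitted.

```python
import torch
import torch.nn as nn

class Biosignal1DCNN(nn.Module):
    """1D-CNN over fixed-length windows of wearable features -> use / no-use probability."""
    def __init__(self, n_channels=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 1),
        )

    def forward(self, x):              # x: (batch, channels, time)
        return torch.sigmoid(self.net(x))

model = Biosignal1DCNN()
windows = torch.rand(32, 5, 60)        # 32 windows of 5 Fitbit-derived features, 60 time steps
print(model(windows).shape)            # torch.Size([32, 1])
```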
9 pages, 6603 KiB  
Proceeding Paper
Spatially Seamless Downscaling of a SMAP Soil Moisture Product Through a CNN-Based Approach with Integrated Multi-Source Remote Sensing Data
by Yan Jin, Haoyu Fan, Zeshuo Li and Yaojie Liu
Proceedings 2024, 110(1), 8; https://doi.org/10.3390/proceedings2024110008 - 3 Dec 2024
Viewed by 49
Abstract
Surface soil moisture (SSM) is crucial for understanding terrestrial hydrological processes. Despite its widespread use since 2015, the Soil Moisture Active and Passive (SMAP) SSM dataset faces challenges due to its inherent low spatial resolution and data gaps. This study addresses these limitations through a deep learning approach aimed at interpolating missing values and downscaling soil moisture data. The result is a seamless, daily 1 km resolution SSM dataset for China, spanning from 1 January 2016 to 31 December 2022. For the original 9 km daily SMAP products, a convolutional neural network (CNN) with residual connections was employed to achieve the spatially seamless 9 km SSM data, integrating multi-source remote sensing data. Subsequently, auxiliary data including land cover, land surface temperatures, vegetation indices, vegetation temperature drought indices, elevation, and soil texture were integrated into the CNN-based downscaling model to generate the spatially seamless 1 km SSM. Comparative analysis of the spatially seamless 9 km and 1 km SSM datasets with ground observations yielded unbiased root mean square error values of 0.09 cm3/cm3 for both, demonstrating the effectiveness of the downscaling method. This approach provides a promising solution for generating high-resolution, spatially seamless soil moisture data to meet the needs of hydrological, meteorological, and agricultural applications. Full article
Show Figures

Figure 1. Study area and locations of the ground stations. The label in the figure represents the network name, which refers to the specific monitoring networks included in ISMN. Each network consists of multiple sites.
Figure 2. Model structure of the developed TsSMN.
Figure 3. SSM images for 25 June 2018: original SMAP data, spatially seamless 9 km SSM predictions, and downscaled spatially seamless 1 km SSM data.
Figure 4. Scatter plots comparing the ground observations with three different 9 km SSM datasets. The dashed line represents the situation where the predicted value equals the actual value, the solid line is the equation obtained by linear regression of the scatter plot, and the values in the figure represent the vertical distance from each point to the regression line.
Figure 5. Time series comparison of SSM data derived at four ground stations: ground observations (In Situ), original SMAP, TsSMN-based 9 km predictions, SMN-based 9 km predictions, and downscaled 1 km predictions.
Figure 6. Scatter plot comparing ground observations with the downscaled 1 km predictions.
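A simplified PyTorch sketch of a CNN with residual connections that maps stacked coarse soil-moisture and auxiliary predictor grids to a finer grid, in the general spirit of the downscaling model described above. The bilinear pre-upsampling, channel counts, and 9x scale factor are assumptions, not the TsSMN of Figure 2.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))   # residual (skip) connection

class DownscaleCNN(nn.Module):
    """Map coarse SSM + auxiliary grids (LST, NDVI, elevation, ...) to a finer SSM grid."""
    def __init__(self, in_ch=6, ch=32, scale=9):
        super().__init__()
        self.up = nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False)
        self.head = nn.Conv2d(in_ch, ch, 3, padding=1)
        self.blocks = nn.Sequential(ResidualBlock(ch), ResidualBlock(ch))
        self.out = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, x):                      # x: coarse grids (B, in_ch, H, W)
        x = self.up(x)                         # naive 9 km -> 1 km upsampling first
        return self.out(self.blocks(torch.relu(self.head(x))))

print(DownscaleCNN()(torch.rand(1, 6, 16, 16)).shape)  # torch.Size([1, 1, 144, 144])
```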
24 pages, 2138 KiB  
Article
A Multimodal Machine Learning Model in Pneumonia Patients Hospital Length of Stay Prediction
by Anna Annunziata, Salvatore Cappabianca, Salvatore Capuozzo, Nicola Coppola, Camilla Di Somma, Ludovico Docimo, Giuseppe Fiorentino, Michela Gravina, Lidia Marassi, Stefano Marrone, Domenico Parmeggiani, Giorgio Emanuele Polistina, Alfonso Reginelli, Caterina Sagnelli and Carlo Sansone
Big Data Cogn. Comput. 2024, 8(12), 178; https://doi.org/10.3390/bdcc8120178 - 3 Dec 2024
Viewed by 191
Abstract
Hospital overcrowding, driven by both structural management challenges and widespread medical emergencies, has prompted extensive research into machine learning (ML) solutions for predicting patient length of stay (LOS) to optimize bed allocation. While many existing models simplify the LOS prediction problem to a classification task, predicting broad ranges of hospital days, an exact day-based regression model is often crucial for precise planning. Additionally, available data are typically limited and heterogeneous, often collected from a small patient cohort. To address these challenges, we present a novel multimodal ML framework that combines imaging and clinical data to enhance LOS prediction accuracy. Specifically, our approach uses the following: (i) feature extraction from chest CT scans via a convolutional neural network (CNN), (ii) their integration with clinically relevant tabular data from patient exams, refined through a feature selection system to retain only significant predictors. As a case study, we applied this framework to pneumonia patient data collected during the COVID-19 pandemic at two hospitals in Naples, Italy—one specializing in infectious diseases and the other general-purpose. Under our experimental setup, the proposed system achieved an average prediction error of only three days, demonstrating its potential to improve patient flow management in critical care environments. Full article
Show Figures

Figure 1. (a) Complete dataset distribution on outcome output, where patients with value 0 have been dismissed, with value 1 are deceased without being in intensive care unit, with value 2 are deceased after being in intensive care unit. (b) Complete dataset distribution on LOS output.
Figure 2. Complete dataset distribution on range output.
Figure 3. (a) Vanvitelli dataset distribution on gender feature. (b) Cotugno dataset distribution on gender feature.
Figure 4. (a) Vanvitelli dataset distribution on age feature. (b) Cotugno dataset distribution on age feature.
Figure 5. (a) Vanvitelli dataset distribution on CT machine model feature. (b) Cotugno dataset distribution on CT machine model feature.
Figure 6. Vanvitelli dataset correlation matrix.
Figure 7. Cotugno dataset correlation matrix.
Figure 8. Complete dataset correlation matrix.
Figure 9. The original Vanvitelli and Cotugno datasets consist of patients' lung CT scans and raw clinical data. To extract relevant features, we process 3D CT volumes using a 3D-CNN to obtain three-dimensional tabular features. Additionally, the slice with the highest count of non-zero pixels is identified from each CT volume and processed via a 2D-CNN to extract two-dimensional tabular features. Concurrently, the raw clinical tabular data are pre-processed to generate clinically useful features. The final, consolidated dataset (referred to as the Complete dataset) is formed by concatenating patient data from both Vanvitelli and Cotugno sources across all feature sets. Features are color-coded by dataset origin: green for Vanvitelli, orange for Cotugno, and sky blue for the Complete dataset.
Figure 10. Models training workflow. The last block contains the list of all the state-of-the-art ML models, in particular classification ones for Outcome and Days Range prediction and regression ones for Length of Stay estimation. The prediction task can be Length of Stay estimation, Outcome, or Days Range prediction; the extracted features can be 3D, 2D, or Tabular, according to the extraction process shown in Figure 9; and the dataset source can be Vanvitelli, Cotugno, or Complete.
Figure 11. Architecture of the ResNet50 model with the extra fully connected layer at the end, marked in orange.
Figure 12. Distributions of a subset sample over days range and LOS before and after applying SMOTE on days range.
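As a minimal illustration of the early-fusion idea in Figure 9 (CNN-derived image features concatenated with clinical tabular features, then fed to a regressor that predicts length of stay in days), the sketch below uses random stand-in features and a scikit-learn gradient-boosting regressor; the real pipeline's feature extractors, feature selection, and model choices differ.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n_patients = 200

# Stand-ins: 128 CNN-derived image features per patient plus 12 clinical variables.
img_feats = rng.normal(size=(n_patients, 128))
clin_feats = rng.normal(size=(n_patients, 12))
los_days = rng.integers(1, 40, size=n_patients).astype(float)

X = np.hstack([img_feats, clin_feats])         # simple early fusion by concatenation
X_tr, X_te, y_tr, y_te = train_test_split(X, los_days, test_size=0.25, random_state=0)

reg = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print("MAE (days):", mean_absolute_error(y_te, reg.predict(X_te)))
```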
15 pages, 2366 KiB  
Article
Gas Leakage Detection Using Tiny Machine Learning
by Majda El Barkani, Nabil Benamar, Hanae Talei and Miloud Bagaa
Electronics 2024, 13(23), 4768; https://doi.org/10.3390/electronics13234768 - 2 Dec 2024
Viewed by 252
Abstract
Gas leakage detection is a critical concern in both industrial and residential settings, where real-time systems are essential for quickly identifying potential hazards and preventing dangerous incidents. Traditional detection systems often rely on centralized data processing, which can lead to delays and scalability issues. To overcome these limitations, in this study, we present a solution based on tiny machine learning (TinyML) to process data directly on devices. TinyML has the potential to execute machine learning algorithms locally, in real time, and using tiny devices, such as microcontrollers, ensuring faster and more efficient responses to potential dangers. Our approach combines an MLX90640 thermal camera with two optimized convolutional neural networks (CNNs), MobileNetV1 and EfficientNet-B0, deployed on the Arduino Nano 33 BLE Sense. The results show that our system not only provides real-time analytics but does so with high accuracy—88.92% for MobileNetV1 and 91.73% for EfficientNet-B0—while achieving inference times of 1414 milliseconds and using just 124.8 KB of memory. Compared to existing solutions, our edge-based system overcomes common challenges related to latency and scalability, making it a reliable, fast, and efficient option. This work demonstrates the potential for low-cost, scalable gas detection systems that can be deployed widely to enhance safety in various environments. By integrating cutting-edge machine learning models with affordable IoT devices, we aim to make safety more accessible, regardless of financial limitations, and pave the way for further innovation in environmental monitoring solutions. Full article
Show Figures

Figure 1. Experimental setup for data collection (reprinted, with permission, from [20] © 2022 MDPI).
Figure 2. The four dataset categories: (a) Mixture; (b) No Gas; (c) Perfume; (d) Smoke.
Figure 3. Confusion matrix for the MobileNetV1 model.
Figure 4. Confusion matrix for the BO configuration.
23 pages, 3424 KiB  
Article
Automated Detection of Gastrointestinal Diseases Using Resnet50*-Based Explainable Deep Feature Engineering Model with Endoscopy Images
by Veysel Yusuf Cambay, Prabal Datta Barua, Abdul Hafeez Baig, Sengul Dogan, Mehmet Baygin, Turker Tuncer and U. R. Acharya
Sensors 2024, 24(23), 7710; https://doi.org/10.3390/s24237710 (registering DOI) - 2 Dec 2024
Viewed by 200
Abstract
This work aims to develop a novel convolutional neural network (CNN) named ResNet50* to detect various gastrointestinal diseases using a new ResNet50*-based deep feature engineering model with endoscopy images. The novelty of this work is the development of ResNet50*, a new variant of the ResNet model, featuring convolution-based residual blocks and a pooling-based attention mechanism similar to PoolFormer. Using ResNet50*, a gastrointestinal image dataset was trained, and an explainable deep feature engineering (DFE) model was developed. This DFE model comprises four primary stages: (i) feature extraction, (ii) iterative feature selection, (iii) classification using shallow classifiers, and (iv) information fusion. The DFE model is self-organizing, producing 14 different outcomes (8 classifier-specific and 6 voted) and selecting the most effective result as the final decision. During feature extraction, heatmaps are identified using gradient-weighted class activation mapping (Grad-CAM) with features derived from these regions via the final global average pooling layer of the pretrained ResNet50*. Four iterative feature selectors are employed in the feature selection stage to obtain distinct feature vectors. The classifiers k-nearest neighbors (kNN) and support vector machine (SVM) are used to produce specific outcomes. Iterative majority voting is employed in the final stage to obtain voted outcomes using the top result determined by the greedy algorithm based on classification accuracy. The presented ResNet50* was trained on an augmented version of the Kvasir dataset, and its performance was tested using Kvasir, Kvasir version 2, and wireless capsule endoscopy (WCE) curated colon disease image datasets. Our proposed ResNet50* model demonstrated a classification accuracy of more than 92% for all three datasets and a remarkable 99.13% accuracy for the WCE dataset. These findings affirm the superior classification ability of the ResNet50* model and confirm the generalizability of the developed architecture, showing consistent performance across all three distinct datasets. Full article
(This article belongs to the Special Issue AI-Based Automated Recognition and Detection in Healthcare)
Show Figures

Figure 1. Sample images used in this work. (a) Dyed lifted polyps. (b) Dyed resection margins. (c) Esophagitis. (d) Normal cecum. (e) Normal pylorus. (f) Normal z-line. (g) Polyps. (h) Ulcerative colitis.
Figure 2. Block designs for ResNet and ResNet*. F: number of filters.
Figure 3. Graphical demonstration of the proposed ResNet50*. F: number of filters, BN: batch normalization, ReLU: rectified linear unit, Avg. Pool: average pooling, Max Pool: maximum pooling, GAP: global average pooling, FC: fully connected.
Figure 4. Graphical overview of the proposed ResNet50*-based DFE model. Here, f: selected feature vector, c: classifier-based outcome, v: voted outcome.
Figure 5. Graph of training and validation accuracies/losses versus number of epochs.
Figure 6. Confusion matrices obtained for the proposed ResNet50*-based DFE model. Cells with a blue background represent the correctly predicted observations (true positives) for each class. Cells with a white background represent zeros, indicating no predictions for those combinations. Cells with a beige background represent the number of falsely predicted observations (misclassifications) between classes.
Figure 7. Sample images and their corresponding heatmaps. The colors in the heatmap typically represent varying levels of intensity or importance: blue tones indicate areas that are not important, yellow tones represent moderately important areas, and red tones highlight regions that are highly important for feature extraction.
Figure 8. Validation accuracies (%) obtained for the ResNet50 and ResNet50* models.
Figure 9. Classification accuracies (%) obtained using various feature selectors and classifiers. The red lines represent the median of the accuracies.
Figure 10. Number of times classifiers and feature selectors were used to obtain the best outcomes.
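A hedged scikit-learn sketch of the shallow-classifier stage of a DFE-style pipeline: deep features (random stand-ins here) are reduced by a feature selector and classified with kNN and SVM, whose predictions are combined by voting. It illustrates the general pattern only; the paper's iterative selectors, Grad-CAM-guided feature extraction, and greedy selection of the best outcome are not reproduced.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2048))        # stand-in for pooled deep features
y = rng.integers(0, 8, size=400)        # 8 endoscopy classes

# Select a subset of features, then vote over two shallow classifiers.
pipe = make_pipeline(
    SelectKBest(f_classif, k=256),
    VotingClassifier([("knn", KNeighborsClassifier(n_neighbors=5)),
                      ("svm", SVC(kernel="rbf", probability=True))], voting="soft"),
)
print(cross_val_score(pipe, X, y, cv=3).mean())
```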
22 pages, 29223 KiB  
Article
Risk Assessment of Bridge Damage Due to Heavy Rainfall Considering Landslide Risk and Driftwood Generation Potential Using Convolutional Neural Networks and Conventional Machine Learning
by Fudong Ren, Koichi Isobe and Miku Ando
Water 2024, 16(23), 3471; https://doi.org/10.3390/w16233471 - 2 Dec 2024
Viewed by 316
Abstract
This study addresses the assessment of bridge damage risks associated with heavy rainfall, focusing on landslide susceptibility and driftwood generation potential. By integrating convolutional neural networks (CNNs) with traditional machine learning methods, the research develops an advanced predictive framework for estimating driftwood accumulation at river bridges—a recognized challenge in disaster management. Concentrating on the Tokachi River basin in Hokkaido, Japan, the research utilizes diverse environmental and geographical data from authoritative sources. The findings demonstrate that the innovative approach not only enhances the accuracy of driftwood volume predictions but also distinguishes the effectiveness of CNNs compared to conventional methods. Crucially, areas prone to landslides are identified as significant contributors to driftwood generation, impacting bridge safety. The study underscores the potential of machine learning models in improving disaster risk assessment, while suggesting further exploration into real-time data integration and model refinement to adapt to changing climate conditions and ensure long-term infrastructure safety. Full article
Show Figures

Figure 1. Bridge damage caused by driftwood, as reported in survey findings from the Civil Engineering Bureau, Road Division, Hokkaido Prefectural Government. (a) River channel blockage due to driftwood and embankment erosion behind the abutment. (b) Subsidence of bridge piers due to riverbed scouring and channel blockage from driftwood.
Figure 2. (a) Location of study area, (b) Distribution of major rivers, (c) Distributions of historical landslides, (d) Distribution of bridges in the study area.
Figure 3. Maps of landslide features in the study area. (a) Elevation (m), (b) Slope (degrees), (c) Aspect, (d) Curvature, (e) Average rainfall (mm), (f) Distance to rivers (m), (g) Distance to roads (m), (h) Distance to faults (m).
Figure 4. Flowchart of landslide risk assessment.
Figure 5. Flowchart for assessing the risk of river-crossing bridges during heavy rain.
Figure 6. Distribution map of (a) vegetation reclassification and (b) landslide susceptibility area with slopes exceeding 30 degrees.
Figure 7. Method for calculating the driftwood capture amount.
Figure 8. Method for calculating the travel distance of sediment from landslide collapses [4,20].
Figure 9. Diagram of CNN model structure.
Figure 10. Diagram of cross-validation.
Figure 11. Information gain ratio of the features used for the landslide risk assessment.
Figure 12. Information gain ratio of the features used for the bridge disaster risk assessment.
Figure 13. Landslide risk prediction AUC for different models.
Figure 14. Landslide risk assessment using CNN and MLP.
Figure 15. Calculation results of driftwood generation potential (DGP) and driftwood capture amount (DCA) based on CNN.
Figure 16. Comparison of AUC for bridge disaster risk prediction in training groups using different models based on various bridge structures. (a) Span = 1, (b) Span ≥ 2.
Figure 17. Prediction results for bridge disaster risk. (a) Bridge disaster prediction results, (b) Distribution map of high disaster risk bridges.
Figure A1. Maps showing parts of landslide features: (a) Land use, (b) Soil, (c) Lithology, and (d) Vegetation.
16 pages, 2225 KiB  
Article
Multimodal MRI Deep Learning for Predicting Central Lymph Node Metastasis in Papillary Thyroid Cancer
by Xiuyu Wang, Heng Zhang, Hang Fan, Xifeng Yang, Jiansong Fan, Puyeh Wu, Yicheng Ni and Shudong Hu
Cancers 2024, 16(23), 4042; https://doi.org/10.3390/cancers16234042 - 2 Dec 2024
Viewed by 183
Abstract
Background: Central lymph node metastasis (CLNM) in papillary thyroid cancer (PTC) significantly influences surgical decision-making strategies. Objectives: This study aims to develop a predictive model for CLNM in PTC patients using magnetic resonance imaging (MRI) and clinicopathological data. Methods: By incorporating deep learning (DL) algorithms, the model seeks to address the challenges in diagnosing CLNM and reduce overtreatment. The results were compared with traditional machine learning (ML) models. In this retrospective study, preoperative MRI data from 105 PTC patients were divided into training and testing sets. A radiologist manually outlined the region of interest (ROI) on MRI images. Three classic ML algorithms (support vector machine [SVM], logistic regression [LR], and random forest [RF]) were employed across different data modalities. Additionally, an AMMCNet utilizing convolutional neural networks (CNNs) was proposed to develop DL models for CLNM. Predictive performance was evaluated using receiver operator characteristic (ROC) curve analysis, and clinical utility was assessed through decision curve analysis (DCA). Results: Lesion diameter was identified as an independent risk factor for CLNM. Among ML models, the RF-(T1WI + T2WI, T1WI + T2WI + Clinical) models achieved the highest area under the curve (AUC) at 0.863. The DL fusion model surpassed all ML fusion models with an AUC of 0.891. Conclusions: A fusion model based on the AMMCNet architecture using MRI images and clinicopathological data was developed, effectively predicting CLNM in PTC patients. Full article
(This article belongs to the Special Issue Recent Advances in Oncology Imaging: 2nd Edition)
Show Figures

Figure 1. Inclusion and exclusion flowchart.
Figure 2. Segmentation of the ROI in axial T1 and T2 images. The red arrows in the MRI images indicate the location of the primary lesion.
Figure 3. Distribution of LASSO coefficients for T1 features, T2 features, and combined T1 + T2 features. (a,b) represent T1 features, (c,d) represent T2 features, and (e,f) represent the combined T1 + T2 features.
Figure 4. Deep learning model workflow.
Figure 5. Histograms of best feature coefficients for T1, T2, and T1 + T2. (a) The best feature coefficients for T1; (b) the best feature coefficients for T2; (c) the best feature coefficients for T1 + T2.
Figure 6. The ROC curves of ML and DL models on the training set and test set. (a,b) The ROC curves of the SVM models, (c,d) the ROC curves of the LR models, (e,f) the ROC curves of the RF models, (g,h) the ROC curves of the DL models, each on the training set and test set, respectively.
Figure 7. ML and DL models' DCA curves on the test set. (a) The DCA curves of the SVM models, (b) the DCA curves of the LR models, (c) the DCA curves of the RF models, and (d) the DCA curves of the DL models.
16 pages, 952 KiB  
Article
CGFTNet:Content-Guided Frequency Domain Transform Network for Face Super-Resolution
by Yeerlan Yekeben, Shuli Cheng and Anyu Du
Information 2024, 15(12), 765; https://doi.org/10.3390/info15120765 (registering DOI) - 2 Dec 2024
Viewed by 228
Abstract
Recent advancements in face super-resolution (FSR) have been propelled by deep learning techniques based on convolutional neural networks (CNNs). However, existing methods still struggle to capture global facial structure information effectively, leading to reduced fidelity in reconstructed images, and they often require additional manual data annotation. To overcome these challenges, we introduce a content-guided frequency domain transform network (CGFTNet) for face super-resolution tasks. The network features a channel-attention-linked encoder-decoder architecture with two key components: the Frequency Domain and Reparameterized Focus Convolution Feature Enhancement module (FDRFEM) and the Content-Guided Channel Attention Fusion (CGCAF) module. FDRFEM enhances feature representation through transform-domain techniques and reparameterized focus convolution (RefConv), capturing detailed facial features and improving image quality. CGCAF dynamically adjusts feature fusion based on image content, enhancing detail restoration. Extensive evaluations across multiple datasets demonstrate that the proposed CGFTNet consistently outperforms other state-of-the-art methods. Full article
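The CGCAF module is described as channel-attention-driven fusion. Its exact design is not given in this listing, so the following is only a minimal sketch of a generic squeeze-and-excitation-style channel attention block in PyTorch; the class name, reduction ratio, and layer choices are illustrative assumptions, not the authors' CA module.

# Minimal sketch of a generic channel attention block (squeeze-and-excitation
# style), assuming PyTorch. Illustrative only; not the CA module used in CGFTNet.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global average pooling
        self.fc = nn.Sequential(                 # excitation: per-channel weights in (0, 1)
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(self.pool(x))                # (B, C, 1, 1) attention weights
        return x * w                             # reweight feature channels

feats = torch.randn(1, 64, 32, 32)               # placeholder feature map
print(ChannelAttention(64)(feats).shape)         # torch.Size([1, 64, 32, 32])

A content-guided fusion would additionally condition these weights on the image content before reweighting, and a frequency-domain branch such as FDRFEM could operate on a transform of the feature map (e.g., torch.fft.fft2); both are beyond this sketch.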
Show Figures

Figure 1. Overview of the proposed CGFTNet. FDRFEM and CGCAF are the two core modules proposed in this work.
Figure 2. Architecture of the proposed CGCAF.
Figure 3. Architecture of the proposed channel attention (CA), a component of CGCAF.
Figure 4. Architecture of the proposed FDRFEM.
Figure 5. Visual comparisons of multiple methods for ×8 super-resolution on the CelebA test set.
Figure 6. Visual comparisons of multiple methods for ×8 super-resolution on the Helen test set.
Figure 7. Visual comparisons of multiple methods for ×8 super-resolution on real-world images.
Figure 8. Model complexity study for ×8 SR on the CelebA test set. CGFTNet achieves a better balance between model size, performance, and execution time.
29 pages, 1618 KiB  
Article
Optimization of Deep Neural Networks Using a Micro Genetic Algorithm
by Ricardo Landa, David Tovias-Alanis and Gregorio Toscano
AI 2024, 5(4), 2651-2679; https://doi.org/10.3390/ai5040127 (registering DOI) - 2 Dec 2024
Viewed by 213
Abstract
This work proposes the use of a micro genetic algorithm to optimize the architecture of the fully connected layers in convolutional neural networks, with the aim of reducing model complexity without sacrificing performance. Our approach applies the transfer learning paradigm, enabling training without the need for extensive datasets. A micro genetic algorithm requires fewer computational resources due to its reduced population size, while still preserving much of the search capability of algorithms with larger populations. By exploring different representations and objective functions, including classification accuracy, hidden neuron ratio, and minimum redundancy and maximum relevance for feature selection, eight algorithmic variants were developed, six of which perform both hidden-layer reduction and feature selection. Experimental results indicate that the proposed algorithm effectively reduces the architecture of the fully connected layers of the convolutional neural network. The variant achieving the greatest reduction used only 44% of the convolutional features in the input layer and only 9.7% of the neurons in the hidden layers, without negatively impacting classification accuracy (statistically confirmed) compared to a network based on the full reference architecture and a representative method from the literature. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
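As a concrete illustration of the micro genetic algorithm described above, the toy sketch below evolves a tiny population with elitism and restarts it around the best individual whenever the population converges, decoding each chromosome into two hidden-layer sizes and a learning rate in the spirit of the μGA-1 representation (Figure 4 below). The block widths, fitness function, and GA parameters here are invented stand-ins, not the paper's CNN-based evaluation.

# Toy micro-GA sketch: population of 5, tournament selection, uniform crossover,
# elitist restart on convergence. Chromosome layout assumed as 9 + 9 + 17 bits for
# (L1, L2, learning rate); the fitness function is a stand-in, not the paper's
# CNN accuracy objective.
import random

L_BITS, LR_BITS = 9, 17
CHROM_LEN = 2 * L_BITS + LR_BITS

def decode(bits):
    l1 = int("".join(map(str, bits[:L_BITS])), 2) + 1
    l2 = int("".join(map(str, bits[L_BITS:2 * L_BITS])), 2) + 1
    lr = int("".join(map(str, bits[2 * L_BITS:])), 2) / (2 ** LR_BITS - 1) * 0.1
    return l1, l2, lr

def fitness(bits):
    l1, l2, lr = decode(bits)
    # Stand-in objective: favour small hidden layers and a moderate learning rate.
    return -(l1 + l2) - abs(lr - 0.01) * 1000

def tournament(pop):
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def micro_ga(generations=200, pop_size=5):
    pop = [[random.randint(0, 1) for _ in range(CHROM_LEN)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best]                              # elitism: carry the best forward
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            children.append([random.choice(g) for g in zip(p1, p2)])  # uniform crossover
        pop = children
        best = max(pop, key=fitness)
        if all(ind == best for ind in pop):            # nominal convergence: restart
            pop = [best] + [[random.randint(0, 1) for _ in range(CHROM_LEN)]
                            for _ in range(pop_size - 1)]
    return decode(best), fitness(best)

print(micro_ga())

Micro GAs typically omit mutation and rely on this restart mechanism to reintroduce diversity, which is what keeps the population size (and hence the number of objective evaluations) small.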
Show Figures

Figure 1. General scheme of the proposed method. Top: CNN model trained on a source domain. Center: transfer learning of the pre-trained model parameters and tuning of the model weights of a DNN (FC layers) using the μGA-DNN algorithm. Bottom: schematic illustrating the operation of the proposed method.
Figure 2. Example of an FC layer architecture. The input layer corresponds to the d features automatically extracted by the convolutional layers of the CNN model. There are m hidden layers, where L1, L2, …, Lm denote the number of neurons in each. The output layer provides a response (prediction) zi, with i = 1, …, c, for each of the c classes of the input dataset.
Figure 3. Example of an FC layer architecture. The input layer is related to the features of the dataset, assuming a dimensionality of d = 2048; there are m = 2 hidden layers, each with 512 neurons, and the output layer has c neurons, where c is the number of classes of the problem.
Figure 4. Example of a chromosome using the binary representation μGA-1, composed of three binary blocks representing the first hidden layer (L1), the second hidden layer (L2), and the learning rate (lr).
Figure 5. Example of a chromosome using the binary representation μGA-FS, composed of 2051 binary blocks: the first two blocks of 9 bits each, the third of 17 bits, and the last 2048 blocks of 1 bit each.
Figure 6. Example of a chromosome using the binary representation μGA-MRMR, composed of four binary blocks.
Figure 7. Boxplots of the ACC indicator obtained by each method, with the mean ACC and the p-value of the Wilcoxon rank sum test shown at the top of each plot; best values in bold (a minimal sketch of this test follows the figure list).
Figure 8. Mean FSR obtained by each method, with the p-value of the Wilcoxon rank sum test comparing each method against F^R (the best variant of the proposed method for this indicator); p < 0.05 in bold.
Figure 9. Mean HNR obtained by each method, with the p-value of the Wilcoxon rank sum test comparing each method against F^R (the best variant of the proposed method for this indicator); p < 0.05 in bold.
Figure 10. Average MC obtained by each method; best values in bold.
Figure 11. Average number of objective-function evaluations for each method; best values in bold.
Figure 12. Average runtime (in seconds) of each method.
Figure 13. Comparison of classification accuracy for the proposed models and the complete reference model.
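Figures 7–9 above report p-values from a Wilcoxon rank sum test comparing each method against the F^R variant. A minimal sketch of that kind of comparison with SciPy follows; the two accuracy samples are invented placeholders, not the paper's results.

# Wilcoxon rank sum test between per-run accuracies of two methods
# (illustrative placeholder data, not the paper's results).
from scipy.stats import ranksums

acc_variant_fr = [0.91, 0.93, 0.92, 0.90, 0.94, 0.92, 0.91, 0.93, 0.92, 0.90]
acc_reference  = [0.92, 0.91, 0.93, 0.92, 0.90, 0.93, 0.92, 0.91, 0.92, 0.93]

stat, p_value = ranksums(acc_variant_fr, acc_reference)
print(f"Wilcoxon rank sum p-value: {p_value:.3f}")   # p < 0.05 would indicate a significant difference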