Search Results (2,960)

Search Parameters:
Keywords = GANs

16 pages, 30693 KiB  
Article
LM-CycleGAN: Improving Underwater Image Quality Through Learned Perceptual Image Patch Similarity and Multi-Scale Adaptive Fusion Attention
by Jiangyan Wu, Guanghui Zhang and Yugang Fan
Sensors 2024, 24(23), 7425; https://doi.org/10.3390/s24237425 - 21 Nov 2024
Abstract
The underwater imaging process is often hindered by high noise levels, blurring, and color distortion due to light scattering, absorption, and suspended particles in the water. To address the challenges of image enhancement in complex underwater environments, this paper proposes an underwater image color correction and detail enhancement model based on an improved Cycle-consistent Generative Adversarial Network (CycleGAN), named LPIPS-MAFA CycleGAN (LM-CycleGAN). The model integrates a Multi-scale Adaptive Fusion Attention (MAFA) mechanism into the generator architecture to enhance its ability to perceive image details. At the same time, the Learned Perceptual Image Patch Similarity (LPIPS) is introduced into the loss function to make the training process more focused on the structural information of the image. Experiments conducted on the public datasets UIEB and EUVP demonstrate that LM-CycleGAN achieves significant improvements in Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), Average Gradient (AG), Underwater Color Image Quality Evaluation (UCIQE), and Underwater Image Quality Measure (UIQM). Moreover, the model excels in color correction and fidelity, successfully avoiding issues such as red checkerboard artifacts and blurred edge details commonly observed in reconstructed images generated by traditional CycleGAN approaches.
(This article belongs to the Collection Computational Imaging and Sensing)
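The abstract above describes folding an LPIPS perceptual term into the CycleGAN training objective. As a hedged illustration only (not the authors' implementation), the open-source lpips PyTorch package can be combined with the usual L1 cycle-consistency term roughly as follows; the backbone choice and loss weights are assumptions.

```python
# Hedged sketch: adding an LPIPS perceptual term to a CycleGAN-style
# cycle-consistency loss. Not the paper's code; weights and backbone are illustrative.
import torch
import lpips  # pip install lpips

lpips_fn = lpips.LPIPS(net='alex')   # learned perceptual metric (AlexNet backbone assumed)
l1 = torch.nn.L1Loss()

def cycle_loss(real_x, reconstructed_x, lambda_l1=10.0, lambda_lpips=1.0):
    """Combine pixel-wise cycle consistency with a perceptual (LPIPS) term."""
    pixel_term = l1(reconstructed_x, real_x)
    # LPIPS expects images scaled to [-1, 1] and returns a per-image distance.
    perceptual_term = lpips_fn(reconstructed_x, real_x).mean()
    return lambda_l1 * pixel_term + lambda_lpips * perceptual_term
```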
Figures:
Figure 1. Network structure of LM-CycleGAN. X denotes the underwater degraded image domain and Y denotes the underwater high-quality image domain.
Figure 2. The network structure of the LM-CycleGAN generator, where the MLP is the Multi-Layer Perceptron, "n*nConv" denotes an operation that involves processing with a single convolutional kernel, and "⊕" denotes element-wise addition.
Figure 3. Network structure of Multi-scale Adaptive Fusion Attention (MAFA), where the dilation rates are set to [1,2,3,4], and the number of 'heads' is set to 8.
Figure 4. Network structure of LM-CycleGAN discriminator. Assuming the input is an RGB image with dimensions 3 × 256 × 256 pixels, the final output will be a tensor of size 1 × 30 × 30.
Figure 5. Original image and its edge image and generated image and its edge image: (a) Underwater degraded image; (b) Edge image corresponding to the underwater degraded image; (c) Generated underwater high-quality image; (d) Edge image corresponding to the underwater high-quality image.
Figure 6. The network structure of Learned Perceptual Image Patch Similarity (LPIPS); ωi denotes the specific weight layer corresponding to the ith output layer.
Figure 7. Sample images from the UIEB, EUVP, and RUIE datasets: (a,b) underwater degraded images and their corresponding reference images from the UIEB dataset; (c,d) underwater degraded images and their corresponding reference images from the EUVP dataset; (e) underwater degraded images from the RUIE dataset.
Figure 8. Visual comparison of image enhancement algorithms on the UIEB dataset.
Figure 9. Visual comparison of image enhancement algorithms on the EUVP dataset.
Figure 10. Visual comparison of image enhancement algorithms on the RUIE dataset.
Figure 11. Comparison of enhancement effects from different strategies on the UIEB dataset, where the red box highlights local enhancement areas.
32 pages, 7730 KiB  
Article
High-Fidelity Infrared Remote Sensing Image Generation Method Coupled with the Global Radiation Scattering Mechanism and Pix2PixGAN
by Yue Li, Xiaorui Wang, Chao Zhang, Zhonggen Zhang and Fafa Ren
Remote Sens. 2024, 16(23), 4350; https://doi.org/10.3390/rs16234350 - 21 Nov 2024
Abstract
To overcome the problems in existing infrared remote sensing image generation methods, which make it difficult to combine high fidelity and high efficiency, we propose a High-Fidelity Infrared Remote Sensing Image Generation Method Coupled with the Global Radiation Scattering Mechanism and Pix2PixGAN (HFIRSIGM_GRSMP) in this paper. Firstly, based on the global radiation scattering mechanism, the HFIRSIGM_GRSMP model is constructed to address the problem of accurately characterizing factors that affect fidelity—such as the random distribution of the radiation field, multipath scattering, and nonlinear changes—through the innovative fusion of physical models and deep learning. This model accurately characterizes the complex radiation field distribution and the image detail-feature mapping relationship from visible-to-infrared remote sensing. Then, 8000 pairs of image datasets were constructed based on Landsat 8 and Sentinel-2 satellite data. Finally, the experiment demonstrates that the average SSIM of images generated using HFIRSIGM_GRSMP reaches 89.16%, and all evaluation metrics show significant improvement compared to the contrast models. More importantly, this method demonstrates high accuracy and strong adaptability in generating short-wave, mid-wave, and long-wave infrared remote sensing images. This method provides a more comprehensive solution for generating high-fidelity infrared remote sensing images.
(This article belongs to the Special Issue Deep Learning Innovations in Remote Sensing)
23 pages, 39189 KiB  
Article
An Assessment of the Map-Style Influence on Generalization with CycleGAN: Taking Line Features as an Example
by Heng Yu, Haoxuan Chen and Ling Zhang
ISPRS Int. J. Geo-Inf. 2024, 13(12), 418; https://doi.org/10.3390/ijgi13120418 - 21 Nov 2024
Abstract
As the complexity of GIS data continues to increase, there is a growing demand for automated map generalization. As end-to-end generative models, GAN models offer new solutions for automated map generalization. This study explores the impact of different map symbolization configurations on generative models, specifically using CycleGAN for line feature generalization. The quality of the generated results was assessed by constructing various symbolization datasets (line width, type, and color) and evaluating CycleGAN’s performance using metrics such as the MSE, SSIM, and PSNR. The results indicate that moderate line widths (0.5–1) yield better detail preservation, and different line types (framed lines and dashed lines) can highlight feature boundaries and enhance visual perception. By contrast, high-contrast color schemes enhance feature differentiation but increase pixel-level errors. This study concludes that generative models can maintain the geometric structure and spatial distribution of line features, but it is crucial to choose more suitable line features for different scenarios to meet detail requirements, ensuring high-quality outputs under diverse configurations.
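The study scores CycleGAN outputs with MSE, SSIM, and PSNR. A minimal sketch of computing these three metrics for a generated/target tile pair with scikit-image is shown below; the uint8 image assumption and the pairing logic are illustrative rather than the authors' evaluation pipeline.

```python
# Hedged sketch: pixel-level evaluation of a generated map tile against its
# target with the metrics named in the abstract (MSE, SSIM, PSNR).
import numpy as np
from skimage.metrics import mean_squared_error, structural_similarity, peak_signal_noise_ratio

def evaluate_pair(generated: np.ndarray, target: np.ndarray) -> dict:
    """Both arrays are HxWx3 uint8 images of the same size (assumed)."""
    return {
        "mse": mean_squared_error(target, generated),
        "ssim": structural_similarity(target, generated, channel_axis=-1, data_range=255),
        "psnr": peak_signal_noise_ratio(target, generated, data_range=255),
    }
```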
Figures:
Figure 1. The research framework of symbolization impact on generative linear feature generalization.
Figure 2. The model architecture of CycleGAN.
Figure 3. Example of multi-width linear feature dataset.
Figure 4. Example of multi-type line feature dataset.
Figure 5. Example of multi-color line feature dataset.
Figure 6. Study area and data samples.
Figure 7. Generation results of multi-width linear features.
Figure 8. Evaluation metrics for the generative results of multi-width features.
Figure 9. Issues in the generative results when the line width is thicker.
Figure 10. Evaluation metrics for the generative results of multi-type features.
Figure 11. Example of generative results for dashed lines.
Figure 12. Disadvantages of the generative results for dashed lines.
Figure 13. Example of generative results for framed lines.
Figure 14. Disadvantages of the generative results for framed lines.
Figure 15. Example of generative results for striped lines.
Figure 16. Disadvantages of the generative results for striped lines.
Figure 17. Evaluation metrics for the generative results of multi-color features.
Figure 18. Examples and disadvantages of generative results for grayscale features.
Figure 19. Examples and disadvantages of generative results for high-contrast features.
Figure 20. Consistency of generative results for linear features at appropriate widths.
Figure 21. The highlighting effect of framed lines and dashed lines on features.
Figure 22. The distinguishing effect of high-contrast colors on multiple types of features.
Figure 23. Example of mixed symbolized line feature dataset.
Figure 24. Evaluation metrics for the generative results of mixed symbolized features.
Figure 25. Issues in generated images with mixed symbolization configurations.
Figure 26. Evaluation metrics for the GCGAN results of mixed symbolized features.
Figure 27. Comparison of results between CycleGAN and GCGAN.
30 pages, 2346 KiB  
Article
A Novel Method for 3D Lung Tumor Reconstruction Using Generative Models
by Hamidreza Najafi, Kimia Savoji, Marzieh Mirzaeibonehkhater, Seyed Vahid Moravvej, Roohallah Alizadehsani and Siamak Pedrammehr
Diagnostics 2024, 14(22), 2604; https://doi.org/10.3390/diagnostics14222604 - 20 Nov 2024
Viewed by 382
Abstract
Background: Lung cancer remains a significant health concern, and the effectiveness of early detection significantly enhances patient survival rates. Identifying lung tumors with high precision is a challenge due to the complex nature of tumor structures and the surrounding lung tissues. Methods: To address these hurdles, this paper presents an innovative three-step approach that leverages Generative Adversarial Networks (GAN), Long Short-Term Memory (LSTM), and VGG16 algorithms for the accurate reconstruction of three-dimensional (3D) lung tumor images. The first challenge we address is the accurate segmentation of lung tissues from CT images, a task complicated by the overwhelming presence of non-lung pixels, which can lead to classifier imbalance. Our solution employs a GAN model trained with a reinforcement learning (RL)-based algorithm to mitigate this imbalance and enhance segmentation accuracy. The second challenge involves precisely detecting tumors within the segmented lung regions. We introduce a second GAN model with a novel loss function that significantly improves tumor detection accuracy. Following successful segmentation and tumor detection, the VGG16 algorithm is utilized for feature extraction, preparing the data for the final 3D reconstruction. These features are then processed through an LSTM network and converted into a format suitable for the reconstructive GAN. This GAN, equipped with dilated convolution layers in its discriminator, captures extensive contextual information, enabling the accurate reconstruction of the tumor’s 3D structure. Results: The effectiveness of our method is demonstrated through rigorous evaluation against established techniques using the LIDC-IDRI dataset and standard performance metrics, showcasing its superior performance and potential for enhancing early lung cancer detection. Conclusions: This study highlights the benefits of combining GANs, LSTM, and VGG16 into a unified framework. This approach significantly improves the accuracy of detecting and reconstructing lung tumors, promising to enhance diagnostic methods and patient results in lung cancer treatment.
(This article belongs to the Special Issue AI and Digital Health for Disease Diagnosis and Monitoring)
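In the pipeline above, VGG16 supplies features that are passed to an LSTM before the reconstructive GAN. A rough sketch of such a feature-extraction stage with a pretrained torchvision VGG16 is given below; the layer cut-off, input size, and preprocessing are assumptions rather than the paper's exact configuration, and a recent torchvision is assumed.

```python
# Hedged sketch: pretrained VGG16 as a feature extractor feeding a downstream
# LSTM stage. Layer choice and input size are illustrative, not the paper's setup.
import torch
import torchvision.models as models

vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
feature_extractor = vgg16.features.eval()  # convolutional backbone only

@torch.no_grad()
def extract_features(ct_slice_batch: torch.Tensor) -> torch.Tensor:
    """ct_slice_batch: (N, 3, 224, 224) tensor of preprocessed CT slices (assumed)."""
    fmap = feature_extractor(ct_slice_batch)      # (N, 512, 7, 7)
    return fmap.flatten(start_dim=1)              # (N, 25088) feature vectors for the LSTM
```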
Figures:
Figure 1. Overview of the proposed model: In step 1, two lungs are segmented in CT images using the GAN-based model. In step 2, the tumor is detected using the second GAN-based model. After the features are extracted by VGG16, a 3D model of the tumor is reconstructed using the third GAN in step 3.
Figure 2. Architecture of the U-Net-based generator network used for lung segmentation, illustrating the flow from input CT scan through the encoder and decoder stages to the final mask output.
Figure 3. Comparative visualization of original CT scans and segmented lung regions by the proposed model.
Figure 4. (a) Optimization of the λ parameter for lung segmentation model performance; (b) learning trajectory of the reward optimization of the agent over 100 episodes.
Figure 5. Comparative analysis of tumor detection. The top row shows the ground truth, while the bottom row presents the model predictions. The black square in every sample shows the tumor extracted by the proposed model.
Figure 6. Comparison of original and reconstructed 3D tumor shapes highlighting the effectiveness of the proposed reconstruction method in preserving boundary smoothness.
Figure 7. Distribution of decision-making times for 3D reconstruction in RTB environments.
Figure 8. Loss trends in (a) lung segmentation, (b) tumor detection, and (c) 3D reconstruction models over 250 epochs.
Figure 9. HD metric trends in lung segmentation, tumor detection, and 3D reconstruction over 250 epochs.
18 pages, 2568 KiB  
Article
ATGT3D: Animatable Texture Generation and Tracking for 3D Avatars
by Fei Chen and Jaeho Choi
Electronics 2024, 13(22), 4562; https://doi.org/10.3390/electronics13224562 - 20 Nov 2024
Viewed by 183
Abstract
We propose ATGT3D, an Animatable Texture Generation and Tracking framework for 3D Avatars, featuring the innovative design of the Eye Diffusion Module (EDM) and Pose Tracking Diffusion Module (PTDM), which are dedicated to high-quality eye texture generation and synchronized tracking of dynamic poses and textures, respectively. Compared to traditional GAN and VAE methods, ATGT3D significantly enhances texture consistency and generation quality in animated scenes using the EDM, which produces high-quality full-body textures with detailed eye information using the HUMBI dataset. Additionally, the Pose Tracking Diffusion Module (PTDM) monitors human motion parameters utilizing the BEAT2 and AMASS mesh-level animatable human model datasets. The EDM, in conjunction with a basic texture seed featuring eyes and the diffusion model, restores high-quality textures, whereas the PTDM, by integrating MoSh++ and SMPL-X body parameters, models hand and body movements from 2D human images, thus providing superior 3D motion capture datasets. This module maintains the synchronization of textures and movements over time to ensure precise animation texture tracking. During training, the ATGT3D model uses the diffusion model as the generative backbone to produce new samples. The EDM improves the texture generation process by enhancing the precision of eye details in texture images. The PTDM involves joint training for pose generation and animation tracking reconstruction. Textures and body movements are generated individually using encoded prompts derived from masked gestures. Furthermore, ATGT3D adaptively integrates texture and animation features using the diffusion model to enhance both fidelity and diversity. Experimental results show that ATGT3D achieves optimal texture generation performance and can flexibly integrate predefined spatiotemporal animation inputs to create comprehensive human animation models. Our experiments yielded unexpectedly positive outcomes.
(This article belongs to the Special Issue AI for Human Collaboration)
Figures:
Figure 1. This framework facilitates the generation and tracking of animated textures for 3D virtual images.
Figure 2. The AMASS dataset is sorted based on the attributes with the most actions (motions) and the least time (minutes). The light blue bars represent subsets of the dataset not utilized in the study; the dark blue bars represent the remaining subsets selected for evaluation and experimentation.
Figure 3. Overview of the proposed method for texture recovery estimation from a single image.
Figure 4. Complete texture tracking in our method to match 3D human models.
Figure 5. Texture generation as well as tracking and matching graphs of textures with modeled action poses.
Figure 6. Part (a) depicts the texture map processed using the image diffusion model to recover high-quality texture. Part (b) shows the untrained texture map. Clearly, the clarity of the eye part is superior in image (a) compared to image (b).
Figure 7. Examples of multiple actions across multiple datasets. From top to bottom: natural human postures of various actions for (a) AMASS jump Model and AMASS jump Texture, (b) AMASS pick-up Model and AMASS pick-up Texture, and (c) EMAGE Model and EMAGE Texture.
18 pages, 1819 KiB  
Article
Detecting Adversarial Attacks in IoT-Enabled Predictive Maintenance with Time-Series Data Augmentation
by Flora Amato, Egidia Cirillo, Mattia Fonisto and Alberto Moccardi
Information 2024, 15(11), 740; https://doi.org/10.3390/info15110740 - 20 Nov 2024
Viewed by 350
Abstract
Despite considerable advancements in integrating the Internet of Things (IoT) and artificial intelligence (AI) within the industrial maintenance framework, the increasing reliance on these innovative technologies introduces significant vulnerabilities due to cybersecurity risks, potentially compromising the integrity of decision-making processes. Accordingly, this study aims to offer comprehensive insights into the cybersecurity challenges associated with predictive maintenance, proposing a novel methodology that leverages generative AI for data augmentation, enhancing threat detection capabilities. Experimental evaluations conducted using the NASA Commercial Modular Aero-Propulsion System Simulation (N-CMAPSS) dataset affirm the viability of this approach leveraging the state-of-the-art TimeGAN model for temporal-aware data generation and building a recurrent classifier for attack discrimination in a balanced dataset. The classifier’s results demonstrate the satisfactory and robust performance achieved in terms of accuracy (between 80% and 90%) and how the strategic generation of data can effectively bolster the resilience of intelligent maintenance systems against cyber threats.
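The abstract mentions a recurrent classifier trained on the TimeGAN-balanced dataset to discriminate attacks from normal behaviour. A minimal PyTorch sketch of such a classifier is shown below, assuming windows of multivariate sensor readings; the layer sizes and feature count are illustrative and not taken from the paper.

```python
# Hedged sketch: a small recurrent binary classifier for attack vs. normal
# windows of multivariate time-series data. Sizes are illustrative only.
import torch
import torch.nn as nn

class RecurrentAttackClassifier(nn.Module):
    def __init__(self, n_features: int, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time_steps, n_features)
        _, (h_n, _) = self.lstm(x)
        return torch.sigmoid(self.head(h_n[-1]))  # probability of "attack"

model = RecurrentAttackClassifier(n_features=14)  # 14 sensor channels assumed
loss_fn = nn.BCELoss()
```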
Figures:
Figure 1. Relationship between vulnerabilities and impact of attacks.
Figure 2. Failure-data scarcity and augmentation practices in predictive maintenance.
Figure 3. NASA Commercial Modular Aero-Propulsion Simulation System (N-CMAPSS) [34].
Figure 4. Compact workflow diagram for IoT system integration.
Figure 5. Time-series data augmentation.
Figure 6. Time GAN architecture, kernels and loss functions.
Figure 7. Exploratory data analysis of FD001 N-CMAPSS dataset.
Figure 8. TimeGAN training process.
Figure 9. Visualization of synthetic data and original data with PCA and t-SNE.
Figure 10. Training and validation performance of the classifier over 250 epochs. The left panel shows accuracy, and the right panel shows AUC. Solid lines represent training metrics, and dashed lines represent validation metrics.
30 pages, 7296 KiB  
Article
Estimation of Arterial Path Flow Considering Flow Distribution Consistency: A Data-Driven Semi-Supervised Method
by Zhe Zhang, Qi Cao, Wenxie Lin, Jianhua Song, Weihan Chen and Gang Ren
Systems 2024, 12(11), 507; https://doi.org/10.3390/systems12110507 - 19 Nov 2024
Viewed by 348
Abstract
To implement fine-grained progression signal control on an arterial, it is essential to have access to the time-varying distribution of the origin–destination (OD) flow of the arterial. However, due to the sparsity of automatic vehicle identification (AVI) devices and the low penetration of connected vehicles (CVs), it is difficult to directly obtain the distribution pattern of arterial OD flow (i.e., path flow). To solve this problem, this paper develops a semi-supervised arterial path flow estimation method considering the consistency of path flow distribution by combining the sparse AVI data and the low-penetration CV data. Firstly, this paper proposes a semi-supervised arterial path flow estimation model based on multi-knowledge graphs. It utilizes graph neural networks to combine some arterial AVI OD flow observation information with CV trajectory information to infer the path flow of AVI unobserved OD pairs. Further, to ensure that the estimation results of the multi-knowledge graph path flow estimation model are consistent with the distribution of path flow in real situations, we introduce a generative adversarial network (GAN) architecture to correct the estimation results. The proposed model is extensively tested based on a real signalized arterial. The results show that the proposed model is still able to achieve reliable estimation results under low connected vehicle penetration and with less observed label data.
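The semi-supervised estimator above relies on graph neural networks over multiple knowledge graphs. As a loose illustration of the graph-convolution component only (not the paper's multi-graph RGCN/GAN model), a two-layer GCN regressor built with PyTorch Geometric might look like the sketch below; the single-graph setup and feature dimensions are assumptions.

```python
# Hedged sketch: a two-layer GCN that regresses one flow value per path node
# from node features and a single adjacency structure. Illustrative only; the
# paper fuses several knowledge graphs with RGCN and adds a GAN corrector.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class PathFlowGCN(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int = 32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, 1)

    def forward(self, x, edge_index):
        # x: (num_paths, in_dim) node features; edge_index: (2, num_edges)
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index).squeeze(-1)   # one flow estimate per path
```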
Figures:
Figure 1. Schematic of arterial scenario equipped with AVI devices.
Figure 2. A signalized arterial structure.
Figure 3. Data-driven semi-supervised arterial path flow estimation problem description.
Figure 4. Framework of the proposed method.
Figure 5. Semi-supervised arterial path flow estimation based on GCN.
Figure 6. Topology connection diagram.
Figure 7. Multi-knowledge graph fusion based on RGCN.
Figure 8. The structure of the path flow estimation based on multiple knowledge graphs.
Figure 9. The structure of the typical GAN.
Figure 10. The structure of the multi-knowledge graph GAN model.
Figure 11. The geometric layout of the studied site.
Figure 12. The time distribution patterns of path flows in the studied arterial. (This figure serves as the foundation for calculating the temporal similarity and potential correlations between different paths. Based on these calculations, the temporal similarity graph and potential correlation graph within the multi-knowledge graph structure are constructed. We utilized the dynamic time warping (DTW) algorithm and the maximal information coefficient (MIC) algorithm to compute the temporal similarity and potential correlations based on the flow information of each path. These correlations are crucial for identifying patterns and dependencies that can inform the model’s output).
Figure 13. The corresponding adjacency matrices of the three knowledge graphs. (a) Topological connectivity graph. Each cell in the matrix represents the connectivity between two paths, with darker colors indicating stronger connections and reflecting higher topological proximity. This graph helps to capture the structural relationships between different paths in the arterial. (b) Temporal similarity graph. Each cell represents the temporal similarity between two paths, with darker colors indicating higher similarity. This graph captures the dynamic nature of traffic flow over time, providing insights into how different paths behave similarly during specific time intervals. (c) Potential correlation graph. Each cell represents the potential correlation between two paths, with darker colors indicating stronger correlations. This graph highlights the statistical dependencies and interactions between different paths. During the estimation process, the model utilizes RGCN to extract feature information from the topological connectivity graph, temporal similarity graph, and potential correlation graph. By deeply fusing these features, the model can leverage the characteristics of other paths that have strong associations with the target path, thereby enhancing the estimation accuracy.
Figure 14. The four paths with the best estimation performance. (a) Path1-5, (b) Path2-3, (c) Path5-2, and (d) Path5-4.
Figure 15. The four paths with the worst estimation performance. (a) Path3-5, (b) Path2-5, (c) Path4-2, and (d) Path4-5.
Figure 16. Critical path recognition reliability analysis. (a) SSM model, and (b) MKG-GAN model.
Figure 17. Schematic of long-distance arterial scenario.
Figure 18. Percentage of unobserved paths whose estimates satisfy different R² values.
Figure 19. Estimated performance of MKG-GAN model with different CV penetration rates for different traffic conditions.
19 pages, 7807 KiB  
Article
Harnessing Risks with Data: A Leakage Assessment Framework for WDN Using Multi-Attention Mechanisms and Conditional GAN-Based Data Balancing
by Wenhong Wu, Jiahao Zhang, Yunkai Kang, Zhengju Tang, Xinyu Pan and Ning Liu
Water 2024, 16(22), 3329; https://doi.org/10.3390/w16223329 - 19 Nov 2024
Viewed by 302
Abstract
Assessing leakage risks in water distribution networks (WDNs) and implementing preventive monitoring for high-risk pipelines has become a widely accepted approach for leakage control. However, existing methods face significant data barriers between Geographic Information System (GIS) and leakage prediction systems. These barriers hinder traditional pipeline risk assessment methods, particularly when addressing challenges such as data imbalance, poor model interpretability, and lack of intuitive prediction results. To overcome these limitations, this study proposes a leakage assessment framework for water distribution networks based on multiple attention mechanisms and a generative model-based data balancing method. Extensive comparative experiments were conducted using water distribution network data from B2 and B3 District Metered Areas in Zhengzhou. The results show that the proposed model, optimized with a balanced data method, achieved a 40.76% improvement in the recall rate for leakage segment assessments, outperforming the second-best model using the same strategy by 1.7%. Furthermore, the strategy effectively enhanced the performance of all models, further proving that incorporating more valid data contributes to improved assessment results. This study comprehensively demonstrates the application of data-driven models in the field of “smart water management”, providing practical guidance and reference cases for advancing the development of intelligent water infrastructure.
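One of the challenges the abstract highlights is poor model interpretability, and the figure list below reports SHAP analyses of the risk model. A hedged sketch of producing such attributions for a tree-based leakage-risk classifier with the shap package follows; the model type, feature frame, and function name are assumptions, not the authors' code.

```python
# Hedged sketch: SHAP attribution for a pipeline-leakage risk classifier.
# Assumes a fitted tree-based model and a pandas DataFrame of pipe features.
import shap

def explain_leakage_model(model, X_features):
    """model: fitted tree ensemble (e.g. XGBoost/LightGBM); X_features: DataFrame."""
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_features)
    shap.summary_plot(shap_values, X_features)   # overall risk-factor ranking
    return shap_values
```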
Figures:
Figure 1. Framework.
Figure 2. Data Enhancement Program.
Figure 3. Pipeline Risk Prediction Modeling Framework.
Figure 4. Enhanced sample balance results for leakage sample data: (a) Original Data; (b) SMOTE Data Enhancement Method; (c) Conditional GAN Data Augmentation Method. (The horizontal and vertical axes represent the two-dimensional vector values obtained by dimensionality reduction of the high-dimensional representation of the samples, serving only as markers).
Figure 5. SHAP analysis results: Overall Ranking Analysis of SHAP Risk Factors.
Figure 6. SHAP analysis results: SHAP Single Case Analysis-Case 10850.
Figure 7. SHAP analysis results: SHAP Single Example Analysis-Case 486.
Figure 8. Pipeline age leakage rate analysis: (a) Age distribution of pipelines assessed as a level of risk, (b) Leakage ratios for pipelines of different ages.
Figure 9. Visualization Platform: (a) Leakage Point; (b) Pipeline Location; (c) B2, B3 DMA location; (d) Total Pipeline leakage risk status statistics; (e) Classified pipeline leakage risk status statistics and positioning; (f) Individual pipeline leakage risk status information; (g) Region-specific leakage risk status statistics.
20 pages, 717 KiB  
Review
Deep Learning-Based Atmospheric Visibility Detection
by Yawei Qu, Yuxin Fang, Shengxuan Ji, Cheng Yuan, Hao Wu, Shengbo Zhu, Haoran Qin and Fan Que
Atmosphere 2024, 15(11), 1394; https://doi.org/10.3390/atmos15111394 - 19 Nov 2024
Viewed by 228
Abstract
Atmospheric visibility is a crucial meteorological element impacting urban air pollution monitoring, public transportation, and military security. Traditional visibility detection methods, primarily manual and instrumental, have been costly and imprecise. With advancements in data science and computing, deep learning-based visibility detection technologies have [...] Read more.
Atmospheric visibility is a crucial meteorological element impacting urban air pollution monitoring, public transportation, and military security. Traditional visibility detection methods, primarily manual and instrumental, have been costly and imprecise. With advancements in data science and computing, deep learning-based visibility detection technologies have rapidly emerged as a research hotspot in atmospheric science. This paper systematically reviews the applications of various deep learning models—Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and Transformer networks—in visibility estimation, prediction, and enhancement. Each model’s characteristics and application methods are discussed, highlighting the efficiency of CNNs in spatial feature extraction, RNNs in temporal tracking, GANs in image restoration, and Transformers in capturing long-range dependencies. Furthermore, the paper addresses critical challenges in the field, including dataset quality, algorithm optimization, and practical application barriers, proposing future research directions, such as the development of large-scale, accurately labeled datasets, innovative learning strategies, and enhanced model interpretability. These findings highlight the potential of deep learning in enhancing atmospheric visibility detection techniques, providing valuable insights into the literature and contributing to advances in the field of meteorological observation and public safety. Full article
(This article belongs to the Special Issue Air Pollution Modeling and Observations in Asian Megacities)
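Since the review surveys GANs alongside CNNs, RNNs, and Transformers, a compact reminder of the adversarial training step underlying the GAN-based restoration methods it discusses may be useful; this is a generic, textbook-style sketch, not code from any surveyed paper, and the loss form and dimensions are assumptions.

```python
# Hedged sketch: one generic GAN training step (non-saturating BCE losses).
# generator and discriminator are arbitrary nn.Modules; the discriminator is
# assumed to return (N, 1) logits. Dimensions are illustrative.
import torch
import torch.nn.functional as F

def gan_step(generator, discriminator, g_opt, d_opt, real_batch, latent_dim=100):
    n = real_batch.size(0)
    # --- discriminator update: real vs. generated samples ---
    z = torch.randn(n, latent_dim)
    fake_batch = generator(z).detach()
    d_loss = F.binary_cross_entropy_with_logits(discriminator(real_batch), torch.ones(n, 1)) \
           + F.binary_cross_entropy_with_logits(discriminator(fake_batch), torch.zeros(n, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- generator update: try to fool the discriminator ---
    z = torch.randn(n, latent_dim)
    g_loss = F.binary_cross_entropy_with_logits(discriminator(generator(z)), torch.ones(n, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```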
Figures:
Graphical abstract.
Figure 1. The basic structure of (a) CNNs, (b) RNNs, (c) GANs, and (d) Transformers.
24 pages, 9386 KiB  
Article
Toward Improving Human Training by Combining Wearable Full-Body IoT Sensors and Machine Learning
by Nazia Akter, Andreea Molnar and Dimitrios Georgakopoulos
Sensors 2024, 24(22), 7351; https://doi.org/10.3390/s24227351 - 18 Nov 2024
Viewed by 400
Abstract
This paper proposes DigitalUpSkilling, a novel IoT- and AI-based framework for improving and personalising the training of workers who are involved in physical-labour-intensive jobs. DigitalUpSkilling uses wearable IoT sensors to observe how individuals perform work activities. Such sensor observations are continuously processed to synthesise an avatar-like kinematic model for each worker who is being trained, referred to as the worker’s digital twin. The framework incorporates novel work activity recognition using generative adversarial network (GAN) and machine learning (ML) models for recognising the types and sequences of work activities by analysing an individual’s kinematic model. Finally, the development of skill proficiency ML is proposed to evaluate each trainee’s proficiency in work activities and the overall task. To illustrate DigitalUpSkilling from wearable IoT-sensor-driven kinematic models to GAN-ML models for work activity recognition and skill proficiency assessment, the paper presents a comprehensive study on how specific meat processing activities in a real-world work environment can be recognised and assessed. In the study, DigitalUpSkilling achieved 99% accuracy in recognising specific work activities performed by meat workers. The study also presents an evaluation of the proficiency of workers by comparing kinematic data from trainees performing work activities. The proposed DigitalUpSkilling framework lays the foundation for next-generation digital personalised training.
(This article belongs to the Special Issue Wearable and Mobile Sensors and Data Processing—2nd Edition)
Figures:
Figure 1. DigitalUpSkilling framework.
Figure 2. Hybrid GAN-ML activity classification.
Figure 3. Skill proficiency assessment.
Figure 4. (a) Placement of sensors; (b) sensors and straps; (c) alignment of sensors with the participant’s movements.
Figure 5. Work environment for the data collection: (a) boning area; (b) slicing area.
Figure 6. Dataflow of the study.
Figure 7. (a) Worker performing boning; (b) worker’s real-time digital twin; (c) digital twins showing body movements along with real-time graphs of the joint’s movements.
Figure 8. Comparison of the error rates of the different ML models.
Figure 9. Confusion matrices: (a) boning; (b) slicing with pitch and roll from right-hand sensors.
Figure 10. Distribution of the activity classification: (a) boning; (b) slicing.
Figure 11. Accuracy of the GAN for different percentages of synthetic data: (a) boning; (b) slicing.
Figure 12. Accuracy of the GAN with different percentages of synthetic data (circled area showing drop in the accuracy): (a) boning; (b) slicing.
Figure 13. Classification accuracy with the GAN, SMOTE, and ENN (circled area showing improvement in the accuracy): (a) boning; (b) slicing.
Figure 14. Distribution of right-hand pitch and roll mean (in degrees).
Figure 15. Comparison of the engagement in boning (W1: Worker 1; W2: Worker 2).
Figure 16. Comparison of the engagement in slicing.
Figure 17. Comparison of the accelerations of the right hand.
Figure 18. Comparison of the accelerations of the right-hand.
Figure 19. Comparisons of abduction, rotation, and flexion of the right shoulder during boning activities: (a) worker 1; (b) worker 2.
30 pages, 10377 KiB  
Article
An Intelligent Kick Detection Model for Large-Hole Ultra-Deep Wells in the Sichuan Basin
by Xudong Wang, Pengcheng Wu, Ye Chen, Ergang Zhang, Xiaoke Ye, Qi Huang, Chi Peng and Jianhong Fu
Processes 2024, 12(11), 2589; https://doi.org/10.3390/pr12112589 - 18 Nov 2024
Viewed by 351
Abstract
The Sichuan Basin has abundant deep and ultra-deep natural gas resources, making it a primary target for exploration and the development of China’s oil and gas industry. However, during the drilling of ultra-deep wells in the Sichuan Basin, complex geological conditions frequently lead to gas kicks, posing significant challenges to well control and safety. Compared to traditional kick detection methods, artificial intelligence technology can improve the accuracy and timeliness of kick detection. However, there are limited real-world kick data available from drilling operations, and the datasets are extremely imbalanced, making it difficult to train intelligent models with sufficient accuracy and generalization capabilities. To address this issue, this paper proposes a kick data augmentation method based on a time-series generative adversarial network (TimeGAN). This method generates synthetic kick samples from real datasets and then employs a long short-term memory (LSTM) neural network to extract multivariate time-series features of surface drilling parameters. A multilayer perceptron (MLP) network is used for data classification tasks, constructing an intelligent kick detection model. Using real drilling data from ultra-deep wells in the SY block of the Sichuan Basin, the effects of k-fold cross-validation, data dimensionality, various imbalanced data handling techniques, and the sample imbalance ratio on the model’s kick detection performance are analyzed. Ablation experiments are also conducted to assess the contribution of each module in identifying kick. The results show that TimeGAN outperforms other imbalanced data handling techniques. The accuracy, recall, precision, and F1-score of the kick identification model are highest when the sample imbalance ratio is at 1 but decrease as the imbalance ratio increases. This indicates that maintaining a balance between positive and negative samples is essential for training a reliable intelligent kick detection model. The trained model is applied during the drilling of seven ultra-deep wells in Sichuan, demonstrating its effectiveness and accuracy in real-world kick detection.
(This article belongs to the Special Issue Modeling, Control, and Optimization of Drilling Techniques)
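The abstract evaluates the kick model with k-fold cross-validation and reports accuracy, recall, precision, and F1-score. A small sketch of that evaluation loop with scikit-learn is given below; the MLPClassifier shown is a stand-in, since the paper's model combines an LSTM feature extractor with an MLP head, and the fold count and layer sizes are assumptions.

```python
# Hedged sketch: stratified k-fold evaluation with the four metrics named in
# the abstract. The MLPClassifier stands in for the paper's LSTM+MLP model.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

def cross_validate_kick_model(X: np.ndarray, y: np.ndarray, k: int = 5) -> dict:
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    scores = {"accuracy": [], "recall": [], "precision": [], "f1": []}
    for train_idx, test_idx in skf.split(X, y):
        clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
        clf.fit(X[train_idx], y[train_idx])
        pred = clf.predict(X[test_idx])
        scores["accuracy"].append(accuracy_score(y[test_idx], pred))
        scores["recall"].append(recall_score(y[test_idx], pred))
        scores["precision"].append(precision_score(y[test_idx], pred))
        scores["f1"].append(f1_score(y[test_idx], pred))
    return {metric: float(np.mean(vals)) for metric, vals in scores.items()}
```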
Figures:
Figure 1. Framework of the intelligent kick detection model.
Figure 2. Workflow of the intelligent kick detection model.
Figure 3. Training process of the TimeGAN network.
Figure 4. Standard structure of an LSTM network.
Figure 5. Time-series feature extraction model.
Figure 6. Time-series feature extraction.
Figure 7. Correlation analysis of the input parameters. (ΔSPP—SPP difference, ΔV—mud pit increment, DT—drilling time, ΔDF—inlet/outlet flow rate difference, ΔTD—inlet/outlet mud temperature difference, ΔDD—inlet/outlet mud density difference, ΔCD—inlet/outlet mud conductivity difference, Dc—Dc exponent, WOB—weight on bit, RPM—revolution per minute, CASEP—casing pressure).
Figure 8. Example of kick data augmentation by TimeGAN. (a) Drilling time, (b) standpipe pressure, (c) mud pit increment, (d) inlet/outlet mud temperature difference, (e) inlet/outlet mud conductivity difference, (f) inlet/outlet mud density difference, (g) inlet/outlet flow rate difference, (h) inlet/outlet flow rate difference.
Figure 9. Evaluation metrics under different k and feature parameter dimensions. (a) Accuracy. (b) Recall. (c) Precision. (d) F-measure.
Figure 10. Model training time with different k values.
Figure 11. The impact of imbalance ratios on the performance of the kick detection model. (a) Accuracy. (b) Recall. (c) Precision. (d) F-measure.
Figure 12. The configuration of well SY-5.
12 pages, 4376 KiB  
Article
High-Quality Epitaxial Cobalt-Doped GaN Nanowires on Carbon Paper for Stable Lithium-Ion Storage
by Peng Wu, Xiaoguang Wang, Danchen Wang, Yifan Wang, Qiuju Zheng, Tailin Wang, Changlong Sun, Dan Liu, Fuzhou Chen and Sake Wang
Molecules 2024, 29(22), 5428; https://doi.org/10.3390/molecules29225428 - 18 Nov 2024
Viewed by 279
Abstract
Due to its distinctive structure and unique physicochemical properties, gallium nitride (GaN) has been considered a prospective candidate for lithium storage materials. However, its inferior conductivity and unsatisfactory cycle performance hinder the further application of GaN as a next-generation anode material for lithium-ion batteries (LIBs). To address this, cobalt (Co)-doped GaN (Co-GaN) nanowires have been designed and synthesized by utilizing the chemical vapor deposition (CVD) strategy. The structural characterizations indicate that the doped Co elements in the GaN nanowires exist as Co2+ rather than metallic Co. The Co2+ prominently promotes electrical conductivity and ion transfer efficiency in GaN. The cycling capacity of Co-GaN reached up to 495.1 mA h g−1 after 100 cycles. After 500 cycles at 10 A g−1, excellent cycling capacity remained at 276.6 mA h g−1. The intimate contact between Co-GaN nanowires and carbon paper enhances the conductivity of the composite. Density functional theory (DFT) calculations further illustrated that Co substitution changed the electron configuration in the GaN, which led to enhancement of the electron transfer efficiency and a reduction in the ion diffusion barrier on the Co-GaN electrode. This doping design boosts the lithium-ion storage performance of GaN as an advanced material in lithium-ion battery anodes and in other electrochemical applications.
Figures:
Figure 1. (a,b) SEM images at different magnifications. (c) Energy-dispersive X-ray (EDX) elemental test. (d) TEM image. (e) High-resolution TEM images. (f) SAED images. (g–j) EDS mapping analysis of Co-GaN.
Figure 2. (a) XRD patterns, (b) amplified XRD patterns from 32° to 38°, and (c) the corresponding schematic structure model of Co-GaN (White ball for hydrogen atom, pink ball for gallium atom, blue ball for nitrogen atom and red ball for cobalt atom). (d) Raman spectra, (e) Co 2p, (f,g) Ga 3d, and (h,i) N 1s XPS spectra of GaN and Co-GaN, respectively.
Figure 3. Electrochemical test results of the Co-GaN nanowires: (a) CV tests at a scan rate of 0.1 mV s−1. (b) Galvanostatic charge and discharge (GCD) tests. (c) Cycling performance test. (d) Rate performance test of GaN and Co-GaN nanowire electrodes. (e) The long cycling at a high current density of 10.0 A g−1. (f) Schematic illustration of lithium transfer channel.
Figure 4. (a) CV tests of Co-GaN at various scan rates. (b) Determination of the calculated b value. (c) Pseudocapacitive contribution at 1.0 mV s−1. (d) Pseudocapacitive contribution illustration at the scan rate of 1.0 mV s−1. (e) Electrochemical impedance spectra (EIS) with the fitted Nyquist plots and the equivalent circuit of the GaN and Co-GaN electrodes. (f) The calculation of relationships between Z’ and ω−1/2.
Figure 5. Band structure of (a) GaN and (b) Co-GaN. (c,d) Differences in charge density of GaN and Co-GaN.
16 pages, 1284 KiB  
Article
DLT-GAN: Dual-Layer Transfer Generative Adversarial Network-Based Time Series Data Augmentation Method
by Zirui Chen, Yongheng Pang, Shuowei Jin, Jia Qin, Suyuan Li and Hongchen Yang
Electronics 2024, 13(22), 4514; https://doi.org/10.3390/electronics13224514 - 18 Nov 2024
Viewed by 383
Abstract
In actual production processes, analysis and prediction tasks commonly rely on large amounts of time-series data. However, real-world scenarios often face issues such as insufficient or imbalanced data, severely impacting the accuracy of analysis and predictions. To address this challenge, this paper proposes a dual-layer transfer model based on Generative Adversarial Networks (GANs) aiming to enhance the training speed and generation quality of time-series data augmentation under small-sample conditions while reducing the reliance on large training datasets. This method introduces a module transfer strategy based on the traditional GAN framework which balances the training between the discriminator and the generator, thereby improving the model’s performance and convergence speed. By employing a dual-layer network structure to transfer the features of time-series signals, the model effectively reduces the generation of noise and other irrelevant features, improving the similarity of the generated signals’ characteristics. This paper uses speech signals as a case study, addressing scenarios where speech data are difficult to collect and only a limited number of speech samples is available for effective feature extraction and analysis. Simulated speech timbre generation is conducted, and the experimental results on the CMU-ARCTIC database show that, compared to traditional methods, this approach achieves significant improvements in enhancing the consistency of generated signal features and the model’s convergence speed.
Figures:
Figure 1. Schematic diagram of the coarse transfer network structure.
Figure 2. Schematic diagram of the fine transfer network structure.
Figure 3. Schematic diagram of module transfer principle.
Figure 4. Illustration of DLT-GAN time-series data augmentation process under few-shot conditions.
Figure 5. Scatter plots of MCEP features generated by the four methods.
Figure 6. Comparison of Ablation Experiment Results for the Module Migration Strategy in Handwritten Digit Experiment.
Figure 7. The MCEP feature scatter plot of single-layer transfer network generation results.
35 pages, 15883 KiB  
Article
Sound-Based Unsupervised Fault Diagnosis of Industrial Equipment Considering Environmental Noise
by Jeong-Geun Lee, Kwang Sik Kim and Jang Hyun Lee
Sensors 2024, 24(22), 7319; https://doi.org/10.3390/s24227319 - 16 Nov 2024
Viewed by 466
Abstract
The influence of environmental noise is generally excluded during research on machine fault diagnosis using acoustic signals. This study proposes a fault diagnosis method using a variational autoencoder (VAE) and domain adaptation neural network (DANN), both of which are based on unsupervised learning, to address this problem. The proposed method minimizes the impact of environmental noise and maintains the fault diagnosis performance in altered environments. The fault diagnosis algorithm was implemented using acoustic signals containing noise, present in the malfunctioning industrial machine investigation and inspection open dataset, and the fault prediction performance in noisy environments was examined based on forklift acoustic data using the VAE and DANN. The VAE primarily learns from normal state acoustic data and determines the occurrence of faults based on reconstruction error. To achieve this, statistical features of Mel frequency cepstral coefficients were extracted, generating features applicable regardless of signal length. Additionally, features were enhanced by applying noise reduction techniques via magnitude spectral subtraction and feature optimization, reflecting the characteristics of rotating equipment. Furthermore, data were augmented using generative adversarial networks to prevent overfitting. Given that the forklift acoustic data possess time-series characteristics, the exponentially weighted moving average was determined to quantitatively track time-series changes and identify early signs of faults. The VAE defined the reconstruction error as the fault index, diagnosing the fault states and demonstrating excellent performance using time-series data. However, the fault diagnosis performance of the VAE tended to decrease in noisy environments. Moreover, applying DANN for fault diagnosis significantly improved diagnostic performance in noisy environments by overcoming environmental differences between the source and target domains. In particular, by adapting the model learned in the source domain to the target domain and considering the domain differences based on signal-to-noise ratio, high diagnostic accuracy was maintained regardless of the noise levels. The DANN evaluated interdomain similarity using cosine similarity, enabling the accurate classification of fault states in the target domain. Ultimately, the combination of the VAE and DANN techniques enabled effective fault diagnosis even in noisy environments.
(This article belongs to the Special Issue AI-Assisted Condition Monitoring and Fault Diagnosis)
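The abstract describes turning variable-length acoustic recordings into fixed-size inputs by taking statistical features of MFCCs. A hedged sketch of that step with librosa is shown below; the number of coefficients and the choice of statistics (mean and standard deviation per coefficient) are assumptions rather than the paper's exact settings.

```python
# Hedged sketch: fixed-length statistical MFCC features from a variable-length
# audio clip, usable as a VAE input vector. Coefficient count and statistics assumed.
import numpy as np
import librosa

def mfcc_statistics(wav_path: str, n_mfcc: int = 20) -> np.ndarray:
    y, sr = librosa.load(wav_path, sr=None)                    # keep native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)     # shape: (n_mfcc, frames)
    # Per-coefficient statistics give a length-independent feature vector.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
```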
Show Figures

Figure 1

Figure 1
<p>Feature extraction and fault diagnosis procedure using the MIMII and forklift datasets.</p>
Full article ">Figure 2
<p>Feature extraction and validation procedure of the MIMII and forklift datasets.</p>
Full article ">Figure 3
<p>Waveforms obtained from pump IDs 00 and 02 under normal and abnormal conditions.</p>
Full article ">Figure 4
<p>Acquiring front-end failure data of a forklift in an anechoic room: (<b>a</b>) left-side, (<b>b</b>) isometric, and (<b>c</b>) front views.</p>
Full article ">Figure 5
<p>Acquiring background noise data in an anechoic room: (<b>a</b>) HVAC and (<b>b</b>) internal-combustion forklift noise.</p>
Full article ">Figure 6
<p>Visual examples of the forklift audio waveforms: (<b>a</b>) normal SNR: 6 dB, (<b>b</b>) failure SNR: 6 dB, (<b>c</b>) normal SNR: 0 dB, (<b>d</b>) failure SNR: 0 dB, (<b>e</b>) normal SNR: −6 dB, and (<b>f</b>) failure SNR: −6 dB.</p>
Full article ">Figure 7
<p>Visual examples of the Mel filter bank: the numbers of coefficients are (<b>a</b>) 20 and (<b>b</b>) 128.</p>
Full article ">Figure 8
<p>Visual examples of the forklift audio log spectrogram (STFT): (<b>a</b>) normal SNR: 6 dB, (<b>b</b>) failure SNR: 6 dB, (<b>c</b>) normal SNR: 0 dB, (<b>d</b>) failure SNR: 0 dB, (<b>e</b>) normal SNR: −6 dB, and (<b>f</b>) failure SNR: −6 dB.</p>
Figure 8 C">
Full article ">Figure 9
<p>Visual examples of the forklift audio MFCC results obtained using 20 coefficients: (<b>a</b>) normal SNR: 6 dB, (<b>b</b>) failure SNR: 6 dB, (<b>c</b>) normal SNR: 0 dB, (<b>d</b>) failure SNR: 0 dB, (<b>e</b>) normal SNR: −6 dB, and (<b>f</b>) failure SNR: −6 dB.</p>
Full article ">Figure 10
<p>Structure of the 1D-CNN classifier from the input to the output.</p>
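As a reference point for the classifier in Figure 10, a minimal 1D-CNN over a feature sequence might look like the sketch below; the layer widths, kernel sizes, and number of classes are assumptions for illustration, not the architecture reported in the paper.

```python
# Illustrative 1D-CNN classifier; all hyperparameters are assumptions.
import torch
import torch.nn as nn

class Simple1DCNN(nn.Module):
    def __init__(self, in_channels=1, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # collapse the time axis
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                     # x: (batch, channels, length)
        h = self.features(x).squeeze(-1)      # (batch, 32)
        return self.classifier(h)             # (batch, n_classes)
```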
Full article ">Figure 11
<p>Fault diagnosis procedure using a VAE.</p>
Full article ">Figure 12
<p>VAE architecture.</p>
Full article ">Figure 13
<p>Fault diagnosis algorithm using the reconstruction error of the VAE.</p>
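The fault-index computation sketched in Figures 11–13 reconstructs the input with a VAE trained on normal data and scores each clip by its reconstruction error. The sketch below assumes a small fully connected VAE over the MFCC-statistics vector; the dimensions are illustrative, not the paper's configuration.

```python
# Minimal VAE whose reconstruction error serves as the fault index.
# Input/latent dimensions are assumptions for illustration.
import torch
import torch.nn as nn

class FeatureVAE(nn.Module):
    def __init__(self, in_dim=40, latent_dim=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                 nn.Linear(32, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

def fault_index(model, x):
    """Per-sample reconstruction MSE; a threshold fitted on normal data flags faults."""
    with torch.no_grad():
        recon, _, _ = model(x)
        return torch.mean((recon - x) ** 2, dim=1)
```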
Full article ">Figure 14
<p>Visual examples depicting the MSS enhancement of valve and slide rail audio using log spectrograms (STFT): (<b>a</b>) before enhancement: valve with an SNR of −6 dB; (<b>b</b>) after enhancement: valve with an SNR of −6 dB; (<b>c</b>) before enhancement: slide rail with an SNR of −6 dB; and (<b>d</b>) after enhancement: slide rail with an SNR of −6 dB.</p>
Full article ">Figure 15
<p>Visual examples depicting the MSS enhancement of fan and pump audio using log spectrograms (STFT): (<b>a</b>) before enhancement: fan with an SNR of −6 dB; (<b>b</b>) after enhancement: fan with an SNR of −6 dB; (<b>c</b>) before enhancement: pump with an SNR of −6 dB; and (<b>d</b>) after enhancement: pump with an SNR of −6 dB.</p>
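Magnitude spectral subtraction, as visualized in Figures 14 and 15, can be sketched in a few lines: estimate an average noise magnitude spectrum, subtract it from the signal's STFT magnitude, and resynthesize with the original phase. The FFT size and hop length below are assumptions, not the paper's settings.

```python
# Minimal magnitude spectral subtraction; n_fft/hop are illustrative values.
import numpy as np
import librosa

def magnitude_spectral_subtraction(y, noise, n_fft=1024, hop=256):
    Y = librosa.stft(y, n_fft=n_fft, hop_length=hop)
    N = librosa.stft(noise, n_fft=n_fft, hop_length=hop)
    noise_mag = np.abs(N).mean(axis=1, keepdims=True)     # average noise spectrum
    clean_mag = np.maximum(np.abs(Y) - noise_mag, 0.0)    # half-wave rectification
    # Resynthesize using the noisy signal's phase.
    return librosa.istft(clean_mag * np.exp(1j * np.angle(Y)), hop_length=hop)
```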
Full article ">Figure 16
<p>Combining the training dataset with the augmentation dataset and constructing the test dataset.</p>
Full article ">Figure 17
<p>Comparing the AUC scores of each dataset for nonrotating equipment: (<b>a</b>) valve and (<b>b</b>) slide rail.</p>
Full article ">Figure 18
<p>Comparing the AUC scores of each dataset for rotating equipment: (<b>a</b>) fan and (<b>b</b>) pump.</p>
Full article ">Figure 19
<p>AUC score comparison across cases and SNR levels corresponding to the forklift.</p>
Full article ">Figure 20
<p>PCA visual examples using case 2, noise type 00, SNR = 6 dB: (<b>a</b>) 100, (<b>b</b>) 100–200, (<b>c</b>) 200–300, (<b>d</b>) 300–400, (<b>e</b>) 400–500, (<b>f</b>) 500–600, (<b>g</b>) 600–700, (<b>h</b>) 700–800, and (<b>i</b>) 800–900 data points.</p>
Figure 20 C">
Full article ">Figure 21
<p>PCA visual examples using case 2, noise type 00, SNR = −6 dB: (<b>a</b>) 100, (<b>b</b>) 100–200, (<b>c</b>) 200–300, (<b>d</b>) 300–400, (<b>e</b>) 400–500, (<b>f</b>) 500–600, (<b>g</b>) 600–700, (<b>h</b>) 700–800, and (<b>i</b>) 800–900 data points.</p>
Full article ">Figure 22
<p>Visualized fault index (reconstruction error, MSE) under case 2, noise type 00, SNR = 6 dB: (<b>a</b>) before and (<b>b</b>) after EWMA (alpha = 0.5 and three points).</p>
Full article ">Figure 23
<p>Visualized fault index (reconstruction error, MSE) under case 2, noise type 00, SNR of −6 dB: (<b>a</b>) before and (<b>b</b>) after EWMA (alpha = 0.5 and three points).</p>
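The EWMA smoothing applied to the fault index in Figures 22 and 23 reduces to a one-line recursion; alpha = 0.5 follows the captions, while the rest of the sketch is illustrative.

```python
# EWMA smoothing of the per-clip fault index (reconstruction error).
import numpy as np

def ewma(values, alpha=0.5):
    smoothed = np.empty(len(values), dtype=float)
    smoothed[0] = values[0]
    for i in range(1, len(values)):
        # Blend the new value with the previous smoothed value.
        smoothed[i] = alpha * values[i] + (1 - alpha) * smoothed[i - 1]
    return smoothed
```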
Full article ">Figure 24
<p>Comparison between the AUC scores for diagnosing faults in a forklift using the EWMA-applied MSE criteria.</p>
Full article ">Figure 25
<p>Fault diagnosis procedure using a DANN.</p>
Full article ">Figure 26
<p>Architecture of the DANN.</p>
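The defining component of a DANN is the gradient reversal layer placed between the shared feature extractor and the domain classifier, which pushes the features toward domain invariance. A minimal PyTorch sketch of that layer is shown below; the surrounding feature extractor, label classifier, and domain classifier are omitted, and the scaling factor lam is an assumption.

```python
# Gradient reversal layer used in DANN-style domain adaptation (illustrative sketch).
import torch
from torch.autograd import Function

class GradReverse(Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)          # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back to the feature extractor.
        return -ctx.lam * grad_output, None

# Usage: domain_logits = domain_classifier(GradReverse.apply(features, 1.0))
```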
Full article ">Figure 27
<p>Domain adaptation procedure using the MIMII and forklift datasets.</p>
Full article ">Figure 28
<p>Cosine similarity results obtained from the MIMII dataset: (<b>a</b>) valve, (<b>b</b>) slide rail, (<b>c</b>) fan, and (<b>d</b>) pump.</p>
Full article ">Figure 29
<p>Cosine similarity results obtained using the forklift dataset.</p>
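The interdomain similarity reported in Figures 28 and 29 is plain cosine similarity between feature vectors; a minimal helper is shown below, with the choice of which embeddings to compare left as an assumption.

```python
# Cosine similarity between two feature vectors (e.g., source- and target-domain embeddings).
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```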
Full article ">Figure 30
<p>Comparison between the F1-scores corresponding to the DANN-based fault diagnosis results obtained from MIMII nonrotating equipment: (<b>a</b>) valve and (<b>b</b>) slide rail.</p>
Full article ">Figure 31
<p>Visualization of the domain adaptation results obtained from the valve using PCA under case 3 and machine ID 00: (<b>a</b>) Before and (<b>b</b>) after the adaptation of normal data. (<b>c</b>) Before and (<b>d</b>) after the adaptation of failure data.</p>
Full article ">Figure 32
<p>Visualization of the domain adaptation results obtained from the slide rail using PCA under case 3 and machine ID 00: (<b>a</b>) Before and (<b>b</b>) after the adaptation of normal data. (<b>c</b>) Before and (<b>d</b>) after the adaptation of failure data.</p>
Full article ">Figure 33
<p>Comparison of the F1-scores corresponding to DANN-based fault diagnosis results obtained using MIMII rotating equipment: (<b>a</b>) fan and (<b>b</b>) pump.</p>
Full article ">Figure 34
<p>Visualization of the domain adaptation results obtained from a fan using PCA under case 3 and machine ID 04: (<b>a</b>) Before and (<b>b</b>) after the adaptation of normal data. (<b>c</b>) Before and (<b>d</b>) after the adaptation of failure data.</p>
Full article ">Figure 35
<p>Visualization of the domain adaptation results obtained from a pump using PCA under case 3 and machine ID 00: (<b>a</b>) Before and (<b>b</b>) after the adaptation of normal data. (<b>c</b>) Before and (<b>d</b>) after the adaptation of failure data.</p>
Full article ">Figure 36
<p>Comparison of the F1-scores corresponding to the DANN-based fault diagnosis results obtained from a forklift: (<b>a</b>) equipment ID 0 and (<b>b</b>) equipment ID 1.</p>
Full article ">Figure 37
<p>Visualization of the domain adaptation results obtained from a forklift using PCA under case 3 and noise type 00: (<b>a</b>) Before and (<b>b</b>) after the adaptation of normal data. (<b>c</b>) Before and (<b>d</b>) after the adaptation of failure data.</p>
Full article ">
20 pages, 8781 KiB  
Article
A Virtual View Acquisition Technique for Complex Scenes of Monocular Images Based on Layered Depth Images
by Qi Wang and Yan Piao
Appl. Sci. 2024, 14(22), 10557; https://doi.org/10.3390/app142210557 - 15 Nov 2024
Viewed by 424
Abstract
With the rapid development of stereoscopic display technology, generating high-quality virtual view images has become a key problem in applications such as 3D video, 3D TV, and virtual reality. Traditional virtual view rendering maps the reference view into the virtual [...] Read more.
With the rapid development of stereoscopic display technology, generating high-quality virtual view images has become a key problem in applications such as 3D video, 3D TV, and virtual reality. Traditional virtual view rendering maps the reference view into the virtual view through a 3D transformation, but when the background is occluded by a foreground object, the content of the occluded area cannot be inferred. To solve this problem, we propose a virtual view acquisition technique for complex scenes of monocular images based on a layered depth image (LDI). First, the depth discontinuities along the edge of the occluded area are grouped using the multilayer representation of the LDI, and the depth edge of the occluded area is inpainted by an edge inpainting network. Then, a generative adversarial network (GAN) is used to fill in the color and depth information of the occluded area, generating an inpainted virtual view. Finally, a GAN is used to refine the color and depth of the virtual view, producing a high-quality result. Experiments demonstrate the effectiveness of the proposed method, which is also applicable to complex scenes. Full article
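To make the occlusion problem described in the abstract concrete, the sketch below performs a naive depth-image-based forward warp: each pixel is shifted horizontally by a disparity derived from its depth, and the pixels that receive no value are exactly the disoccluded regions that the LDI grouping and GAN inpainting are meant to fill. The camera parameters are assumptions, and no z-buffering or hole filling is done, so this is a sketch of the baseline rendering step rather than the paper's pipeline.

```python
# Naive forward warp from a reference view to a horizontally shifted virtual view.
# baseline/focal are assumed camera parameters; overlaps are resolved by write order.
import numpy as np

def forward_warp(rgb, depth, baseline=0.05, focal=500.0):
    h, w, _ = rgb.shape
    virtual = np.zeros_like(rgb)
    disparity = (baseline * focal / np.maximum(depth, 1e-6)).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disparity[y, x]
            if 0 <= nx < w:
                virtual[y, nx] = rgb[y, x]
    return virtual    # zero-valued pixels mark the disoccluded holes
```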
Show Figures

Figure 1
<p>The overall framework of virtual viewpoint image generation.</p>
Full article ">Figure 2
<p>Depth images of various types generated by the method proposed in this paper.</p>
Full article ">Figure 3
<p>Depth image preprocessing. (<b>a</b>) The input RGB image. (<b>b</b>) Depth image after filtering. (<b>c</b>) The enlarged image of the red box area in (<b>b</b>). (<b>d</b>) The preprocessed image of (<b>c</b>). (<b>e</b>) The image of lines with discontinuous depth.</p>
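The depth-discontinuity lines in panel (e) can be obtained, in the simplest case, by thresholding depth differences between neighbouring pixels; the threshold below is an illustrative value, not the paper's preprocessing parameter.

```python
# Mark pixels whose depth jumps sharply relative to a horizontal or vertical neighbour.
import numpy as np

def depth_discontinuities(depth, threshold=0.05):
    edges = np.zeros(depth.shape, dtype=bool)
    dx = np.abs(np.diff(depth, axis=1))   # horizontal depth differences
    dy = np.abs(np.diff(depth, axis=0))   # vertical depth differences
    edges[:, 1:] |= dx > threshold
    edges[1:, :] |= dy > threshold
    return edges
```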
Full article ">Figure 4
<p>The area division of the input RGB image. (<b>a</b>) The input RGB image. (<b>b</b>) Generated virtual view image without inpainting. The pink area is the foreground area, the gray area is the occluded area, and the blue area is the background area.</p>
Full article ">Figure 5
<p>The framework of the edge inpainting network.</p>
Full article ">Figure 6
<p>The framework of the virtual view optimization network.</p>
Full article ">Figure 7
<p>Virtual viewpoint images generated at different positions. (<b>a</b>) The images of the standard model established by C4D, and the model position is x = 0; (<b>b</b>) Viewpoint images of the model generated by C4D at x = −3; (<b>c</b>) The virtual viewpoint images of the model estimated by the method in this paper at x = −3; (<b>d</b>) Viewpoint images of the model generated by C4D at x = +3; (<b>e</b>) Virtual viewpoint images of the model estimated by the method in this paper at x = +3.</p>
Full article ">Figure 8
<p>Camera distributions.</p>
Full article ">Figure 9
<p>Generated virtual viewpoint images using the ballet image sequence. (<b>a</b>) The input image, which is the 10th frame taken by Cam4. (<b>b</b>) The 10th frame image taken by Cam3. (<b>c</b>) The 10th frame image taken by Cam5. (<b>d</b>) The 10th frame image of Cam3, which is generated by the method in this paper. (<b>e</b>) The 10th frame image of Cam5, which is generated by the method in this paper.</p>
Full article ">Figure 10
<p>Generated virtual viewpoint images using the breakdancers image sequence. (<b>a</b>) The input image, which is the 20th frame taken by Cam4. (<b>b</b>) The 20th frame image taken by Cam3. (<b>c</b>) The 20th frame image taken by Cam5. (<b>d</b>) The 20th frame image of Cam3, which is generated by the method in this paper. (<b>e</b>) The 20th frame image of Cam5, which is generated by the method in this paper.</p>
Full article ">Figure 11
<p>Different types of virtual viewpoint images rendered by the method in this paper.</p>
Full article ">Figure 12
<p>Rendered virtual viewpoint images. (<b>a</b>) Input images; (<b>b</b>) Enlarged views of the contents in the red boxes in (<b>a</b>); (<b>c</b>) Ground-truth images corresponding to the virtual viewpoint images in (<b>d</b>,<b>e</b>); (<b>d</b>) Virtual viewpoint images generated by the method of [<a href="#B58-applsci-14-10557" class="html-bibr">58</a>]; (<b>e</b>) Virtual viewpoint images generated by the method in this paper.</p>
Full article ">