Search Results (83)

Search Parameters:
Keywords = Wasserstein GAN

25 pages, 2484 KiB  
Article
Automatic Fault Classification in Photovoltaic Modules Using Denoising Diffusion Probabilistic Model, Generative Adversarial Networks, and Convolutional Neural Networks
by Carlos Roberto da Silveira Junior, Carlos Eduardo Rocha Sousa and Ricardo Henrique Fonseca Alves
Energies 2025, 18(4), 776; https://doi.org/10.3390/en18040776 - 7 Feb 2025
Viewed by 459
Abstract
Current techniques for fault analysis in photovoltaic (PV) plants involve either electrical performance measurements or image processing, as well as infrared thermography for visual inspection. Deep convolutional neural networks (CNNs) are machine learning algorithms that perform tasks involving images, such as image classification and object recognition. However, to train a model effectively to recognize different patterns, it is crucial to have a sufficiently balanced dataset. Unfortunately, this is not always feasible owing to the limited availability of publicly accessible datasets for PV thermographic data and the unequal distribution of different faults in real-world systems. In this study, three data augmentation techniques, namely geometric transformations (GTs), generative adversarial networks (GANs), and the denoising diffusion probabilistic model (DDPM), were combined with a CNN to classify faults in PV modules through thermographic images and identify the type of fault among 11 different classes (e.g., soiling, shadowing, and diode faults). Under cross-validation, the Wasserstein GAN (WGAN) and DDPM networks combined with the CNN achieved anomaly classification testing accuracies of 86.98% and 89.83%, respectively. These results demonstrate the effectiveness of both networks for accurately classifying anomalies in the dataset, and they corroborate the use of the diffusion model as a PV data augmentation technique when compared with other methods such as GANs and GTs.
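For orientation, the augmentation-then-classification pipeline summarized above can be sketched in a few lines of PyTorch; the folder names and the 1:2 real-to-synthetic ratio are illustrative assumptions, not the authors' exact setup.

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

# Hypothetical layout: one subfolder per fault class, with synthetic
# thermographic images pre-generated by a WGAN or DDPM.
tf = transforms.Compose([transforms.Grayscale(), transforms.ToTensor()])
real = datasets.ImageFolder("data/real_thermal", transform=tf)         # e.g., 20,000 images
synthetic = datasets.ImageFolder("data/ddpm_synthetic", transform=tf)  # e.g., 40,000 images

# The CNN fault classifier is then trained on the union of both sources.
train_loader = DataLoader(ConcatDataset([real, synthetic]),
                          batch_size=64, shuffle=True)
```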
Figure 1. Visualization of the layer order and dimensions within the convolutional neural network (CNN) architecture for the purpose of identifying and classifying faults in PV modules.
Figure 2. Architecture of a generative adversarial network (GAN). The generator leverages a latent vector to create synthetic data, which are then evaluated by the discriminator using both real and generated data. The discriminator outputs a probability indicating whether the input is real or fake, helping to improve the generator’s output quality over time. The diagram also highlights key components, such as convolutional layers, batch normalization, activation functions (ReLU, Sigmoid), and dropout for regularization.
Figure 3. Schematic representation of the layer order and dimensions in the diffusion model architecture, illustrating the process for generating synthetic images of photovoltaic (PV) modules. The layers are progressively concatenated to refine image synthesis, with the indicated dimensions showing the resolution at each stage of the process. Time embedding is incorporated to enhance the temporal consistency of the generated images.
Figure 4. Graph of the training accuracy of 100 epochs of the CNN anomaly identification network. The training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the GT data augmentation technique.
Figure 5. Confusion matrix graph of the CNN anomaly identification network. The training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the data augmentation technique.
Figure 6. Graph of the training accuracy of 100 epochs of the CNN anomaly identification network. Training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the Wasserstein GAN (WGAN) technique.
Figure 7. Confusion matrix graph of the CNN anomaly identification network. Training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the WGAN technique.
Figure 8. Graph of the training accuracy of 100 epochs of the CNN anomaly identification network. Training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the denoising diffusion probabilistic model (DDPM) technique.
Figure 9. Confusion matrix graph of the CNN anomaly identification network. Training was performed with 60,000 images, including 20,000 original images and 40,000 images generated using the DDPM technique.
Figure 10. Graph of the training accuracy of 100 epochs of the CNN anomaly classification network with balancing images from the database. The training was performed with 5775 images, including 1925 original images and 3850 images generated using the data augmentation technique.
Figure 11. Confusion matrix graph of the CNN anomaly classification network with database image balancing. The training was performed with 5775 images, including 1925 original images and 3850 images generated using the data augmentation technique.
Figure 12. Graph of the training accuracy of 100 epochs of the CNN anomaly classification network without balancing images from the database. The training was performed with 30,000 images, including 10,000 original images and 20,000 images generated using the data augmentation technique.
Figure 13. Confusion matrix graph of the CNN anomaly classification network without balancing images from the database. The training was performed with 30,000 images, including 10,000 original images and 20,000 images generated using the data augmentation technique.
Figure 14. Graph of the training accuracy of 100 epochs of the CNN anomaly classification network with image balancing using the WGAN. Training was performed with 33,000 images, 3300 images from each of the 11 classes of anomalies, which corresponds to 10,000 original images and 23,000 images generated by the GAN.
Figure 15. Confusion matrix graph of the CNN anomaly classification network with balancing images from the database using the WGAN. Training was carried out with 33,000 images, 3300 images from each of the 11 classes of anomalies, which corresponds to 10,000 original images and 23,000 images generated by the GAN.
Figure 16. Graph of the training accuracy of 100 epochs of the CNN anomaly classification network with image balancing using a diffusion network.
Figure 17. Confusion matrix graph of the CNN anomaly classification network with balancing images from the database using a diffusion network.
Figure 18. Number of images for each PV fault class. InfraredSolarModule (ISM) dataset, geometric transformation (GT), WGAN, DDPM [27].
Figure 19. Radar graph of the presented scenarios used to describe the success rates for each failure class analyzed. CNN CLASS BAL is the CNN with balanced data; CLASS UNB + DA is the CNN with unbalanced data and data augmentation by GT; CNN CLASS BAL + WGAN is the CNN with balanced data using the WGAN synthetic image generation technique; and CNN CLASS BAL + DIFF is the CNN with balanced data using the diffusion synthetic image generation technique.

13 pages, 1650 KiB  
Technical Note
Pano-GAN: A Deep Generative Model for Panoramic Dental Radiographs
by Søren Pedersen, Sanyam Jain, Mikkel Chavez, Viktor Ladehoff, Bruna Neves de Freitas and Ruben Pauwels
J. Imaging 2025, 11(2), 41; https://doi.org/10.3390/jimaging11020041 - 2 Feb 2025
Viewed by 493
Abstract
This paper presents the development of a generative adversarial network (GAN) for the generation of synthetic dental panoramic radiographs. While this is an exploratory study, the ultimate aim is to address the scarcity of data in dental research and education. A deep convolutional GAN (DCGAN) with the Wasserstein loss and a gradient penalty (WGAN-GP) was trained on a dataset of 2322 radiographs of varying quality. The focus of this study was on the dentoalveolar part of the radiographs; other structures were cropped out. Significant data cleaning and preprocessing were conducted to standardize the input formats while maintaining anatomical variability. Four candidate models were identified by varying the critic iterations, the number of features, and the use of denoising prior to training. To assess the quality of the generated images, a clinical expert evaluated a set of generated synthetic radiographs using a ranking system based on visibility and realism, with scores ranging from 1 (very poor) to 5 (excellent). Most generated radiographs showed moderate depictions of dentoalveolar anatomical structures, although they were considerably impaired by artifacts. The mean evaluation scores showed a trade-off between the model trained on non-denoised data, which showed the highest subjective quality for finer structures, such as the mandibular canal and trabecular bone, and one of the models trained on denoised data, which offered better overall image quality, especially in terms of clarity, sharpness, and overall realism. These outcomes serve as a foundation for further research into GAN architectures for dental imaging applications.
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)
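As background for the WGAN-GP objective used in this and several of the following papers, here is a minimal PyTorch sketch of the gradient penalty term; the critic is any image-scoring network, and the penalty weight of 10 is the standard WGAN-GP default rather than a detail taken from the paper.

```python
import torch

def gradient_penalty(critic, real, fake):
    """WGAN-GP term: penalize deviation of the critic's gradient norm
    from 1 on random interpolations between real and fake batches."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

# Critic loss: E[D(fake)] - E[D(real)] + 10 * gradient_penalty(critic, real, fake)
```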
Figure 1. An overview of WGAN-GP. The generator receives input noise and produces samples, which are compared against real data from the dataset in the discriminator. The figure illustrates the calculation of the critic loss (D loss) by real, fake, and gradient penalty terms, along with the generator loss (G loss). The gradient penalty term is highlighted alongside the Wasserstein distance in a separate box.
Figure 2. Generator G(z) and discriminator D(I) networks used in the methodology.
Figure 3. Examples of real (R), fake (F), and Gaussian noise (G) images. The t-SNE plot compares the feature embedding for R (blue cluster), F (red cluster), and G (green cluster) images.
Figure 4. Boxplots of observer scores for Models 1 and 2.
Figure 5. Radar plots for Models 1 (red) and 2 (blue).
Figure 6. Best images generated using Models 1 (top row) and 2 (bottom row).
Figure 7. Worst images among all model variants, showing poor overall anatomical depiction and severe artifacts.

14 pages, 17020 KiB  
Article
A Long Short-Term Memory–Wasserstein Generative Adversarial Network-Based Data Imputation Method for Photovoltaic Power Output Prediction
by Zhu Liu, Lingfeng Xuan, Dehuang Gong, Xinlin Xie and Dongguo Zhou
Energies 2025, 18(2), 399; https://doi.org/10.3390/en18020399 - 17 Jan 2025
Viewed by 467
Abstract
To address inaccurate predictions caused by missing data in PV power records, a photovoltaic power data imputation method based on a Wasserstein Generative Adversarial Network (WGAN) and a Long Short-Term Memory (LSTM) network is proposed. The method introduces a data-driven GAN framework with quasi-convex characteristics to ensure that the imputed data join smoothly with the existing data, and it employs a gradient penalty mechanism and a single-batch multi-iteration strategy for stable training. Through frequency-domain analysis, t-Distributed Stochastic Neighbor Embedding (t-SNE) metrics, and prediction performance validation of the generated data, the proposed method is shown to improve the continuity and reliability of data in photovoltaic prediction tasks.
(This article belongs to the Special Issue Forecasting of Photovoltaic Power Generation and Model Optimization)
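The frequency-domain check described above can be reproduced in outline with SciPy's DCT; the toy series and the stand-in for the imputed output below are assumptions for illustration, loosely following the comparison in the paper's Figures 3 and 5.

```python
import numpy as np
from scipy.fft import dct

rng = np.random.default_rng(0)
pv = np.clip(np.sin(np.linspace(0, 20 * np.pi, 1000)), 0, None)  # toy PV-like series
gaps = rng.choice(1000, size=100, replace=False)                 # simulated missing records

zero_filled = pv.copy()
zero_filled[gaps] = 0.0   # naive zero-padding baseline
imputed = pv.copy()       # stand-in for the LSTM-WGAN imputation output

# Compare the top five lowest-frequency DCT components of each version.
print("zero-fill:", dct(zero_filled, norm="ortho")[:5])
print("imputed:  ", dct(imputed, norm="ortho")[:5])
```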
Figure 1. LSTM-WGAN architecture for repairing the missing PV output data.
Figure 2. Intermittent power-generation records of photovoltaic power plants.
Figure 3. DCT results of PV power generation records: (a) PV power output of a power plant located in Alice Springs, Australia; (b,c) top five lowest-frequency components of the DCT following the zero-padding and 200-value filling methods for missing data, respectively.
Figure 4. Variation in generator and discriminator loss during training epochs.
Figure 5. DCT results of power generation records: (a) power generation of the power plant located in Alice Springs, Australia, after data imputation; (b,c) top five lowest-frequency components of the DCT following the zero-padding and our filling methods for missing data, respectively.
Figure 6. t-SNE visualization of real and generated data.

20 pages, 42222 KiB  
Article
WGAN-GP for Synthetic Retinal Image Generation: Enhancing Sensor-Based Medical Imaging for Classification Models
by Héctor Anaya-Sánchez, Leopoldo Altamirano-Robles, Raquel Díaz-Hernández and Saúl Zapotecas-Martínez
Sensors 2025, 25(1), 167; https://doi.org/10.3390/s25010167 - 31 Dec 2024
Viewed by 850
Abstract
Accurate synthetic image generation is crucial for addressing data scarcity challenges in medical image classification tasks, particularly in sensor-derived medical imaging. In this work, we propose a novel method using a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) and nearest-neighbor interpolation to generate high-quality synthetic images for diabetic retinopathy classification. Our approach enhances training datasets by generating realistic retinal images that retain critical pathological features. We evaluated the method across multiple retinal image datasets, including Retinal-Lesions, Fine-Grained Annotated Diabetic Retinopathy (FGADR), the Indian Diabetic Retinopathy Image Dataset (IDRiD), and the Kaggle Diabetic Retinopathy dataset. The proposed method outperformed traditional generative models, such as conditional GANs and PathoGAN, achieving the best performance on key metrics on the Kaggle dataset: a Fréchet Inception Distance (FID) of 15.21, a Mean Squared Error (MSE) of 0.002025, and a Structural Similarity Index (SSIM) of 0.89. Additionally, expert evaluations revealed that only 56.66% of synthetic images could be distinguished from real ones, demonstrating the high fidelity and clinical relevance of the generated data. These results highlight the effectiveness of our approach in improving medical image classification by generating realistic and diverse synthetic datasets.
(This article belongs to the Collection Medical Applications of Sensor Systems and Devices)
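The nearest-neighbor interpolation named in the abstract is commonly used inside GAN generators in place of transposed convolutions to avoid checkerboard artifacts; a generic PyTorch block of that kind might look as follows (the channel sizes and layer ordering are illustrative, not the authors' architecture).

```python
import torch.nn as nn

class NNUpsampleBlock(nn.Module):
    """Nearest-neighbor upsampling followed by a 3x3 convolution,
    a common substitute for transposed convolutions in generators."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```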
Figure 1. Methodology diagram.
Figure 2. Diagram of the lesion extraction technique.
Figure 3. Style Transfer diagrams. (a) Diagram illustrating the perceptual loss process, utilizing VGG19 for feature extraction. (b) Diagram depicting the severity loss process, where a pretrained CNN is employed for retinal classification.
Figure 4. Images of each configuration where the real image does not have lesions. The image with the PathoGAN label is the implementation of [13]. The others are using WGAN-GP with different resizing algorithms, except cGAN. Underlined is the best FID result.
Figure 5. Comparison of generated images across different configurations, where the real image does not contain lesions. The images generated by WGAN-GP and PathoGAN exhibit smoothing effects, while the cGAN successfully transfers the noise from the original image. Underlined is the best FID result.
Figure 6. Comparison with generated and real image samples. The images generated using the proposed method exhibit colors and textures that are more similar to the real image. In contrast, the images generated by the cGAN and PathoGAN show color variations in areas where the real image does not present them. Underlined is the best FID result.
Figure 7. Comparison with generated and real image samples. The proposed method successfully extracts and preserves the color and texture of the original image, while the cGAN method displays different tones. Underlined is the best FID result.
Figure 8. Comparison with generated and real image samples. The proposed method successfully transfers lesions from the original images.
Figure 9. Sample images with lesions from the Retinal-Lesions database. Underlined is the best FID result.
Figure 10. Sample images with lesions from the FGADR database. Underlined is the best FID result.
Figure 11. Sample images with lesions from the IDRiD database. Underlined is the best FID result.
Figure 12. Sample images with lesions from the Kaggle database. Underlined is the best FID result.

22 pages, 5995 KiB  
Article
Research on 3D Localization of Indoor UAV Based on Wasserstein GAN and Pseudo Fingerprint Map
by Junhua Yang, Jinhang Tian, Yang Qi, Wei Cheng, Yang Liu, Gang Han, Shanzhe Wang, Yapeng Li, Chenghu Cao and Santuan Qin
Drones 2024, 8(12), 740; https://doi.org/10.3390/drones8120740 - 9 Dec 2024
Viewed by 835
Abstract
In addition to outdoor environments, unmanned aerial vehicles (UAVs) have a wide range of applications in indoor environments. The complex and changeable indoor environment and relatively small space make indoor localization of UAVs more difficult and urgent. An innovative 3D localization method for indoor UAVs using a Wasserstein generative adversarial network (WGAN) and a pseudo fingerprint map (PFM) is proposed in this paper, with the primary aim of enhancing localization accuracy and robustness in complex indoor environments. The proposed method integrates four classic matching localization algorithms with WGAN and PFM, demonstrating significant improvements in localization precision. Simulation results show that both the WGAN and PFM algorithms significantly reduce localization errors and enhance environmental adaptability and robustness in both small and large simulated indoor environments, confirming the robustness and efficiency of the proposed method in realistic indoor localization scenarios. In the inertial measurement unit (IMU)-based tracking algorithm, locating the UAV with the WGAN-processed fingerprint database instead of the initial coarse-grained fingerprint database reduces the localization error of the four algorithms by 30.3% on average. After using the PFM algorithm for matching localization, the localization error of the UAV is reduced by 28% on average.
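For context on the matching step, one of the classic fingerprint matching localization algorithms that such a pipeline builds on is weighted k-nearest neighbors (WKNN); the sketch below is a generic version under assumed array shapes, not the paper's implementation.

```python
import numpy as np

def wknn_locate(rss, db_rss, db_pos, k=4):
    """Weighted k-nearest-neighbor fingerprint matching: find the k
    reference cells whose stored RSS vectors best match the measured
    one, then average their 3D positions with inverse-distance weights."""
    d = np.linalg.norm(db_rss - rss, axis=1)   # distances in signal space
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-9)
    return (w[:, None] * db_pos[idx]).sum(axis=0) / w.sum()

# db_rss: (n_cells, n_aps) fingerprint database; db_pos: (n_cells, 3) cell centers.
```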
Figure 1. Block diagram of indoor UAV localization proposed in this paper.
Figure 2. Different amounts of fingerprint data are extracted from the dense fingerprint database. (a) Full initial fingerprint database; (b) 1/200 of the initial fingerprint database; (c) 1/500 of the initial fingerprint database; (d) 1/1000 of the initial fingerprint database.
Figure 3. Schematic diagram of the fingerprint segmentation model for indoor drone localization.
Figure 4. Three-dimensional velocity over the minimum time interval calculated from IMU data.
Figure 5. Algorithm flow chart of the enhanced fingerprint database with WGAN.
Figure 6. Schematic diagram of the difference between WGAN and WGAN-IM.
Figure 7. Schematic diagram of the simulated environment, where four green cubes represent the routers. (a) Three-dimensional view; (b) plane view.
Figure 8. Display of the initial fingerprint database and the upgraded fingerprint database after the WGAN algorithm. (a) Initial fingerprint database R_I, which shows the fingerprint data of 4 APs in 75 RCs; (b) upgraded fingerprint database R_g, which shows the fingerprint data of 4 APs in 600 RCs.
Figure 9. The effect of UAV localization is shown using the coarse-grained initial fingerprint database before WGAN. (a) 3D view display; (b) plane view display.
Figure 10. The effect of UAV localization is shown using the upgraded fingerprint database after WGAN. (a) 3D view display; (b) plane view display.
Figure 11. Comparison of the localization results of the initial fingerprint database and the fingerprint database after WGAN processing; the average value for each database is taken after 1000 tracks are located.
Figure 12. Schematic diagram of UAV localization Scenario 2. This is a large indoor environment with a length, width, and height of 50 m, 30 m, and 6 m, respectively, in which 15 obstacles are randomly arranged. There are 15 APs evenly arranged on the ceiling, represented by small bright green cubes.
Figure 13. Display of the fingerprint database after the WGAN algorithm in Scenario 2, which shows the fingerprint data of 15 APs in 9000 RCs.
Figure 14. UAV real trajectory and results of four localization algorithms using single real fingerprint information.
Figure 15. UAV real trajectory and results of four localization algorithms using PFM.
Figure 16. Comparison results of localization without and with the PFM algorithm when the transmitted signal strength is set to −30 dBm.

26 pages, 3161 KiB  
Review
Survey of Quantum Generative Adversarial Networks (QGAN) to Generate Images
by Mohammadsaleh Pajuhanfard, Rasoul Kiani and Victor S. Sheng
Mathematics 2024, 12(23), 3852; https://doi.org/10.3390/math12233852 - 6 Dec 2024
Viewed by 1341
Abstract
Quantum Generative Adversarial Networks (QGANs) represent a useful development in quantum machine learning, using the particular properties of quantum mechanics to solve the challenges of data analysis and modeling. This paper presents a general analysis of five QGAN architectures, focusing on their evolution, strengths, weaknesses, and limitations on noisy intermediate-scale quantum (NISQ) devices. Foundational methods like the Entangling Quantum GAN (EQ-GAN) and Quantum state fidelity (QuGAN) concentrate on stability, convergence, and robust performance on small-scale datasets such as 2 × 2 grayscale images. Intermediate models such as the Image Quantum GAN (IQGAN) and Experimental Quantum GAN (EXQGAN) introduce new ideas like trainable encoders and patch-based sub-generators that scale to 8 × 8 datasets with increasing noise resilience. The most advanced method, the Parameterized Quantum Wasserstein GAN (PQWGAN), uses a hybrid quantum-classical structure to achieve high-resolution image processing for 28 × 28 grayscale datasets while aiming to maintain parameter efficiency. This study explores, analyzes, and summarizes critical problems of QGANs, including accuracy, convergence, parameter efficiency, image quality, performance metrics, and training stability under noisy conditions. In addition, QGAN development can inform the generation and training of parameters in quantum approximate optimization algorithms. One useful application of QGANs is generating medical images from limited datasets to train disease-recognition models.
Figure 1. Operation of the GANs loss function.
Figure 2. The CC means the data and the algorithms are classic, but the quantum concept, methods, or process has helped improve the classical algorithms. The CQ means the data is classic and the algorithms are quantum. The QC means the data is quantum (such as chemistry data) and the algorithms are classic. The QQ means the data and the algorithms are quantum. https://commons.wikimedia.org/wiki/File:Qml_approaches.tif?page=1 (accessed on 2 November 2024).
Figure 3. The view of QGAN.
Figure 4. The general structure of QGAN.
Figure 5. The structure of Quantum state fidelity.
Figure 6. Scheme of quantum generator in quantum patch GAN.
Figure 7. Scheme of quantum patch GAN.

21 pages, 3915 KiB  
Article
Boosting EEG and ECG Classification with Synthetic Biophysical Data Generated via Generative Adversarial Networks
by Archana Venugopal and Diego Resende Faria
Appl. Sci. 2024, 14(23), 10818; https://doi.org/10.3390/app142310818 - 22 Nov 2024
Viewed by 1014
Abstract
This study presents a novel approach using Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to generate synthetic electroencephalography (EEG) and electrocardiogram (ECG) waveforms. The synthetic EEG data represent concentration and relaxation mental states, while the synthetic ECG data correspond to normal and abnormal states. By addressing the challenges of limited biophysical data, including privacy concerns and restricted volunteer availability, our model generates realistic synthetic waveforms learned from real data. Combining real and synthetic datasets improved classification accuracy from 92% to 98.45%, highlighting the benefits of dataset augmentation for machine learning performance. The WGAN-GP model achieved 96.84% classification accuracy for synthetic EEG data representing relaxation states and optimal accuracy for concentration states when classified using a fusion of convolutional neural networks (CNNs). A 50% combination of synthetic and real EEG data yielded the highest accuracy of 98.48%. For EEG signals, the real dataset consisted of 60-s recordings across four channels (TP9, AF7, AF8, and TP10) from four individuals, providing approximately 15,000 data points per subject per state. For ECG signals, the dataset contained 1200 real samples, each comprising 140 data points, representing normal and abnormal states. WGAN-GP outperformed a basic generative adversarial network (GAN) in generating reliable synthetic data. For ECG data, a support vector machine (SVM) classifier achieved an accuracy of 98% with real data and 95.8% with synthetic data. Synthetic ECG data improved the random forest (RF) classifier’s accuracy from 97% with real data alone to 98.40% when combined with synthetic data. Statistical significance was assessed using the Wilcoxon signed-rank test, demonstrating the robustness of the WGAN-GP model. Techniques such as discrete wavelet transform, downsampling, and upsampling were employed to enhance data quality. This method shows significant potential in addressing biophysical data scarcity and advancing applications in assistive technologies, human-robot interaction, and mental health monitoring, among other medical applications.
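The five-level discrete wavelet transform mentioned above is a one-liner with PyWavelets; the wavelet family ('db4') and the signal length below are assumptions for illustration, since the abstract does not specify them.

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
eeg = rng.standard_normal(1024)  # placeholder for one EEG channel

# Five-level DWT: wavedec returns [cA5, cD5, cD4, cD3, cD2, cD1].
cA5, cD5, cD4, cD3, cD2, cD1 = pywt.wavedec(eeg, wavelet="db4", level=5)

# The transform is invertible, so denoising can be done by thresholding
# detail coefficients and reconstructing with waverec.
restored = pywt.waverec([cA5, cD5, cD4, cD3, cD2, cD1], wavelet="db4")
print(np.allclose(eeg, restored[:len(eeg)]))  # True
```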
Figure 1. Five-level decomposition of EEG waves using discrete wavelet transform (DWT) into the approximation coefficient (cA) and detailed coefficients (cD1–cD5).
Figure 2. WGAN-GP architecture.
Figure 3. Workflow of synthetic EEG wave generation using the WGAN-GP model.
Figure 4. Two-dimensional CNN for EEG classification.
Figure 5. Interface of the synthetic EEG generator, visualization, and CNN classification.
Figure 6. EEG plot of the TP9 channel for Subject A in concentration and relaxation states using WGAN-GP.
Figure 7. PSD plot of the TP9 channel for Subject A in EEG concentration and relaxation states.
Figure 8. Real and synthetic normal ECG samples.
Figure 9. Real and synthetic abnormal ECG samples.
Figure 10. Bar chart of model accuracies with significance annotations. The label "ns" stands for no statistical significance and the label "*" indicates comparisons with statistical significance.
Figure 11. Heatmap of pairwise statistical significance.

Full article ">
32 pages, 8354 KiB  
Article
Estimation of Fractal Dimension and Detection of Fake Finger-Vein Images for Finger-Vein Recognition
by Seung Gu Kim, Jin Seong Hong, Jung Soo Kim and Kang Ryoung Park
Fractal Fract. 2024, 8(11), 646; https://doi.org/10.3390/fractalfract8110646 - 31 Oct 2024
Cited by 1 | Viewed by 1002
Abstract
With recent advancements in deep learning, spoofing techniques have advanced and generative adversarial networks (GANs) have become an emerging threat to finger-vein recognition systems. Previous research has therefore generated finger-vein images for training spoof detectors; however, these efforts remain limited and cannot yet produce elaborate fake finger-vein images. We therefore develop a new densely updated contrastive learning-based self-attention generative adversarial network (DCS-GAN) to create elaborate fake finger-vein images, enabling the training of corresponding spoof detectors. Additionally, we propose an enhanced convolutional network for a next-dimension (ConvNeXt)-Small model with a large kernel attention module as a new spoof detector capable of distinguishing the generated fake finger-vein images. To improve the spoof detection performance of the proposed method, we introduce fractal dimension estimation to analyze the complexity and irregularity of class activation maps from real and fake finger-vein images, enabling the generation of more realistic and sophisticated fake finger-vein images. Experimental results obtained using two open databases showed that the fake images produced by the DCS-GAN exhibited Fréchet inception distances (FID) of 7.601 and 23.351, with Wasserstein distances (WD) of 18.158 and 10.123, respectively, confirming the possibility of spoof attacks against existing state-of-the-art (SOTA) spoof detection frameworks. Furthermore, experiments conducted with the proposed spoof detector yielded average classification error rates of 0.4% and 0.12% on the two aforementioned open databases, respectively, outperforming existing SOTA methods for spoof detection.
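The Wasserstein distance (WD) reported above is, in its one-dimensional form, available directly in SciPy; the sketch below compares two synthetic stand-in distributions rather than the paper's image feature embeddings.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
real_feats = rng.normal(0.0, 1.0, size=5000)  # stand-in for real-image features
fake_feats = rng.normal(0.3, 1.1, size=5000)  # stand-in for generated-image features

# 1-D earth mover's distance between the two empirical distributions;
# smaller values indicate the generator matches the real distribution better.
print(wasserstein_distance(real_feats, fake_feats))
```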
Figure 1. Overall flowchart of the proposed method.
Figure 2. Architecture of DCS-GAN.
Figure 3. Samples for the selection of input and target image for training the generator and discriminator of DCS-GAN. * denotes one image randomly chosen in the intra-class of the input image, excluding the input image.
Figure 4. Architecture of enhanced ConvNeXt-Small.
Figure 5. Sample images of real finger-veins in the databases. (a) Examples from the ISPR database and (b) examples from the Idiap database.
Figure 6. Examples of data augmentation on the Idiap database. (a) Original image, (b) image shifted upward, (c) image shifted downward, (d) image shifted to the left, (e) image shifted to the right.
Figure 7. Graphs for the training and validation loss of DCS-GAN. (a) Training loss graph of the generator and the discriminator. (b) Validation loss graph of the generator and the discriminator.
Figure 8. Training and validation accuracy (Acc) and loss (Loss) graphs of the enhanced ConvNeXt-Small. (a) Training accuracy and loss graphs. (b) Validation accuracy and loss graphs.
Figure 9. Sample images of fake finger-vein images generated by DCS-GAN and other SOTA methods. Examples of (a) original image and images generated by (b) Pix2Pix, (c) Pix2PixHD, (d) CycleGAN, (e) CUT, and (f) DCS-GAN.
Figure 10. FD estimation analysis for comparison between real and fake vein images: the first to the fourth images, from the left, in (a–h) mean finger vein image, CAM, BCAM, and FD graph, respectively. (a,c,e,g) show the real finger-vein images whereas (b,d,f,h) present the corresponding fake finger-vein images.
Figure 11. ROC curves of TPR according to FPR by the proposed and the SOTA methods on (a) the ISPR database and (b) the Idiap database.
Figure 12. Jetson TX2 board.
Figure 13. Examples of correct spoof detection by the proposed method. (a) and (c) are examples of real images from the ISPR and Idiap databases, respectively, and (b) and (d) are corresponding examples of fake images.
Figure 14. Examples of incorrect spoof detection by the proposed method. (a) and (c) are examples of real images from the ISPR and Idiap databases, respectively, and (b) and (d) are corresponding examples of fake images. In the proposed method, (b) and (d) are incorrectly identified as real images.
Figure 15. Grad-CAM images. (a) shows Grad-CAM images for real images, while (b) shows Grad-CAM images for fake images generated from the real images in (a). In both (a,b), the first row is from the ISPR database, and the second row is from the Idiap database. Each row starts with the input image on the far left, followed by Grad-CAM images acquired from the first ConvNeXt block, second ConvNeXt block, third ConvNeXt block, fourth ConvNeXt block, and LKA attention of Table 4, respectively.

12 pages, 1581 KiB  
Article
Airfoil Shape Generation and Feature Extraction Using the Conditional VAE-WGAN-gp
by Kazuo Yonekura, Yuki Tomori and Katsuyuki Suzuki
AI 2024, 5(4), 2092-2103; https://doi.org/10.3390/ai5040102 - 28 Oct 2024
Cited by 1 | Viewed by 1182
Abstract
A machine learning method was applied to solve an inverse airfoil design problem. A conditional VAE-WGAN-gp model, which couples a conditional variational autoencoder (VAE) with a Wasserstein generative adversarial network with gradient penalty (WGAN-gp), is proposed as an airfoil generation method and compared with the WGAN-gp and VAE models. The VAEGAN model couples the VAE and GAN models, which enables feature extraction in the GAN framework. In airfoil generation tasks that must produce shapes satisfying lift coefficient requirements, VAE is known to outperform WGAN-gp with respect to the accuracy of reproducing the lift coefficient, whereas GAN outperforms VAE with respect to the smoothness and variation of the generated shapes. In this study, VAE-WGAN-gp demonstrated good performance in all three aspects. The latent distribution was also studied to compare the feature extraction ability of the proposed method.
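Since the proposed model couples a conditional VAE with WGAN-gp, its encoder must sample latent vectors differentiably; the standard VAE reparameterization trick at the heart of any such coupling is sketched below (the encoder and decoder calls in the comments are hypothetical).

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Standard VAE reparameterization: z = mu + sigma * eps keeps the
    latent sampling step differentiable w.r.t. the encoder outputs."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

# mu, logvar = encoder(airfoil_coords, lift_condition)   # hypothetical encoder
# z = reparameterize(mu, logvar)
# shape = decoder(z, lift_condition)                     # hypothetical decoder
```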
Figure 1. Conditional GAN.
Figure 2. Conditional VAE.
Figure 3. Conditional VAEGAN.
Figure 4. Network architectures of the encoder, decoder, and discriminator.
Figure 5. Shape discretization.
Figure 6. Histogram of C_L.
Figure 7. Learning curve.
Figure 8. Generated shapes. Numbers on top of each shape represent the re-calculated C_L. Red figures indicate that the C_L calculation did not converge.
Figure 9. Generated shapes for C_L = 0.5. Different colors represent different shapes.
Figure 10. Latent distribution.

17 pages, 18662 KiB  
Article
Symmetric Connected U-Net with Multi-Head Self Attention (MHSA) and WGAN for Image Inpainting
by Yanyang Hou, Xiaopeng Ma, Junjun Zhang and Chenxian Guo
Symmetry 2024, 16(11), 1423; https://doi.org/10.3390/sym16111423 - 25 Oct 2024
Cited by 1 | Viewed by 1159
Abstract
This study presents a new image inpainting model based on U-Net and incorporating the Wasserstein Generative Adversarial Network (WGAN). The model uses skip connections to connect every encoder block to the corresponding decoder block, resulting in a strictly symmetrical architecture referred to as Symmetric Connected U-Net (SC-Unet). By combining SC-Unet with a GAN, the study aims to reconstruct images more effectively and seamlessly. Traditional discriminators only classify the entire image as real or fake. In this study, the discriminator calculated the probability of each pixel belonging to the hole and non-hole regions, which provided the generator with more gradient loss information for image inpainting. Additionally, every block of SC-Unet incorporated a Dilated Convolutional Neural Network (DCNN) to increase the receptive field of the convolutional layers. Our model also integrated Multi-Head Self-Attention (MHSA) into selected blocks to enable it to efficiently search the entire image for suitable content to fill the missing areas. This study adopts the publicly available CelebA-HQ and ImageNet datasets for evaluation. Our proposed algorithm demonstrates a 10% improvement in PSNR and a 2.94% improvement in SSIM compared to existing representative image inpainting methods in our experiments.
(This article belongs to the Section Computer)
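The PSNR and SSIM gains quoted above are standard full-reference metrics; a minimal evaluation sketch with scikit-image (assuming version 0.19 or later and float images in [0, 1]) is:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
gt = rng.random((256, 256, 3))                                   # stand-in ground truth
pred = np.clip(gt + 0.02 * rng.standard_normal(gt.shape), 0, 1)  # stand-in inpainting result

print("PSNR:", peak_signal_noise_ratio(gt, pred, data_range=1.0))
print("SSIM:", structural_similarity(gt, pred, channel_axis=-1, data_range=1.0))
```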
Figure 1. SC-Unet architecture, with a 3 × 256 × 256 image as input.
Figure 2. (left) The original convolutional block of U-Net; (right) the Dilated Convolutional Neural Network in the convolutional block of SC-Unet.
Figure 3. Multi-Head Self-Attention in the convolutional block of SC-Unet.
Figure 4. The main framework of our image inpainting model.
Figure 5. Sample of results with the CelebA-HQ dataset. (a) Ground truth image. (b) Input image. (c) Reconstructed image. (d) Mask image. (e) Predicted image.
Figure 6. Sample of results with the ImageNet dataset. (a) Ground truth image. (b) Input image. (c) Reconstructed image. (d) Mask image. (e) Predicted image.
Figure 7. Comparison with GLCIC and CA models on ImageNet.
Figure 8. Comparison with GLCIC and CA models on CelebA-HQ.
Figure 9. Ablation study with different model modules on ImageNet.
Figure 10. Ablation study with different model modules on CelebA-HQ.
Figure A1. Ablation study with different model modules on ImageNet.
Figure A2. Ablation study with different model modules on ImageNet.
Figure A3. Ablation study with different model modules on CelebA-HQ.
Figure A4. Ablation study with different model modules on CelebA-HQ.

13 pages, 7413 KiB  
Article
A Study on Enhancing the Visual Fidelity of Aviation Simulators Using WGAN-GP for Remote Sensing Image Color Correction
by Chanho Lee, Hyukjin Kwon, Hanseon Choi, Jonggeun Choi, Ilkyun Lee, Byungkyoo Kim, Jisoo Jang and Dongkyoo Shin
Appl. Sci. 2024, 14(20), 9227; https://doi.org/10.3390/app14209227 - 11 Oct 2024
Viewed by 959
Abstract
When implementing outside-the-window (OTW) visuals in aviation tactical simulators, maintaining terrain image color consistency is critical for enhancing pilot immersion and focus. However, due to various environmental factors, inconsistent terrain image colors can cause visual confusion and diminish realism. To address these issues, a color correction technique based on a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) is proposed. The proposed WGAN-GP model uses multi-scale feature extraction and the Wasserstein distance to effectively measure and adjust the color distribution difference between the input image and the reference image, preserving the texture and structural characteristics of the image while maintaining color consistency. In particular, by converting Bands 2, 3, and 4 of the BigEarthNet-S2 dataset into RGB images to serve as references and preprocessing them to serve as inputs, it is demonstrated that the proposed model can handle large-scale remote sensing images containing various lighting conditions and color differences. The experimental results showed that the proposed WGAN-GP model outperformed traditional methods, such as histogram matching and color transfer, and was effective in transferring the style of the reference image to the target image while maintaining the target's structural elements during training. Quantitative analysis demonstrated that the mid-stage model achieved a PSNR of 28.93 dB and an SSIM of 0.7116, significantly outperforming traditional methods. Furthermore, the LPIPS score was reduced to 0.3978, indicating improved perceptual similarity. This approach can contribute to improving the visual elements of the simulator to enhance pilot immersion and has the potential to significantly reduce time and costs compared with the manual methods currently used by the Republic of Korea Air Force.
(This article belongs to the Special Issue Applications of Machine Learning Algorithms in Remote Sensing)
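One of the traditional baselines the paper compares against, histogram matching, is available directly in scikit-image (assuming version 0.19 or later); the arrays below are random placeholders for a terrain tile and a BigEarthNet-S2 RGB reference.

```python
import numpy as np
from skimage.exposure import match_histograms

rng = np.random.default_rng(0)
target = rng.random((128, 128, 3))     # tile whose colors need correcting
reference = rng.random((128, 128, 3))  # reference RGB tile

# Classic color-correction baseline: match each channel's histogram
# of the target image to the reference image.
corrected = match_histograms(target, reference, channel_axis=-1)
```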
Figure 1. An overview of the architecture of the WGAN-GP model.
Figure 2. Architecture of the Generator and Critic in the WGAN-GP model. (a) Architecture of the Generator; (b) architecture of the Critic.
Figure 3. BigEarthNet-S2 RGB images (reference images).
Figure 4. BigEarthNet-S2 preprocessing images (target images).
Figure 5. Color matching results of BigEarthNet-S2 RGB images and BigEarthNet-S2 preprocessing images. (a) Generated image by model at the early stage of training, (b) generated image by model at the mid-stage of training, and (c) generated image by fully trained model.
Figure 6. Precise texture reproduction results. (a) Generated image by model at the early stage of training, (b) generated image by model at the mid-stage of training, and (c) generated image by fully trained model.
Figure 7. Comparison with other methods' results. (a) Image processed using histogram matching, showing limitations in maintaining color consistency and significant information loss in texture and detail; (b) image processed using the color transfer technique, which also shows limitations in maintaining the ground truth's color consistency and lacks texture reproduction; (c) image generated by the early stage of the WGAN-GP-based model, where color distribution is irregular and texture representation is still underdeveloped; (d) image generated by the mid-stage model, demonstrating improved color matching and texture reproduction, with textures becoming more similar to the ground truth; and (e) image generated by the fully trained WGAN-GP model, showing a slight decrease in color consistency compared to the mid-stage model but offering superior texture reproduction compared to the other methods.

16 pages, 5561 KiB  
Article
A Hybrid GAN-Inception Deep Learning Approach for Enhanced Coordinate-Based Acoustic Emission Source Localization
by Xuhui Huang, Ming Han and Yiming Deng
Appl. Sci. 2024, 14(19), 8811; https://doi.org/10.3390/app14198811 - 30 Sep 2024
Viewed by 1712
Abstract
In this paper, we propose a novel approach to coordinate-based acoustic emission (AE) source localization to address the challenges of limited and imbalanced datasets from fiber-optic AE sensors used for structural health monitoring (SHM). We have developed a hybrid deep learning model combining four generative adversarial network (GAN) variants for data augmentation with an adapted Inception neural network for regression-based prediction. The experimental setup features a single fiber-optic AE sensor based on a tightly coiled fiber-optic Fabry-Perot interferometer formed by two identical fiber Bragg gratings. AE signals were generated using the Hsu-Nielsen pencil lead break test on a grid-marked thin aluminum plate with 35 distinct locations, simulating real-world structural monitoring conditions in bounded isotropic plate-like structures. It is demonstrated that the single-sensor configuration can achieve precise localization, avoiding the need for a multi-sensor array. The GAN-based signal augmentation expanded the dataset from 900 to 4500 samples, with the Wasserstein distance between the original and synthetic datasets decreasing by 83% after 2000 training epochs, demonstrating the high fidelity of the synthetic data. Among the GAN variants, the standard GAN architecture proved the most effective, outperforming the other variants in this specific application. The hybrid model exhibits superior performance compared with non-augmented deep learning approaches, with median error distribution comparisons revealing a significant 50% reduction in prediction errors, accompanied by substantially improved consistency across various AE source locations. Overall, this hybrid approach offers a promising solution for enhancing AE-based SHM in complex infrastructures, improving damage detection accuracy and reliability for more efficient predictive maintenance strategies.
(This article belongs to the Special Issue Advanced Optical-Fiber-Related Technologies)
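The t-SNE comparison of original and synthetic datasets used in the paper's evaluation can be reproduced in outline with scikit-learn; the feature matrices here are random stand-ins for the AE signal features.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
real = rng.standard_normal((300, 64))                     # stand-in AE signal features
synthetic = real + 0.1 * rng.standard_normal(real.shape)  # stand-in generated features

emb = TSNE(n_components=2, perplexity=30,
           random_state=0).fit_transform(np.vstack([real, synthetic]))
# The first 300 rows embed the real samples, the rest the synthetic ones;
# overlapping clusters indicate high-fidelity generated data.
```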
Figure 1. Schematic of the fiber-optic coil-based acoustic emission sensing system. Inset: close-up image of the sensor, showing the flexible mounting and dimensions (8 mm outer, 6 mm inner diameter).
Figure 2. (a) Aluminum plate with the grid and fiber-optic sensor for AE testing; (b) schematic representation of the aluminum plate detailing the grid layout and test points.
Figure 3. Time series augmentation showing the original data (orange) and generated data (green) to ensure each label has a balanced and sufficient number of samples for improved deep learning model performance.
Figure 4. (a) Workflow of the hybrid network for AE source localization; (b) architecture of the Inception network for regression.
Figure 5. Architecture of the generator and discriminator networks in the GAN for AE signal augmentation.
Figure 6. The t-SNE visualization of synthetic and original datasets. (a) The training epoch of 1 for GAN; (b) the training epoch of 2000 for GAN; (c) the training epoch of 2000 for WGAN; (d) the training epoch of 2000 for DCGAN; (e) the training epoch of 2000 for TSAGAN; (f) augmentation via addition of noise.
Figure 7. The comparison of Wasserstein distance convergence across epochs for the four GAN variants (GAN, TSAGAN, WGAN, and DCGAN).
Figure 8. The comparison of acoustic emission (AE) source localization performance. (a) Results from the hybrid deep learning model with GAN-based data augmentation and Inception network. (b) Results from the Inception network alone without GAN-based augmentation. Square markers represent actual source locations, star markers show predicted locations, and the large circular marker indicates the sensor position. The x and y axes represent dimensions in inches.
Figure 9. The comparison of errors for the different methods.

12 pages, 2079 KiB  
Article
Research on Default Classification of Unbalanced Credit Data Based on PixelCNN-WGAN
by Yutong Sun, Yanting Ji and Xiangxing Tao
Electronics 2024, 13(17), 3419; https://doi.org/10.3390/electronics13173419 - 28 Aug 2024
Viewed by 1082
Abstract
Personal credit assessment plays a crucial role in the financial system; it not only relates to the financial activities of individuals but also affects the overall credit system and economic health of society. However, the problem of data imbalance affecting classification results in the field of personal credit assessment has not been fully solved. To solve this problem, we propose a data-enhanced classification algorithm based on a Pixel Convolutional Neural Network (PixelCNN) and a Wasserstein Generative Adversarial Network (WGAN). First, historical data containing borrowers' borrowing information are transformed into grayscale maps; then, data enhancement of default images is performed using the improved PixelCNN-WGAN model; and finally, the expanded image dataset is fed into CNN, AlexNet, SqueezeNet, and MobileNetV2 for classification. Results on the real-world LendingClub dataset show that the data enhancement algorithm designed in this paper improves the accuracy of the four algorithms by 1.548–3.568% compared with the original dataset, effectively improving the classification of credit data and offering a new approach to classification tasks in personal credit assessment.
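The grayscale-map transformation of borrower records described above can be sketched as follows; the 8 x 8 image size, min-max scaling, and zero-padding are illustrative assumptions, since the abstract does not specify the exact encoding.

```python
import numpy as np

def record_to_grayscale(record, size=8):
    """Min-max scale one borrower record to [0, 255] and reshape it into
    a size x size grayscale image, zero-padding short records."""
    x = np.asarray(record, dtype=np.float64)
    x = (x - x.min()) / (x.max() - x.min() + 1e-9)
    pixels = np.zeros(size * size)
    pixels[:min(len(x), size * size)] = x[:size * size]
    return (pixels.reshape(size, size) * 255).astype(np.uint8)

img = record_to_grayscale([35, 60000, 0.42, 12, 1, 3, 0.18])  # toy feature vector
```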
Figure 1: GAN structure diagram.
Figure 2: PixelCNN-WGAN model structure diagram.
Figure 3: Structure of PixelCNN.
Figure 4: Structure of the discriminator.
Figure 5: Pixel-WGAN default prediction flowchart.
Figure 6: A grayscale map of credit data; there are obvious differences between the images for normal samples and default samples in some areas.
16 pages, 4446 KiB  
Article
Method for Recognition of Communication Interference Signals under Small-Sample Conditions
by Rong Ge, Yusheng Li, Yonggang Zhu, Xiuzai Zhang, Kai Zhang and Minghu Chen
Appl. Sci. 2024, 14(13), 5869; https://doi.org/10.3390/app14135869 - 4 Jul 2024
Viewed by 892
Abstract
To address the difficulty of obtaining large numbers of labeled jamming signals in complex electromagnetic environments, this paper proposes a small-sample communication jamming signal recognition method based on WDCGAN-SA (Wasserstein Deep Convolutional Generative Adversarial Network with Self-Attention) and C-ResNet (Convolutional Block Attention Module–Residual Network). First, building on the DCGAN architecture, we integrate the Wasserstein distance and a gradient penalty mechanism to design the jamming signal generation model WDCGAN for data augmentation. Second, we introduce a self-attention mechanism so that the generative model attends to global correlation features in the time–frequency maps, and we optimize the training strategy to improve the quality of generated samples. Finally, real samples are mixed with generated samples and fed into the classification network, which exploits cross-channel and spatial information to improve the jamming signal recognition rate. Simulation results show that, under small-sample conditions with a Jamming-to-Noise Ratio (JNR) from −10 dB to 10 dB, the proposed algorithm significantly outperforms the GAN, WGAN, and DCGAN baselines in recognizing six types of communication jamming signals. Full article
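The Wasserstein distance plus gradient penalty that WDCGAN borrows is the standard WGAN-GP critic objective: score real and generated batches, then penalize the critic's gradient norm away from 1 on random interpolates between them. A minimal PyTorch sketch (the critic network, image shapes, and penalty weight are assumptions for illustration):

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP penalty: push the critic's gradient norm toward 1 on
    random interpolates between real and generated image batches."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(
        outputs=scores, inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grad_norm = grads.reshape(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()

# One critic step then combines the Wasserstein estimate with the penalty:
# d_loss = critic(fake).mean() - critic(real).mean() + gradient_penalty(critic, real, fake)
```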
Figure 1: The overall architecture of WDCGAN-SA and C-ResNet.
Figure 2: Time–frequency plots of six types of jamming signals.
Figure 3: Self-attention module.
Figure 4: Self-attention-embedded generative model WDCGAN-SA.
Figure 5: C-ResNet network structure.
Figure 6: Comparison of real samples and samples generated by two different network models.
Figure 7: Recognition accuracies of five different classification networks.
Figure 8: Confusion matrices for (a) the CNN classification network and (b) the proposed method.
Figure 9: Recognition accuracy with different quantities of generated samples.
Figure 10: Recognition rates of four different generative networks under various levels of sample-size enhancement: (a) 30 images; (b) 60 images; (c) 90 images.
23 pages, 5171 KiB  
Article
Image Enhancement Based on Dual-Branch Generative Adversarial Network Combining Spatial and Frequency Domain Information for Imbalanced Fault Diagnosis of Rolling Bearing
by Yuguang Huang, Bin Wen, Weiqing Liao, Yahui Shan, Wenlong Fu and Renming Wang
Symmetry 2024, 16(5), 512; https://doi.org/10.3390/sym16050512 - 24 Apr 2024
Cited by 2 | Viewed by 1253
Abstract
Existing 2D image-based imbalanced fault diagnosis methods for rolling bearings generate images with inadequate texture detail and degraded color. To address this, this paper proposes a novel image enhancement model for imbalanced rolling bearing fault diagnosis, based on a dual-branch generative adversarial network (GAN) that combines spatial- and frequency-domain information. First, the original vibration signals are converted into 2D time–frequency (TF) images by a continuous wavelet transform, and a dual-branch GAN with a symmetric structure is constructed: one branch uses an auxiliary classifier GAN (ACGAN) to process the spatial information of the TF images, while the other uses a GAN with a frequency generator and a frequency discriminator to handle the frequency information of the input images after a fast Fourier transform. A shuffle attention (SA) module is then integrated into the proposed model to improve the network's expressive power and reduce the computational burden. Mean squared error (MSE) terms are added to both generators' loss functions to enforce frequency consistency in the generated images, and a Wasserstein distance with gradient penalty is incorporated into both discriminators' losses to prevent vanishing gradients and mode collapse. Under the supervision of the frequency WGAN-GP branch, the ACWGAN-GP can generate high-quality fault samples to balance the dataset. Finally, the balanced dataset is used to train the auxiliary classifier for fault diagnosis. The effectiveness of the proposed method is validated on two rolling bearing datasets: with imbalance ratios of 0.5, 0.2, 0.1, and 0.05, the average classification accuracy reaches 99.35% on the CWRU bearing dataset and 96.62% on the MFS bearing dataset. Full article
(This article belongs to the Section Engineering and Materials)
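The frequency-consistency idea described above (an MSE term in the generator loss, computed after an FFT, alongside the adversarial term) can be sketched for the generator as follows; the weighting factor, tensor shapes, and use of FFT magnitudes are illustrative assumptions rather than the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def generator_loss(critic_scores_fake, fake_imgs, real_imgs, alpha=1.0):
    """Sketch of a WGAN-style generator loss with a frequency-consistency
    term: adversarial loss plus MSE between FFT magnitudes of generated
    and real time-frequency images. `alpha` weights the frequency term
    and is an assumed hyperparameter."""
    adv = -critic_scores_fake.mean()             # standard WGAN generator term
    fake_spec = torch.fft.fft2(fake_imgs).abs()  # frequency content of fakes
    real_spec = torch.fft.fft2(real_imgs).abs()  # frequency content of reals
    freq_mse = F.mse_loss(fake_spec, real_spec)  # frequency-consistency term
    return adv + alpha * freq_mse
```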
Figure 1: The structure of the ACGAN.
Figure 2: Framework of the proposed model.
Figure 3: Structure of the SA module: GAP denotes global average pooling; GN, group norm; F(x) = ωx + b; σ(·), the activation function; ⊗, the element-wise product; and C and S, the concat and channel-shuffle operators, respectively.
Figure 4: General structure of the proposed fault diagnosis method.
Figure 5: Comparison of image generation ability among different models: (a) normal; (b) BF7; (c) ORF14; (d) IRF14.
Figure 6: Confusion matrices of classification results for the CWRU datasets with different imbalance ratios.

To further illustrate the feature learning performance of the proposed diagnostic model on the various unbalanced datasets, we utilized the t-distributed stochastic neighbor embedding (t-SNE) algorithm [41] to visualize the classification results on the test set. As shown in Figure 7, the feature distributions of test samples with different health states in the four datasets exhibit significant differences; although a few samples have conflated feature distributions, their impact on the model's diagnosis results can be largely disregarded. This further validates the confusion matrices' classification results and underscores the proposed model's feature learning and fault diagnosis capability.

Figure 7: t-SNE visualization of classification results on the CWRU datasets with different imbalance ratios.
Figure 8: Comparison of test accuracies among different models across four unbalanced datasets.
Figure 9: Mechanical fault comprehensive simulation platform.
Figure 10: Bearings with different fault types: (a) IRF; (b) ORF; (c) BF; (d) CF.
Figure 11: Confusion matrices of classification results for the MFS datasets with different imbalance ratios.

To evaluate the diagnostic performance of the proposed model more intuitively, the t-SNE dimensionality-reduction results are shown in Figure 12. Each fault category still exhibits a distinct classification boundary, further indicating the model's excellent data generation and fault diagnosis performance, along with its strong generalization ability.

Figure 12: t-SNE visualization of classification results on the MFS datasets with different imbalance ratios.
Figure 13: Comparison of test accuracies among different models across four imbalanced datasets.