Boosting EEG and ECG Classification with Synthetic Biophysical Data Generated via Generative Adversarial Networks
Figure 1. Five-level decomposition of EEG waves using the discrete wavelet transform (DWT) into the approximation coefficient ($cA$) and detail coefficients ($cD_{1-5}$).
Figure 2. WGAN-GP architecture.
Figure 3. Workflow of synthetic EEG wave generation using the WGAN-GP model.
Figure 4. Two-dimensional CNN for EEG classification.
Figure 5. Interface of the synthetic EEG generator, visualization, and CNN classification.
Figure 6. EEG plot of the TP9 channel for Subject A in concentration and relaxation states using WGAN-GP.
Figure 7. PSD plot of the TP9 channel for Subject A in EEG concentration and relaxation states.
Figure 8. Real and synthetic normal ECG samples.
Figure 9. Real and synthetic abnormal ECG samples.
Figure 10. Bar chart of model accuracies with significance annotations. The label "$ns$" stands for no statistical significance and the label "*" marks comparisons with statistical significance.
Figure 11. Heatmap of pairwise statistical significance.
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Datasets
3.2. EEG Data Preprocessing
3.2.1. Discrete Wavelet Transforms
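As a rough illustration of a five-level decomposition into one approximation band ($cA$) and five detail bands ($cD_1$ to $cD_5$), the following minimal sketch uses a hand-written Haar wavelet in plain Python. The Haar wavelet, the function names, and the test signal are assumptions made for the example; the wavelet family and library used in the paper are not stated in this section.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)


def haar_dwt(signal):
    """One Haar DWT level: returns (approximation, detail), each half as long."""
    approx = [(signal[i] + signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SQRT_HALF)
        out.append((a - d) * SQRT_HALF)
    return out


def haar_wavedec(signal, level=5):
    """Multi-level decomposition. details[0] is cD1 (finest), details[-1] is cD5."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    approx, details = list(signal), []
    for _ in range(level):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details


def haar_waverec(approx, details):
    """Rebuild the signal by inverting the levels from coarsest to finest."""
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx


# Hypothetical test signal: 64 samples of a sine wave (not EEG data).
signal = [math.sin(2.0 * math.pi * 3.0 * n / 64.0) for n in range(64)]
cA, cD = haar_wavedec(signal, level=5)
reconstructed = haar_waverec(cA, cD)
```

Each level halves the number of coefficients, so a 64-sample input yields detail bands of 32, 16, 8, 4, and 2 coefficients and a 2-coefficient approximation band. Inverting the levels in reverse order recovers the input up to floating-point error.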
3.2.2. Downsampling
3.3. WGAN-GP Model Development
3.3.1. Algorithm: WGAN-GP with Advanced Architecture
Algorithm 1 WGAN-GP training
Generator
- Input layer: A single-dimension input.
- Hidden layers: Four fully connected layers with increasing units (256, 512, 1024, and 2048), each followed by ReLU activation.
- Output layer: A single-dimension output to match the real data’s dimensionality.
Discriminator
- Input layer: A single-dimension input.
- Hidden layers: Four fully connected layers with decreasing units (512, 256, and 128), each followed by LeakyReLU activation (with a negative slope of 0.2) and dropout (0.3).
- Output layer: A single-dimension output representing the authenticity score.
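To make the layer sizes above concrete, the sketch below lists the fully connected layer shapes and counts the trainable weights and biases each network would have. The helper names are hypothetical. For the discriminator, the text mentions four hidden layers but lists three widths (512, 256, and 128); the sketch assumes the three listed widths.

```python
# Layer widths, input to output. Generator widths follow the text directly.
GENERATOR_LAYERS = [1, 256, 512, 1024, 2048, 1]
# Assumption: only the three hidden widths that the text lists are used.
DISCRIMINATOR_LAYERS = [1, 512, 256, 128, 1]


def dense_shapes(layer_sizes):
    """Return (inputs, outputs) for each fully connected layer."""
    return list(zip(layer_sizes, layer_sizes[1:]))


def dense_param_count(layer_sizes):
    """Weights plus biases summed over all fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in dense_shapes(layer_sizes))


generator_params = dense_param_count(GENERATOR_LAYERS)
discriminator_params = dense_param_count(DISCRIMINATOR_LAYERS)
```

Under these assumptions the generator is roughly an order of magnitude larger than the discriminator, mainly because of its 1024-to-2048 layer. Activations and dropout add no trainable parameters.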
Training Procedure with WGAN-GP
1. Initialize models and optimizers: initialize the generator, G, and the discriminator, D. Both are optimized with the Adam optimizer, configured by a learning rate and betas.
2. Gradient penalty (GP): the GP is calculated to enforce the Lipschitz constraint.
3. Discriminator update: the discriminator is trained on real and fake samples; its loss is calculated, the gradient penalty is added to it, and the total is backpropagated.
4. Generator update: the generator is updated less frequently (every 5 batches) so that the discriminator receives more training between generator updates.
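The discriminator loss and update schedule described above can be sketched with a toy one-dimensional linear critic. This is a minimal illustration, not the authors' implementation: the linear critic, the finite-difference gradient (a stand-in for automatic differentiation), the function names, and the sample batches are all assumptions. The penalty coefficient of 10 is taken from the value listed for the ECG setup and is assumed to apply here too.

```python
import random

LAMBDA_GP = 10.0       # assumed: gradient_penalty_coef listed for the ECG setup
GEN_UPDATE_EVERY = 5   # generator update interval stated in the text


def critic(x, w, b):
    """Toy 1-D linear critic D(x) = w * x + b (stand-in for the real network)."""
    return w * x + b


def input_gradient(x, w, b, eps=1e-6):
    """Central finite-difference estimate of dD/dx at x."""
    return (critic(x + eps, w, b) - critic(x - eps, w, b)) / (2.0 * eps)


def gradient_penalty(real, fake, w, b, rng):
    """Mean of (|dD/dx_hat| - 1)^2 over random interpolates of real and fake."""
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        t = rng.random()  # mixing weight in [0, 1)
        x_hat = t * x_real + (1.0 - t) * x_fake
        total += (abs(input_gradient(x_hat, w, b)) - 1.0) ** 2
    return total / len(real)


def critic_loss(real, fake, w, b, rng):
    """Wasserstein critic loss plus the weighted gradient penalty."""
    mean_real = sum(critic(x, w, b) for x in real) / len(real)
    mean_fake = sum(critic(x, w, b) for x in fake) / len(fake)
    return mean_fake - mean_real + LAMBDA_GP * gradient_penalty(real, fake, w, b, rng)


def is_generator_step(batch_idx):
    """True on every fifth batch (0-based index), when the generator is updated."""
    return (batch_idx + 1) % GEN_UPDATE_EVERY == 0


rng = random.Random(0)
real_batch = [1.0, 2.0]   # hypothetical samples
fake_batch = [0.0, 0.0]   # hypothetical generator outputs
loss = critic_loss(real_batch, fake_batch, w=2.0, b=0.0, rng=rng)
generator_steps = [i for i in range(10) if is_generator_step(i)]
```

With a slope of 2 the critic violates the unit-gradient target, so the penalty term dominates the loss. A slope of 1 would drive the penalty to zero, which is the behaviour the Lipschitz constraint encourages.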
3.4. EEG Synthetic Data: Post-Generation Processing Steps
3.4.1. Upsampling
3.4.2. Inverse Discrete Wavelet Transform (IDWT)
3.5. ECG Data Processing
1. num_epochs = 100
2. batch_size = 100
3. gradient_penalty_coef = 10
4. betas = (0.5, 0.999)
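One way to keep these settings together is a small immutable configuration object, sketched below. The class name and the `steps_per_epoch` helper are hypothetical; only the four values come from the list above.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ECGTrainingConfig:
    """Hyperparameters listed for the ECG WGAN-GP run (class name is hypothetical)."""
    num_epochs: int = 100
    batch_size: int = 100
    gradient_penalty_coef: float = 10.0
    betas: Tuple[float, float] = (0.5, 0.999)  # Adam beta1, beta2

    def steps_per_epoch(self, n_samples: int) -> int:
        """Number of batches needed to cover n_samples once (last batch may be partial)."""
        return math.ceil(n_samples / self.batch_size)


config = ECGTrainingConfig()
```

Freezing the dataclass prevents accidental changes to the settings during a run. For example, `config.steps_per_epoch(250)` returns 3 for a hypothetical dataset of 250 samples.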
3.6. 2D CNN for EEG Classification
Algorithm 2 Mental state classification merging 5 CNNs
3.7. Support Vector Machines and Random Forests for ECG Classification
4. Results
4.1. EEG Results
4.2. ECG Results
4.3. Statistical Significance of the Results
- Real data, GPT-2 + SVM vs. GPT-2 + RF: the difference in accuracies (90.84% vs. 88.14%) was not statistically significant (p > 0.05), suggesting similar performance for these two models on real data.
- Real data, GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GPT-2 + SVM (92% vs. 90.84%, p < 0.05).
- Real data, GPT-2 + RF vs. WGAN-GP + CNN: the WGAN-GP + CNN model significantly surpassed GPT-2 + RF (92% vs. 88.14%, p < 0.05).
- Synthetic data, GPT-2 + SVM vs. GAN + CNN: GAN + CNN significantly outperformed GPT-2 + SVM (85.78% vs. 66.88%, p < 0.05).
- Synthetic data, GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN achieved markedly higher accuracy (98.42% vs. 66.88%, p < 0.01).
- Synthetic data, GAN + CNN vs. WGAN-GP + CNN: WGAN-GP + CNN was significantly better than GAN + CNN (98.42% vs. 85.78%, p < 0.05), reinforcing the robustness of the WGAN-GP model for synthetic data generation.
- Real + synthetic data, GPT-2 + SVM vs. GAN + CNN: the performance difference between GPT-2 + SVM and GAN + CNN was not statistically significant (93.71% vs. 94.11%, p > 0.05).
- Real + synthetic data, GPT-2 + SVM vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GPT-2 + SVM (98.45% vs. 93.71%, p < 0.01).
- Real + synthetic data, GAN + CNN vs. WGAN-GP + CNN: WGAN-GP + CNN significantly outperformed GAN + CNN (98.45% vs. 94.11%, p < 0.01).
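The pairwise comparisons above rely on Wilcoxon p-values. As a minimal sketch of how such a p-value can be obtained, the code below implements an exact two-sided Wilcoxon signed-rank test for paired samples in plain Python. The function name is hypothetical, and the per-fold accuracies are invented for illustration; they are not the paper's data, and the paper's exact test configuration is not stated in this section.

```python
from itertools import product


def wilcoxon_signed_rank(sample_a, sample_b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W, p) with W = min(W+, W-). Zero differences are dropped.
    All 2**n sign patterns are enumerated, so this suits small n only.
    """
    diffs = [round(a - b, 10) for a, b in zip(sample_a, sample_b)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank |d| in ascending order, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1

    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_obs = min(w_plus, total - w_plus)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w_pos = sum(r for r, s in zip(ranks, signs) if s)
        if min(w_pos, total - w_pos) <= w_obs:
            extreme += 1
    return w_obs, extreme / 2 ** n


# Hypothetical per-fold accuracies (%), NOT taken from the paper.
model_1 = [90.0, 91.0, 89.0, 92.0, 90.0, 91.0, 88.0, 93.0, 90.0, 91.0]
model_2 = [90.5, 92.0, 90.5, 94.0, 92.5, 94.0, 91.5, 97.0, 94.5, 96.0]
w_stat, p_value = wilcoxon_signed_rank(model_1, model_2)
label = "S" if p_value < 0.05 else "NS"
```

The 0.05 threshold for the "S"/"NS" label mirrors the convention used in the significance table, where p = 0.056 is marked NS and p = 0.042 is marked S.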
5. Conclusions and Future Work
Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Hoffmann, J.; Mahmood, S.; Fogou, P.S.; George, N.; Raha, S.; Safi, S.; Schmailzl, K.J.; Brandalero, M.; Hubner, M. A Survey on Machine Learning Approaches to ECG Processing. In Proceedings of the Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland, 23–25 September 2020.
- Benhamida, A.; Kozlovszky, M. Human ECG data collection, digitalization, streaming and storing. In Proceedings of the 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI), Herlany, Slovakia, 23–25 January 2020.
- Salehi, P.; Chalechale, A.; Taghizadeh, M. Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments. arXiv 2020, arXiv:2005.13178.
- Abdelfattah, S.M.; Abdelrahman, G.M.; Wang, M. Augmenting the size of EEG datasets using generative Adversarial Networks. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018.
- Aznan, N.K.N.; Atapour-Abarghouei, A.; Bonner, S.; Connolly, J.D.; Al Moubayed, N.; Breckon, T.P. Simulating Brain Signals: Creating Synthetic EEG Data via Neural-Based Generative Models for Improved SSVEP Classification. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019.
- Galván, C.M.; Spies, R.D.; Milone, D.H.; Peterson, V. Neurophysiologically meaningful motor imagery EEG simulation with applications to data augmentation. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 2346–2355.
- Chen, S.-Y.; Chang, C.-M.; Chiang, K.-J.; Wei, C.-S. SSVEP-DAN: Cross-Domain Data Alignment for SSVEP-based Brain-Computer Interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 2024, 32, 2027–2037.
- Chaurasia, A.K.; Fallahi, M.; Strufe, T.; Terhörst, P.; Cabarcos, P.A. NeuroIDBench: An open-source benchmark framework for the standardization of methodology in brainwave-based authentication research. J. Inf. Secur. Appl. 2024, 85, 103832.
- Zhang, S.; Sun, L.; Mao, X.; Hu, C.; Liu, P. Review on EEG-Based Authentication Technology. Comput. Intell. Neurosci. 2021, 2021, 5229576.
- Delaney, A.M.; Brophy, E.; Ward, T.E. Synthesis of Realistic ECG using Generative Adversarial Networks. arXiv 2019, arXiv:1909.09150.
- Adib, E.; Afghah, F.; Prevost, J.J. Synthetic ECG Signal Generation Using Generative Neural Networks. arXiv 2021, arXiv:2112.03268.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661.
- Fahimi, F.; Zhang, Z.; Goh, W.B.; Ang, K.K.; Guan, C. Towards EEG Generation Using GANs for BCI Applications. In Proceedings of the International Conference on Biomedical and Health Informatics (BHI), Chicago, IL, USA, 19–22 May 2019.
- Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784.
- Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. arXiv 2017, arXiv:1704.00028.
- Habashi, A.G.; Azab, A.M.; Eldawlatly, S.; Aly, G.M. Generative adversarial networks in EEG analysis: An overview. J. Neuroeng. Rehabil. 2023, 20, 40.
- Cheng, X.; Huang, K.; Zou, Y.; Ma, S. SleepEGAN: A GAN-enhanced ensemble deep learning model for imbalanced classification of sleep stages. Biomed. Signal Process. Control. 2024, 92, 106020.
- Shin, H.-C.; Tenenholtz, N.A.; Rogers, J.K.; Schwarz, C.G.; Senjem, M.L.; Gunter, J.L.; Andriole, K.; Michalski, M. Medical Image Synthesis for Data Augmentation and Anonymization using Generative Adversarial Networks. arXiv 2018, arXiv:1807.10225.
- Hazra, D.; Byun, Y.-C. SynSigGAN: Generative adversarial networks for synthetic biomedical signal generation. Biology 2020, 9, 441.
- Salazar, A.; Vergara, L.; Safont, G. Generative adversarial networks and Markov random fields for oversampling very small training sets. Expert Syst. Appl. 2021, 163, 113819.
- Zhao, W.; Ye, L.; Cui, Z. EEG Generation Using Generative Adversarial Networks (GANs) [PDF]. Available online: https://warrenzha.github.io/assets/pdf/GAN-EEG-Generation.pdf (accessed on 17 August 2024).
- Kumar, J.S.; Bhuvaneswari, P. Analysis of Electroencephalography (EEG) Signals and Its Categorization–A Study. Procedia Eng. 2012, 38, 2525–2536.
- Schiliro, F.; Moustafa, N.; Beheshti, A. Cognitive Privacy: AI-enabled Privacy using EEG Signals in the Internet of Things. In Proceedings of the 6th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application (DependSys), Nadi, Fiji, 14–16 December 2020.
- Popescu, A.B.; Taca, I.A.; Nita, C.I.; Vizitiu, A.; Demeter, R.; Suciu, C.; Itu, L.M. Privacy Preserving Classification of EEG Data Using Machine Learning and Homomorphic Encryption. Appl. Sci. 2021, 11, 7360.
- Goyal, M.; Mahmoud, Q.H. A Systematic Review of Synthetic Data Generation Techniques Using Generative AI. Electronics 2024, 13, 3509.
- Piacentino, E.; Guarner, A.; Angulo, C. Generating Synthetic ECGs Using GANs for Anonymizing Healthcare Data. Electronics 2021, 10, 389.
- Xu, J.; Wang, R.; Shang, S.; Chen, A.; Winterbottom, L.; Hsu, T.-L.; Chen, W.; Ahmed, K.; La Rotta, P.L.; Zhu, X.; et al. ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke. arXiv 2024, arXiv:2406.12123.
- Bird, J.J.; Pritchard, M.; Fratini, A.; Ekart, A.; Faria, D.R. Synthetic Biological Signals Machine-Generated by GPT-2 Improve the Classification of EEG and EMG Through Data Augmentation. IEEE Robot. Autom. Lett. 2021, 6, 3498–3504.
- Manoharan, G.; Faria, D.R. Enhanced Mental State Classification Using EEG-Based Brain-Computer Interface Through Deep Learning. In Intelligent Systems and Applications. IntelliSys 2024; Lecture Notes in Networks and Systems; Arai, K., Ed.; Springer: Cham, Switzerland, 2024; Volume 1067.
- Venkatesan, C.; Karthigaikumar, P.; Paul, A.; Satheeskumaran, S.; Kumar, R. ECG Signal Preprocessing and SVM Classifier-Based Abnormality Detection in Remote Healthcare Applications. IEEE Access 2018, 6, 9767–9773.
- Zhang, Y.; Wei, S.; Zhang, L.; Liu, C. Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features. J. Med. Biol. Eng. 2018, 39, 381–392.
- Bird, J.T.; Manso, L.; Ribeiro, E.P.; Ekárt, A.; Faria, D.R. A Study on Mental State Classification using EEG-based Brain-Machine Interface. In Proceedings of the International Conference on Intelligent Systems, Madeira, Portugal, 25–27 September 2018.
- ECG Dataset. Available online: https://www.kaggle.com/datasets/devavratatripathy/ecg-dataset (accessed on 5 May 2024).
- Chiu, T.Y.; Leonard, T.; Tsui, K.W. The matrix-logarithmic covariance model. J. Am. Stat. Assoc. 1996, 91, 198–210.
- Amin, H.U.; Malik, A.S.; Ahmad, R.F.; Badruddin, N.; Kamel, N.; Hussain, M.; Chooi, W.-T. Feature extraction and classification for EEG signals using wavelet transform and machine learning techniques. Australas. Phys. Eng. Sci. Med. 2015, 38, 139–149.
- Romdhane, T.F.; Ouni, R. Electrocardiogram analysis using discrete wavelet transform for anomalies detection. Comput. Sci. 2023, 4, 348.
- Aliyu, I.; Lim, C.G. Selection of optimal wavelet features for epileptic EEG Signal Classification with LSTM. Neural Comput. Appl. 2021, 35, 1077–1097.
- Broll, A.; Goldhacker, M.; Hahnel, S.; Rosentritt, M. Generative deep learning approaches for the design of dental restorations: A narrative review. J. Dent. 2024, 145, 104988.
- Khodja, H.A.; Boudjeniba, O. Application of WGAN-GP in recommendation and questioning the relevance of GAN-based approaches. arXiv 2022, arXiv:2204.12527v2.
- Inverse Discrete Wavelet Transform (IDWT)—PyWavelets Documentation. Available online: https://pywavelets.readthedocs.io/en/latest/ref/idwt-inverse-discrete-wavelet-transform.html (accessed on 5 May 2024).
Frequency (Hz) | Wave | Description |
---|---|---|
30–100 | Gamma | Problem-solving, concentration |
13–30 | Beta | Awake state, excitement, thinking |
8–13 | Alpha | Daydreaming, inability to focus, restful |
4–8 | Theta | Drowsiness, reduced consciousness, sleep |
0–4 | Delta | Deep sleep, loss of bodily awareness |
Mental State | Datapoints per Channel | Subject A | Subject B | Subject C | Subject D |
---|---|---|---|---|---|
Concentration | Real | 15192 | 11364 | 15204 | 11364 |
Concentration | Synthetic | 15192 | 11364 | 15204 | 11364 |
Relaxation | Real | 15204 | 15204 | 15204 | 15204 |
Relaxation | Synthetic | 15204 | 15204 | 15204 | 15204 |
Model | Classifier | Real Data | Synthetic Data | Real + Synthetic Data |
---|---|---|---|---|
Bird et al. [30]: GPT-2 | Support vector machine (SVM) | 90.84 | 66.88 | 93.71 |
Bird et al. [30]: GPT-2 | Random forest (RF) | 88.14 | 70.71 | 96.69 |
Our model: GAN | Convolutional neural network (CNN) | 92 | 85.78 | 94.11 |
Our model: WGAN-GP | Convolutional neural network (CNN) | 92 | 98.42 | 98.45 |
Original Data + X% Synthetic Data | Concentration | Relaxation | Average |
---|---|---|---|
25% | 99.27 | 97.28 | 98.28 |
50% | 99.59 | 97.36 | 98.48 |
75% | 99.34 | 97.26 | 98.3 |
100% | 99.51 | 97.39 | 98.45 |
Original Data + X% Synthetic Data | Concentration | Relaxation | Average |
---|---|---|---|
25% | 95.39 | 100 | 97.69 |
50% | 92.44 | 100 | 96.22 |
75% | 90.77 | 100 | 95.39 |
100% | 88.22 | 100 | 94.11 |
Model | Classifier | Real Data | Synthetic Data | Real + Synthetic Data |
---|---|---|---|---|
WGAN-GP | Support vector machine (SVM) | 98 | 95.8 | 97 |
WGAN-GP | Random forest (RF) | 97 | 98.57 | 98.40 |
Comparison | Accuracy of Model 1 (%) | Accuracy of Model 2 (%) | Wilcoxon p-Value | Significance |
---|---|---|---|---|
GPT-2 + SVM (Real) vs. GPT-2 + RF (Real) | 90.84 | 88.14 | 0.056 | NS |
GPT-2 + SVM (Real) vs. WGAN-GP + CNN (Real) | 90.84 | 92.0 | 0.042 | S |
GPT-2 + RF (Real) vs. WGAN-GP + CNN (Real) | 88.14 | 92.0 | 0.035 | S |
GPT-2 + SVM (Synthetic) vs. GAN + CNN (Synthetic) | 66.88 | 85.78 | 0.018 | S |
GPT-2 + SVM (Synthetic) vs. WGAN-GP + CNN (Synthetic) | 66.88 | 98.42 | 0.005 | S |
GAN + CNN (Synthetic) vs. WGAN-GP + CNN (Synthetic) | 85.78 | 98.42 | 0.011 | S |
GPT-2 + SVM (Real+Synthetic) vs. GAN + CNN (Real+Synthetic) | 93.71 | 94.11 | 0.123 | NS |
GPT-2 + SVM (Real+Synthetic) vs. WGAN-GP + CNN (Real+Synthetic) | 93.71 | 98.45 | 0.007 | S |
GAN + CNN (Real+Synthetic) vs. WGAN-GP + CNN (Real+Synthetic) | 94.11 | 98.45 | 0.004 | S |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Venugopal, A.; Resende Faria, D. Boosting EEG and ECG Classification with Synthetic Biophysical Data Generated via Generative Adversarial Networks. Appl. Sci. 2024, 14, 10818. https://doi.org/10.3390/app142310818