Enhancement of Image Classification Using Transfer Learning and GAN-Based Synthetic Data Augmentation
Figure 1. Workflow of the proposed framework. (a) Overview of the original dataset with class labels; (b) synthetic images generated by the modified lightweight-GAN model for data augmentation; (c) traditional data augmentation based on basic image manipulation techniques; (d) a pre-trained ImageNet model fine-tuned on our dataset for plastic bottle classification; (e) evaluation metrics for classification.
Figure 2. Generative adversarial networks architecture.
Figure 3. The architecture of the generator.
Figure 4. The architecture of the discriminator.
Figure 5. InceptionV3 model architecture.
Figure 6. Xception model architecture.
Figure 7. Grid search method for finding the ensemble weights.
Figure 8. Original plastic bottle images and synthetic plastic bottle images generated by the modified lightweight-GAN.
Abstract
1. Introduction
- A new technique that mitigates the imbalanced-data problem through image data augmentation is proposed, based on a GAN framework, named modified lightweight-GAN, that can generate high-quality images from only a few original images.
- We propose a weighted average ensemble transfer learning-based method, IncepX-Ensemble, to classify six types of plastic bottle images.
- We construct a computationally efficient model and demonstrate its robustness using the two strategies above.
2. Related Works
3. Dataset
4. Methodology
4.1. Original Dataset Description
4.2. Synthetic Image Generation Using Modified Lightweight-GAN Model
4.2.1. Generative Adversarial Networks
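For orientation, the standard GAN objective from [21] pits the generator G against the discriminator D in a minimax game. The equation below is the textbook form only; the lightweight-GAN literature cited here [27,28] trains with a hinge-loss variant rather than this exact objective.

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$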
4.2.2. Generator Network
4.2.3. Discriminator Network
4.3. Traditional Data Augmentation Techniques
- Flipping: Two types of flipping are used for image transformation; horizontal flipping is more common than vertical flipping. This augmentation is one of the simplest to employ and has been shown to be effective on various datasets. The horizontal flipping formulas are given in Equations (5) and (6), and the vertical flipping formulas in Equations (7) and (8).
- Rotation: The image is rotated right or left about its center by an angle in the [0–360] degree range. The rotation degree parameter significantly impacts the safety of this augmentation. Pixels outside the rotated area are filled with 0; the rotation formula is given in Equation (9).
- Translation: Shifting the image left, right, up, or down is a valuable adjustment for avoiding positional bias in the data, forcing the neural network to look everywhere in the image to capture the object. Vacated pixels are filled with a constant value in the [0–255] range.
- Noise addition: Noise injection adds a matrix of random values, usually drawn from a Gaussian distribution, to each image. With this stochastic expansion, the network sees a slightly different version of the same image each time it appears; the difference acts as noise on the data sample and encourages the network to learn generalized features rather than overfit the dataset. A code sketch of these four augmentations follows this list.
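Below is a minimal sketch of the four traditional augmentations above in TensorFlow 2.x, the framework listed in the experimental setup (Section 5.1). The input resolution, pad size, and noise scale are illustrative assumptions, not the paper's exact settings.

```python
import tensorflow as tf

def augment(image):
    """Flip, rotate, translate, and add Gaussian noise to one image.

    Assumes `image` is a float32 tensor of shape (256, 256, 3) in [0, 1].
    """
    # Flipping: horizontal and vertical (Equations (5)-(8))
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    # Rotation: 90-degree steps; arbitrary angles (Equation (9)) would
    # require an affine warp, e.g., tfa.image.rotate
    image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    # Translation: zero-pad then randomly crop, shifting content by up to 16 px
    image = tf.image.resize_with_crop_or_pad(image, 256 + 32, 256 + 32)
    image = tf.image.random_crop(image, size=[256, 256, 3])
    # Noise addition: random values drawn from a Gaussian distribution
    image += tf.random.normal(tf.shape(image), mean=0.0, stddev=0.05)
    return tf.clip_by_value(image, 0.0, 1.0)
```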
4.4. Transfer Learning
4.4.1. InceptionV3 and Xception
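A minimal Keras sketch of fine-tuning an ImageNet-pre-trained backbone for the six bottle classes, as in Figure 1d, might look as follows; the input size, layer-freezing policy, and optimizer settings are assumptions rather than the paper's exact configuration.

```python
import tensorflow as tf

def build_classifier(backbone="inception"):
    """Build an InceptionV3 or Xception classifier for six bottle classes."""
    base_cls = (tf.keras.applications.InceptionV3 if backbone == "inception"
                else tf.keras.applications.Xception)
    base = base_cls(weights="imagenet", include_top=False,
                    input_shape=(299, 299, 3), pooling="avg")
    base.trainable = False  # freeze ImageNet features for the first stage
    outputs = tf.keras.layers.Dense(6, activation="softmax")(base.output)
    model = tf.keras.Model(base.input, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```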
4.4.2. Ensemble Learning
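The weighted-average ensemble (IncepX-Ensemble) combines the predicted class probabilities of the two fine-tuned classifiers, with the weights found by grid search (Figure 7). A minimal sketch, assuming probability arrays `p_incep` and `p_xcep` of shape (n_samples, 6), integer validation labels `y_val`, and a step size of 0.05 (the paper's grid resolution is not reproduced here):

```python
import numpy as np

def grid_search_weights(p_incep, p_xcep, y_val, step=0.05):
    """Find the weight w maximizing validation accuracy of the
    weighted average w * p_incep + (1 - w) * p_xcep."""
    best_w, best_acc = 0.0, -1.0
    for w in np.arange(0.0, 1.0 + step, step):
        ensemble = w * p_incep + (1.0 - w) * p_xcep
        acc = float(np.mean(ensemble.argmax(axis=1) == y_val))
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Usage: p_incep, p_xcep come from model.predict(...) on the validation set
# w, acc = grid_search_weights(p_incep, p_xcep, y_val)
```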
4.5. Evaluation Metrics
5. Results
5.1. Experimental Setup
5.2. Performance Metrics of GAN
- The IS is an objective metric for assessing the quality of synthetic images generated by a GAN model. It was proposed by [35] and captures two properties of the generated images: image quality and image diversity.
- The FID measures overall semantic realism by comparing the distance between feature vectors computed for real and generated images. It was proposed by [36] to improve on the Inception Score. Minimal sketches of both computations follow this list.
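The sketches below implement the two metrics above from their standard definitions. Both operate on InceptionV3 outputs: `probs` are softmax class probabilities and `*_feats` are pooling-layer feature vectors; the shapes and epsilon are assumptions, not the paper's exact evaluation code.

```python
import numpy as np
from scipy import linalg

def inception_score(probs, eps=1e-16):
    """IS = exp(mean KL(p(y|x) || p(y))); probs has shape (n, n_classes)."""
    marginal = probs.mean(axis=0, keepdims=True)
    kl = probs * (np.log(probs + eps) - np.log(marginal + eps))
    return float(np.exp(kl.sum(axis=1).mean()))

def frechet_inception_distance(real_feats, fake_feats):
    """FID between Gaussian fits to real and generated feature vectors,
    each of shape (n, feature_dim)."""
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_f, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerics
    return float(np.sum((mu_r - mu_f) ** 2)
                 + np.trace(cov_r + cov_f - 2.0 * covmean))
```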
5.3. Implementation Details
5.4. Classification Performance Details
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
DL | Deep Learning
GAN | Generative Adversarial Networks
CNN | Convolutional Neural Network
TL | Transfer Learning
VAE | Variational Autoencoder
PET | Polyethylene Terephthalate
IS | Inception Score
FID | Fréchet Inception Distance
DCGAN | Deep Convolutional GAN
LSGAN | Least Squares GAN
WGAN-GP | Wasserstein GAN-Gradient Penalty
ACGAN | Auxiliary Classifier GAN
CGAN | Conditional GAN
References
- Huth-Fehre, T.; Feldhoff, R.; Kowol, F.; Freitag, H.; Kuttler, S.; Lohwasser, B.; Oleimeulen, M. Remote sensor systems for the automated identification of plastics. J. Near Infrared Spectrosc. 1998, 6, A7–A11.
- Zhang, H.; Wen, Z.G. The consumption and recycling collection system of PET bottles: A case study of Beijing, China. Waste Manag. 2014, 34, 987–998.
- Vo, A.H.; Vo, M.T.; Le, T. A novel framework for trash classification using deep transfer learning. IEEE Access 2019, 7, 178631–178639.
- Hammaad, S. 7.25 Million AED is the Cost of Waste Recycling. Al-Bayan Newspaper, 11 March 2005.
- Ramli, S.; Mustafa, M.M.; Hussain, A.; Wahab, D.A. Histogram of intensity feature extraction for automatic plastic bottle recycling system using machine vision. Am. J. Environ. Sci. 2008, 4, 583.
- Ramli, S.; Mustafa, M.M.; Hussain, A.; Wahab, D.A. Automatic detection of ‘rois’ for plastic bottle classification. In Proceedings of the 2007 5th Student Conference on Research and Development, Selangor, Malaysia, 11–12 December 2007; pp. 1–5.
- Shahbudin, S.; Hussain, A.; Wahab, D.A.; Marzuki, M.; Ramli, S. Support vector machines for automated classification of plastic bottles. In Proceedings of the 6th International Colloquium on Signal Processing and Its Applications (CSPA), Melaka, Malaysia, 21–23 May 2010; pp. 1–5.
- Scavino, E.; Wahab, D.A.; Hussain, A.; Basri, H.; Mustafa, M.M. Application of automated image analysis to the identification and extraction of recyclable plastic bottles. J. Zhejiang Univ.-Sci. A 2009, 10, 794–799.
- Hazra, D.; Byun, Y.C.; Kim, W.J.; Kang, C.U. Synthesis of Microscopic Cell Images Obtained from Bone Marrow Aspirate Smears through Generative Adversarial Networks. Biology 2022, 11, 276.
- Bargshady, G.; Zhou, X.; Barua, P.D.; Gururajan, R.; Li, Y.; Acharya, U.R. Application of CycleGAN and transfer learning techniques for automated detection of COVID-19 using X-ray images. Pattern Recognit. Lett. 2022, 153, 67–74.
- Tachwali, Y.; Al-Assaf, Y.; Al-Ali, A. Automatic multistage classification system for plastic bottles recycling. Resour. Conserv. Recycl. 2007, 52, 266–285.
- Wang, Z.; Peng, B.; Huang, Y.; Sun, G. Classification for plastic bottles recycling based on image recognition. Waste Manag. 2019, 88, 170–181.
- Zulkifley, M.A.; Mustafa, M.M.; Hussain, A. Probabilistic white strip approach to plastic bottle sorting system. In Proceedings of the 2013 IEEE International Conference on Image Processing, Melbourne, Australia, 15–18 September 2013; pp. 3162–3166.
- Srivastav, D.; Bajpai, A.; Srivastava, P. Improved classification for pneumonia detection using transfer learning with GAN based synthetic image augmentation. In Proceedings of the 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 28–29 January 2021; pp. 433–437.
- Alsabei, A.; Alsayed, A.; Alzahrani, M.; Al-Shareef, S. Waste Classification by Fine-Tuning Pre-trained CNN and GAN. Int. J. Comput. Sci. Netw. Secur. 2021, 21, 65–70.
- Bircanoğlu, C.; Atay, M.; Beşer, F.; Genç, Ö.; Kızrak, M.A. RecycleNet: Intelligent waste sorting using deep neural networks. In Proceedings of the 2018 Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, Greece, 3–5 July 2018; pp. 1–7.
- Pio, G.; Mignone, P.; Magazzù, G.; Zampieri, G.; Ceci, M.; Angione, C. Integrating genome-scale metabolic modelling and transfer learning for human gene regulatory network reconstruction. Bioinformatics 2022, 38, 487–493.
- Du, X. Complex environment image recognition algorithm based on GANs and transfer learning. Neural Comput. Appl. 2020, 32, 16401–16412.
- Mohammed, A.M.; Onieva, E.; Woźniak, M. Selective ensemble of classifiers trained on selective samples. Neurocomputing 2022, 482, 197–211.
- Yang, M.; Thung, G. Classification of trash for recyclability status. CS229 Proj. Rep. 2016, 2016, 3.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27.
- Munjal, P.; Paul, A.; Krishnan, N.C. Implicit discriminator in variational autoencoder. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
- Hendrycks, D.; Mazeika, M.; Kadavath, S.; Song, D. Using self-supervised learning can improve model robustness and uncertainty. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
- Jing, L.; Tian, Y. Self-supervised visual feature learning with deep neural networks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4037–4058.
- Goyal, P.; Mahajan, D.; Gupta, A.; Misra, I. Scaling and benchmarking self-supervised visual representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27–28 October 2019; pp. 6391–6400.
- Liu, B.; Zhu, Y.; Song, K.; Elgammal, A. Towards faster and stabilized GAN training for high-fidelity few-shot image synthesis. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020.
- Lim, J.H.; Ye, J.C. Geometric GAN. arXiv 2017, arXiv:1705.02894.
- Kim, S.; Lee, S. Spatially Decomposed Hinge Adversarial Loss by Local Gradient Amplifier. In Proceedings of the ICLR 2021 Conference, Vienna, Austria, 4 May 2021.
- Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60.
- Hao, R.; Namdar, K.; Liu, L.; Haider, M.A.; Khalvati, F. A comprehensive study of data augmentation strategies for prostate cancer detection in diffusion-weighted MRI using convolutional neural networks. J. Digit. Imaging 2021, 34, 862–876.
- Kamishima, T.; Hamasaki, M.; Akaho, S. TrBagg: A simple transfer learning method and its application to personalization in collaborative tagging. In Proceedings of the 2009 Ninth IEEE International Conference on Data Mining, Miami Beach, FL, USA, 6–9 December 2009; pp. 219–228.
- ImageNet Dataset. 2016. Available online: https://image-net.org/ (accessed on 12 July 2021).
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved techniques for training GANs. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Xia, X.; Xu, C.; Nan, B. Inception-v3 for flower classification. In Proceedings of the 2017 2nd International Conference on Image, Vision and Computing (ICIVC), Chengdu, China, 2–4 June 2017; pp. 783–787.
- Wu, X.; Liu, R.; Yang, H.; Chen, Z. An Xception based convolutional neural network for scene image classification with transfer learning. In Proceedings of the 2020 2nd International Conference on Information Technology and Computer Application (ITCA), Guangzhou, China, 18–20 December 2020; pp. 262–267.
Sl No. | Class Name | Images per Class
---|---|---
0 | Bottle_ShapeA | 169
1 | Bottle_ShapeB | 238
2 | Bottle_ShapeC | 41
3 | Masinda | 249
4 | Pepsi | 339
5 | Samdasoo | 631
Total | | 1667
Sl No. | Class Name | Images per Class | Training (60%) | Validation (10%) | Testing (30%)
---|---|---|---|---|---
0 | Bottle_ShapeA | 700 | 420 | 70 | 210
1 | Bottle_ShapeB | 700 | 420 | 70 | 210
2 | Bottle_ShapeC | 700 | 420 | 70 | 210
3 | Masinda | 700 | 420 | 70 | 210
4 | Pepsi | 700 | 420 | 70 | 210
5 | Samdasoo | 700 | 420 | 70 | 210
Total | | 4200 | 2520 | 420 | 1260
Sl No. | Model | IS | FID
---|---|---|---
1 | DCGAN | 12.36 | 73.4
2 | LSGAN | 10.06 | 67.6
3 | WGAN-GP | 9.67 | 72.3
4 | TrGAN | 9.82 | 65.4
5 | ACGAN | 9.47 | 76.3
6 | CGAN | 9.89 | 70.0
7 | Modified lightweight-GAN | 9.42 | 64.7
Component | Description
---|---
Operating system | Windows 10 64-bit
Browser | Google Chrome
CPU | Intel(R) Core(TM) i5-8500K CPU @ 3.70 GHz
RAM | 32 GB
Programming language | Python 3.8.5
GPU | NVIDIA GeForce RTX 2070
CUDA | CUDA Toolkit version 11.2
cuDNN | cuDNN version 8.1
TensorFlow | TensorFlow version 2.6.0
IDE | Jupyter Notebook
Machine learning algorithms | Modified lightweight-GAN, Xception, InceptionV3
Model/Classifier | InceptionV3 (Acc / Pre / Rec / F1) | Xception (Acc / Pre / Rec / F1) | IncepX-Ensemble (Acc / Pre / Rec / F1)
---|---|---|---
Original Data | 86.6 / 89.2 / 88.6 / 90.1 | 92.8 / 87.2 / 93.2 / 90.1 | 93.5 / 93.7 / 92.8 / 93.8
DCGAN | 81.2 / 82.4 / 79.6 / 80.4 | 90.8 / 92.1 / 92.6 / 91.5 | 92.4 / 94.7 / 95.2 / 94.6
LSGAN | 83.2 / 81.9 / 85.4 / 83.6 | 85.4 / 86.3 / 90.6 / 86.4 | 84.4 / 85.3 / 84.0 / 85.4
WGAN-GP | 93.1 / 92.6 / 94.2 / 93.9 | 93.6 / 93.2 / 94.2 / 94.4 | 97.2 / 97.4 / 96.4 / 97.6
ACGAN | 89.9 / 89.1 / 90.1 / 90.5 | 91.4 / 91.2 / 92.0 / 91.6 | 95.5 / 95.7 / 94.5 / 96.2
CGAN | 97.1 / 98.3 / 96.5 / 97.9 | 98.4 / 97.2 / 98.3 / 97.9 | 97.1 / 98.6 / 98.7 / 98.7
Modified Lightweight-GAN | 98.8 / 98.2 / 99.0 / 98.6 | 98.9 / 97.4 / 98.7 / 98.5 | 99.0 / 99.1 / 99.3 / 99.2
Traditional Augmentation/Classifier | InceptionV3 (Acc / Pre / Rec / F1) | Xception (Acc / Pre / Rec / F1) | IncepX-Ensemble (Acc / Pre / Rec / F1)
---|---|---|---
Original Data | 86.2 / 75.0 / 86.1 / 86.0 | 86.2 / 75.2 / 89.0 / 86.8 | 88.2 / 87.1 / 94.2 / 89.0
Flipping | 87.1 / 83.2 / 91.0 / 86.0 | 88.0 / 91.1 / 79.8 / 84.5 | 87.1 / 88.1 / 93.0 / 89.1
Rotation | 88.5 / 79.7 / 86.5 / 82.2 | 86.1 / 82.0 / 83.5 / 75.8 | 87.0 / 87.1 / 84.1 / 73.0
Translation | 85.1 / 76.5 / 88.1 / 80.2 | 86.2 / 82.2 / 85.1 / 87.5 | 88.1 / 81.1 / 88.0 / 82.2
Noise Addition | 75.2 / 72.0 / 77.1 / 75.6 | 75.6 / 76.0 / 77.0 / 77.1 | 75.8 / 75.2 / 77.2 / 76.1
Modified Lightweight-GAN | 89.8 / 87.4 / 83.7 / 83.3 | 91.3 / 89.3 / 88.5 / 88.7 | 93.1 / 89.6 / 92.9 / 92.1
Original + Synthetic Images/Classifier | InceptionV3 (Acc / Pre / Rec / F1) | Xception (Acc / Pre / Rec / F1) | IncepX-Ensemble (Acc / Pre / Rec / F1)
---|---|---|---
Original Data | 93.9 / 92.5 / 95.8 / 94.3 | 94.4 / 94.6 / 92.9 / 92.9 | 96.2 / 95.8 / 96.1 / 95.6
Rotation | 95.6 / 94.7 / 97.9 / 95.6 | 95.9 / 91.1 / 94.9 / 96.2 | 96.9 / 95.3 / 95.6 / 97.1
Translation | 94.6 / 94.9 / 93.0 / 95.4 | 94.5 / 93.9 / 92.6 / 94.9 | 95.2 / 93.8 / 93.2 / 95.7
ACGAN | 95.3 / 87.3 / 91.3 / 92.2 | 95.2 / 87.0 / 91.0 / 94.1 | 95.6 / 94.2 / 93.6 / 94.0
WGAN-GP | 95.6 / 95.4 / 96.1 / 96.0 | 96.2 / 95.9 / 89.6 / 95.5 | 96.8 / 95.4 / 96.2 / 96.1
CGAN | 94.6 / 95.0 / 96.1 / 95.3 | 75.6 / 76.0 / 77.0 / 77.1 | 95.8 / 92.5 / 95.4 / 96.0
Modified Lightweight-GAN | 96.2 / 95.2 / 93.7 / 96.3 | 97.6 / 96.3 / 97.5 / 98.2 | 98.9 / 96.6 / 95.9 / 99.1
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).