A Semi-Supervised Stacked Autoencoder Using the Pseudo Label for Classification Tasks
Figure 1. Network structure of the AE.
Figure 2. Network structure of the SAE.
Figure 3. Network framework of the PL-SSAE.
Figure 4. The influence of the regularization parameter and label percentage on the generalization performance.
Figure 5. The influence of the hidden nodes on (a) accuracy and (b) training time.
Figure 6. Comparison of the semi-supervised classification on (a) Convex, (b) USPS, (c) MNIST and (d) Fashion-MNIST datasets.
Abstract
1. Introduction
- A new semi-supervised SAE named the PL-SSAE is proposed. By integrating pseudo-labeling with the SAE, pseudo labels are generated for the unlabeled samples, and the category information they carry is exploited to improve the generalization performance of the PL-SSAE. Experimental results on various benchmark datasets show that the PL-SSAE outperforms the SAE, SSAE, Semi-SAE and Semi-SSAE in semi-supervised classification.
- A pseudo-label regularization term is constructed. This term captures the classification loss on the pseudo-labeled samples; adding it to the loss function balances the losses of the labeled and pseudo-labeled samples and helps prevent over-fitting.
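The loss balance described in the second contribution can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are ours, and `alpha` stands in for the paper's regularization parameter.

```python
import numpy as np

def cross_entropy(probs, onehot):
    """Mean categorical cross-entropy over a batch of predicted
    class probabilities against one-hot targets."""
    return float(-np.mean(np.sum(onehot * np.log(probs + 1e-12), axis=1)))

def pl_ssae_loss(probs_labeled, y_labeled, probs_pseudo, y_pseudo, alpha):
    """Total loss: classification loss on the labeled samples plus the
    pseudo-label regularization term, weighted by alpha."""
    supervised = cross_entropy(probs_labeled, y_labeled)
    pseudo = cross_entropy(probs_pseudo, y_pseudo)
    return supervised + alpha * pseudo
```

With `alpha = 0` the pseudo-labeled samples are ignored entirely; larger values of `alpha` give them more influence on the parameter updates.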
2. Related Works
2.1. Autoencoder
2.2. Stacked Autoencoder
3. Semi-Supervised Stacked Autoencoder Based on the Pseudo Label
3.1. Pseudo Label
3.2. Network Structure
3.3. Training Process
Algorithm 1: Training process of the PL-SSAE.
Input: the labeled samples, the unlabeled samples, the number of hidden nodes, the regularization parameter, the mini-batch size, the number of iterations, the learning rate, and the activation function.
Output: the mapping function.
The unsupervised pre-training
1: for each AE, indexed by l, do
2: if l = 1
3: Let the training samples serve as both the input and the target output of the first AE
4: else
5: Let the output of the (l − 1)-th hidden layer serve as both the input and the target output of the l-th AE
6: Randomly initialize the network parameters of the l-th AE
7: for each iteration do
8: Obtain mini-batch samples from the input samples
9: Compute the hidden output of the AE by Equation (1)
10: Calculate the reconstructed samples by Equation (2)
11: Compute the reconstruction loss of the AE by Equation (3)
12: Update the network parameters based on the stochastic gradient descent algorithm
13: end for
14: Assign the network parameters of the l-th AE to the l-th hidden layer
15: Calculate the output of the l-th hidden layer by Equation (5)
16: end for
The supervised fine-tuning
17: Input the labeled samples into the network
18: for each iteration do
19: Obtain mini-batch samples from the labeled samples
20: Predict the labels of the mini-batch samples by Equation (6)
21: Calculate the classification loss by Equation (4)
22: Update the network parameters based on the stochastic gradient descent algorithm
23: end for
The pseudo-label generation
24: Input the unlabeled samples into the network
25: Compute the class predictions of the unlabeled samples by Equation (6)
26: Generate the pseudo labels of the unlabeled samples by Equation (7)
The semi-supervised fine-tuning
27: Input the labeled samples and the pseudo-labeled samples into the network
28: for each iteration do
29: Obtain mini-batch samples from the input samples
30: Compute the class predictions of the input samples by Equation (6)
31: Calculate the total classification loss by Equations (8)–(10)
32: Update the network parameters based on the stochastic gradient descent algorithm
33: end for
34: return the mapping function
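The pseudo-label generation step (lines 24–26) and the merged sample pool used in the semi-supervised fine-tuning can be sketched as follows. This is a hedged NumPy illustration with our own names, under the assumption that Equation (7) assigns each unlabeled sample the one-hot label of its most probable class:

```python
import numpy as np

def generate_pseudo_labels(class_probs):
    """Hard one-hot pseudo labels from the network's class predictions:
    each unlabeled sample is assigned its most probable class."""
    num_classes = class_probs.shape[1]
    return np.eye(num_classes)[np.argmax(class_probs, axis=1)]

def build_finetune_pool(x_labeled, y_labeled, x_unlabeled, class_probs):
    """Merge labeled and pseudo-labeled samples into one training pool
    for the final semi-supervised fine-tuning phase."""
    y_pseudo = generate_pseudo_labels(class_probs)
    x = np.vstack([x_labeled, x_unlabeled])
    y = np.vstack([y_labeled, y_pseudo])
    return x, y
```

During fine-tuning, mini-batches drawn from this pool would contribute to the labeled or pseudo-label part of the loss according to their origin.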
4. Experiments
4.1. Experimental Settings
4.1.1. Data Description
4.1.2. Implementation Details
4.2. Influence of Different Hyperparameters
4.3. Comparison of Semi-Supervised Classification
4.4. Comparison of Comprehensive Performance
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Datasets | Attributes | Classes | Training Data | Testing Data |
---|---|---|---|---|
Rectangles | 784 | 2 | 1200 | 50,000 |
Convex | 784 | 2 | 8000 | 50,000 |
USPS | 256 | 10 | 7291 | 2007 |
MNIST | 784 | 10 | 60,000 | 10,000 |
Fashion-MNIST | 784 | 10 | 60,000 | 10,000 |
Datasets | Network Structure (same for SAE, SSAE, Semi-SAE, Semi-SSAE and PL-SSAE)
---|---
Rectangles | 784-200-100-2
Convex | 784-200-100-2
USPS | 256-200-100-10
MNIST | 784-400-200-100-10
Fashion-MNIST | 784-200-100-50-10
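The structure strings in the table above are plain hyphen-separated node counts (input, hidden layers, output). As a small, purely illustrative helper that is not part of the paper, they can be parsed and sized like this:

```python
def parse_structure(spec):
    """'784-200-100-2' -> [784, 200, 100, 2]."""
    return [int(part) for part in spec.split("-")]

def count_parameters(sizes):
    """Weights plus biases of the corresponding fully connected stack."""
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))
```

For example, the MNIST structure 784-400-200-100-10 corresponds to a fully connected stack with 415,310 trainable parameters.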
Accuracy (%)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 62.06 ± 0.68 | 62.72 ± 0.72 | 62.85 ± 0.83 | 62.86 ± 0.52 | 64.65 ± 0.40
 | p = 10 | 64.41 ± 0.55 | 64.95 ± 0.72 | 65.24 ± 0.53 | 65.33 ± 0.38 | 66.24 ± 0.27
 | p = 15 | 71.61 ± 1.25 | 71.73 ± 1.17 | 72.12 ± 1.40 | 72.20 ± 1.12 | 73.17 ± 1.05
 | p = 20 | 74.28 ± 1.32 | 74.61 ± 1.05 | 74.95 ± 0.85 | 75.23 ± 1.03 | 76.01 ± 1.13
Convex | p = 5 | 59.50 ± 0.55 | 60.39 ± 0.60 | 60.38 ± 0.63 | 60.64 ± 0.60 | 61.32 ± 0.46
 | p = 10 | 63.82 ± 0.46 | 63.89 ± 0.67 | 63.93 ± 0.89 | 64.10 ± 0.51 | 64.76 ± 0.64
 | p = 15 | 65.67 ± 0.86 | 65.84 ± 0.81 | 65.74 ± 0.61 | 65.92 ± 0.49 | 66.51 ± 0.66
 | p = 20 | 66.01 ± 0.61 | 66.10 ± 0.54 | 66.54 ± 0.29 | 66.79 ± 0.42 | 67.27 ± 0.35
USPS | p = 5 | 88.24 ± 0.24 | 88.84 ± 0.35 | 89.40 ± 0.22 | 89.52 ± 0.10 | 90.25 ± 0.10
 | p = 10 | 90.59 ± 0.39 | 90.52 ± 0.32 | 90.54 ± 0.24 | 90.66 ± 0.23 | 91.62 ± 0.33
 | p = 15 | 91.75 ± 0.22 | 91.79 ± 0.20 | 91.83 ± 0.22 | 91.92 ± 0.06 | 92.32 ± 0.18
 | p = 20 | 92.04 ± 0.19 | 92.07 ± 0.13 | 92.28 ± 0.19 | 92.35 ± 0.11 | 92.88 ± 0.11
MNIST | p = 5 | 92.83 ± 0.20 | 93.41 ± 0.07 | 93.97 ± 0.17 | 94.11 ± 0.09 | 95.45 ± 0.12
 | p = 10 | 94.77 ± 0.11 | 95.10 ± 0.12 | 95.34 ± 0.11 | 95.53 ± 0.10 | 96.49 ± 0.11
 | p = 15 | 95.46 ± 0.07 | 95.89 ± 0.10 | 96.21 ± 0.15 | 96.29 ± 0.06 | 97.08 ± 0.08
 | p = 20 | 95.98 ± 0.13 | 96.23 ± 0.10 | 96.57 ± 0.11 | 96.60 ± 0.11 | 97.23 ± 0.04
Fashion-MNIST | p = 5 | 77.41 ± 1.47 | 78.50 ± 1.26 | 80.01 ± 1.57 | 80.50 ± 1.33 | 82.44 ± 1.52
 | p = 10 | 81.37 ± 1.12 | 81.64 ± 0.97 | 82.03 ± 1.15 | 82.07 ± 0.91 | 83.33 ± 1.07
 | p = 15 | 82.28 ± 0.54 | 82.50 ± 0.57 | 82.69 ± 0.67 | 82.77 ± 0.61 | 83.60 ± 0.60
 | p = 20 | 82.54 ± 0.71 | 82.66 ± 0.68 | 82.78 ± 0.38 | 82.86 ± 0.41 | 83.82 ± 0.64
Precision (%)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 62.95 ± 1.36 | 63.30 ± 1.12 | 63.70 ± 1.09 | 63.89 ± 1.30 | 65.20 ± 0.64
 | p = 10 | 65.11 ± 1.49 | 65.10 ± 1.32 | 65.98 ± 0.95 | 65.67 ± 0.98 | 67.14 ± 0.78
 | p = 15 | 71.61 ± 1.01 | 71.05 ± 0.99 | 72.46 ± 1.29 | 72.31 ± 1.13 | 73.71 ± 1.36
 | p = 20 | 73.34 ± 1.80 | 73.92 ± 1.73 | 74.04 ± 1.15 | 74.10 ± 1.03 | 75.42 ± 1.03
Convex | p = 5 | 58.38 ± 1.05 | 58.96 ± 1.05 | 59.13 ± 1.00 | 59.19 ± 0.97 | 60.44 ± 0.82
 | p = 10 | 61.55 ± 0.77 | 61.58 ± 1.16 | 61.99 ± 1.07 | 62.10 ± 1.07 | 62.74 ± 1.38
 | p = 15 | 64.93 ± 0.95 | 64.94 ± 1.09 | 64.96 ± 0.82 | 65.12 ± 0.74 | 65.61 ± 0.53
 | p = 20 | 65.35 ± 0.98 | 65.95 ± 1.19 | 65.41 ± 1.21 | 66.03 ± 0.93 | 66.76 ± 0.98
USPS | p = 5 | 88.22 ± 0.26 | 88.51 ± 0.28 | 88.75 ± 0.24 | 88.86 ± 0.16 | 89.55 ± 0.06
 | p = 10 | 89.91 ± 0.43 | 89.87 ± 0.36 | 89.97 ± 0.28 | 90.07 ± 0.23 | 91.23 ± 0.39
 | p = 15 | 91.06 ± 0.28 | 91.12 ± 0.36 | 91.19 ± 0.21 | 91.35 ± 0.11 | 91.76 ± 0.26
 | p = 20 | 91.45 ± 0.23 | 91.53 ± 0.25 | 91.77 ± 0.21 | 91.85 ± 0.12 | 92.26 ± 0.18
MNIST | p = 5 | 92.92 ± 0.21 | 93.66 ± 0.07 | 93.93 ± 0.18 | 94.06 ± 0.10 | 95.38 ± 0.13
 | p = 10 | 94.75 ± 0.12 | 95.06 ± 0.11 | 95.31 ± 0.10 | 95.51 ± 0.08 | 96.47 ± 0.12
 | p = 15 | 95.46 ± 0.06 | 95.87 ± 0.10 | 96.19 ± 0.16 | 96.27 ± 0.06 | 97.05 ± 0.08
 | p = 20 | 95.95 ± 0.14 | 96.20 ± 0.09 | 96.55 ± 0.11 | 96.57 ± 0.11 | 97.23 ± 0.05
Fashion-MNIST | p = 5 | 77.35 ± 1.56 | 78.08 ± 1.08 | 79.86 ± 1.36 | 80.35 ± 1.20 | 82.20 ± 1.04
 | p = 10 | 81.43 ± 1.30 | 82.00 ± 1.00 | 82.09 ± 1.44 | 82.37 ± 1.07 | 83.48 ± 1.10
 | p = 15 | 82.38 ± 0.65 | 82.55 ± 0.58 | 82.79 ± 0.56 | 82.94 ± 0.77 | 83.68 ± 0.79
 | p = 20 | 82.52 ± 0.70 | 82.74 ± 0.79 | 82.87 ± 0.58 | 82.91 ± 0.60 | 83.95 ± 0.75
F1-Measure (%)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 60.48 ± 1.08 | 60.50 ± 0.86 | 60.73 ± 0.67 | 60.90 ± 0.72 | 62.85 ± 0.65
 | p = 10 | 63.59 ± 1.07 | 64.37 ± 1.04 | 64.59 ± 1.35 | 64.93 ± 1.28 | 66.19 ± 1.49
 | p = 15 | 71.16 ± 0.86 | 72.26 ± 1.07 | 72.16 ± 1.15 | 72.54 ± 0.98 | 73.62 ± 1.02
 | p = 20 | 74.79 ± 1.29 | 74.97 ± 1.31 | 75.40 ± 0.98 | 75.79 ± 1.38 | 76.56 ± 0.83
Convex | p = 5 | 62.17 ± 1.09 | 62.98 ± 1.48 | 62.99 ± 1.10 | 62.98 ± 1.21 | 63.70 ± 0.98
 | p = 10 | 65.83 ± 1.34 | 66.20 ± 1.22 | 66.06 ± 1.53 | 66.84 ± 1.21 | 67.50 ± 1.04
 | p = 15 | 66.69 ± 1.02 | 68.09 ± 1.34 | 67.84 ± 0.81 | 68.14 ± 0.96 | 68.89 ± 0.92
 | p = 20 | 67.85 ± 0.80 | 68.50 ± 0.89 | 68.94 ± 1.13 | 69.06 ± 0.76 | 69.74 ± 0.82
USPS | p = 5 | 87.22 ± 0.38 | 87.84 ± 0.78 | 88.37 ± 0.27 | 88.58 ± 0.13 | 89.28 ± 0.12
 | p = 10 | 89.77 ± 0.43 | 89.71 ± 0.36 | 89.75 ± 0.27 | 89.88 ± 0.24 | 90.97 ± 0.37
 | p = 15 | 91.00 ± 0.26 | 91.02 ± 0.25 | 91.10 ± 0.24 | 91.16 ± 0.08 | 91.56 ± 0.21
 | p = 20 | 91.30 ± 0.23 | 91.29 ± 0.20 | 91.57 ± 0.21 | 91.64 ± 0.16 | 92.20 ± 0.15
MNIST | p = 5 | 92.87 ± 0.20 | 93.63 ± 0.07 | 93.89 ± 0.18 | 94.04 ± 0.10 | 95.40 ± 0.12
 | p = 10 | 94.71 ± 0.12 | 95.14 ± 0.11 | 95.29 ± 0.11 | 95.48 ± 0.10 | 96.42 ± 0.13
 | p = 15 | 95.40 ± 0.06 | 95.86 ± 0.10 | 96.17 ± 0.16 | 96.26 ± 0.06 | 97.05 ± 0.08
 | p = 20 | 95.93 ± 0.14 | 96.29 ± 0.10 | 96.54 ± 0.11 | 96.56 ± 0.11 | 97.22 ± 0.03
Fashion-MNIST | p = 5 | 76.37 ± 1.69 | 77.45 ± 1.13 | 79.17 ± 1.27 | 79.87 ± 1.44 | 81.44 ± 1.70
 | p = 10 | 81.02 ± 1.08 | 81.56 ± 1.04 | 81.64 ± 1.32 | 81.80 ± 0.63 | 82.22 ± 0.96
 | p = 15 | 81.96 ± 0.55 | 82.19 ± 0.65 | 82.34 ± 0.87 | 82.59 ± 0.91 | 83.01 ± 0.77
 | p = 20 | 82.17 ± 0.88 | 82.29 ± 0.46 | 82.69 ± 0.49 | 82.77 ± 0.36 | 83.76 ± 0.53
G-Mean (%)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 95.97 ± 0.12 | 96.42 ± 0.05 | 96.57 ± 0.10 | 96.66 ± 0.05 | 97.43 ± 0.07
 | p = 10 | 97.03 ± 0.07 | 97.29 ± 0.07 | 97.36 ± 0.06 | 97.47 ± 0.06 | 98.01 ± 0.08
 | p = 15 | 97.43 ± 0.04 | 97.65 ± 0.06 | 97.86 ± 0.09 | 97.91 ± 0.04 | 98.36 ± 0.04
 | p = 20 | 97.73 ± 0.07 | 97.94 ± 0.05 | 98.07 ± 0.07 | 98.08 ± 0.06 | 98.45 ± 0.02
Convex | p = 5 | 86.87 ± 0.70 | 87.52 ± 0.86 | 88.45 ± 0.94 | 88.74 ± 0.90 | 89.94 ± 0.94
 | p = 10 | 89.27 ± 0.67 | 89.43 ± 0.58 | 89.66 ± 0.69 | 89.68 ± 0.47 | 90.28 ± 0.91
 | p = 15 | 89.81 ± 0.32 | 89.96 ± 0.37 | 90.05 ± 0.40 | 90.11 ± 0.36 | 90.62 ± 0.48
 | p = 20 | 89.96 ± 0.43 | 90.36 ± 0.40 | 90.64 ± 0.23 | 90.76 ± 0.25 | 91.15 ± 0.37
USPS | p = 5 | 61.93 ± 0.57 | 62.30 ± 0.96 | 62.25 ± 1.10 | 62.38 ± 0.88 | 64.36 ± 0.23
 | p = 10 | 64.29 ± 0.48 | 64.87 ± 0.72 | 64.94 ± 0.59 | 65.27 ± 0.41 | 65.80 ± 0.38
 | p = 15 | 71.40 ± 1.07 | 71.53 ± 0.88 | 71.74 ± 0.87 | 71.89 ± 0.99 | 72.77 ± 1.21
 | p = 20 | 74.22 ± 1.33 | 74.54 ± 1.03 | 74.90 ± 1.09 | 75.13 ± 1.00 | 76.24 ± 0.97
MNIST | p = 5 | 58.64 ± 1.10 | 59.81 ± 1.01 | 59.73 ± 0.77 | 59.94 ± 0.75 | 60.47 ± 0.49
 | p = 10 | 62.89 ± 0.61 | 62.92 ± 1.17 | 63.09 ± 1.04 | 63.29 ± 0.97 | 64.10 ± 1.04
 | p = 15 | 65.05 ± 1.08 | 65.27 ± 0.86 | 65.23 ± 0.68 | 65.34 ± 1.10 | 66.39 ± 0.87
 | p = 20 | 65.69 ± 0.78 | 65.69 ± 0.83 | 65.98 ± 0.55 | 66.24 ± 0.48 | 66.81 ± 0.23
Fashion-MNIST | p = 5 | 92.64 ± 0.17 | 93.03 ± 0.18 | 93.37 ± 0.17 | 93.58 ± 0.05 | 93.89 ± 0.12
 | p = 10 | 94.27 ± 0.23 | 94.23 ± 0.16 | 94.24 ± 0.14 | 94.31 ± 0.16 | 94.89 ± 0.21
 | p = 15 | 94.99 ± 0.16 | 94.98 ± 0.11 | 95.03 ± 0.16 | 95.04 ± 0.05 | 95.32 ± 0.12
 | p = 20 | 95.11 ± 0.14 | 95.21 ± 0.12 | 95.24 ± 0.12 | 95.29 ± 0.10 | 95.72 ± 0.08
Training Time (s)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 4.307 ± 0.501 | 4.551 ± 1.131 | 5.530 ± 0.001 | 6.199 ± 0.876 | 11.833 ± 0.672
 | p = 10 | 4.697 ± 0.536 | 4.887 ± 0.066 | 5.695 ± 0.459 | 6.608 ± 0.911 | 12.321 ± 0.812
 | p = 15 | 4.989 ± 0.596 | 4.700 ± 0.020 | 5.965 ± 0.694 | 6.693 ± 0.679 | 12.432 ± 0.813
 | p = 20 | 5.136 ± 0.496 | 5.185 ± 0.499 | 6.288 ± 0.640 | 6.758 ± 0.378 | 13.065 ± 1.012
Convex | p = 5 | 5.790 ± 0.091 | 5.779 ± 0.476 | 17.188 ± 0.835 | 19.968 ± 0.747 | 37.163 ± 2.359
 | p = 10 | 7.635 ± 0.481 | 7.785 ± 0.440 | 19.104 ± 0.805 | 21.013 ± 0.719 | 39.100 ± 2.732
 | p = 15 | 9.367 ± 0.454 | 9.802 ± 0.729 | 20.317 ± 0.880 | 22.618 ± 0.983 | 41.352 ± 2.340
 | p = 20 | 11.757 ± 0.476 | 12.191 ± 0.975 | 22.115 ± 0.836 | 24.155 ± 0.910 | 43.395 ± 2.210
USPS | p = 5 | 1.793 ± 0.163 | 1.828 ± 0.157 | 9.151 ± 0.563 | 11.218 ± 0.350 | 20.096 ± 0.991
 | p = 10 | 3.156 ± 0.278 | 3.713 ± 0.709 | 10.136 ± 0.788 | 11.989 ± 1.230 | 20.803 ± 1.714
 | p = 15 | 4.689 ± 0.621 | 4.800 ± 0.930 | 11.104 ± 1.000 | 12.959 ± 0.794 | 21.658 ± 1.142
 | p = 20 | 5.999 ± 0.522 | 6.321 ± 1.148 | 11.951 ± 0.987 | 14.367 ± 0.796 | 22.060 ± 1.392
MNIST | p = 5 | 18.957 ± 0.896 | 17.142 ± 1.472 | 149.063 ± 2.258 | 172.881 ± 1.375 | 367.162 ± 2.683
 | p = 10 | 36.686 ± 0.738 | 35.380 ± 1.557 | 160.620 ± 2.823 | 183.834 ± 1.573 | 384.812 ± 3.075
 | p = 15 | 51.161 ± 1.554 | 54.447 ± 1.151 | 173.278 ± 2.472 | 194.655 ± 2.311 | 398.623 ± 3.822
 | p = 20 | 79.166 ± 1.570 | 82.322 ± 1.539 | 186.880 ± 2.461 | 209.533 ± 2.135 | 434.780 ± 2.810
Fashion-MNIST | p = 5 | 16.120 ± 0.635 | 17.557 ± 0.902 | 149.418 ± 1.191 | 172.441 ± 0.942 | 368.544 ± 2.431
 | p = 10 | 33.148 ± 1.178 | 35.654 ± 1.837 | 160.116 ± 1.692 | 184.956 ± 1.280 | 380.945 ± 2.933
 | p = 15 | 49.904 ± 1.304 | 53.817 ± 1.138 | 171.385 ± 1.382 | 197.478 ± 1.815 | 397.639 ± 3.021
 | p = 20 | 77.729 ± 1.622 | 81.660 ± 1.800 | 189.076 ± 0.940 | 212.044 ± 2.509 | 437.919 ± 2.892
Testing Time (s)

Datasets | Label Percentage | SAE | SSAE | Semi-SAE | Semi-SSAE | PL-SSAE
---|---|---|---|---|---|---
Rectangles | p = 5 | 0.050 ± 0.007 | 0.046 ± 0.001 | 0.046 ± 0.001 | 0.046 ± 0.001 | 0.047 ± 0.001
 | p = 10 | 0.051 ± 0.007 | 0.047 ± 0.002 | 0.047 ± 0.002 | 0.047 ± 0.002 | 0.047 ± 0.001
 | p = 15 | 0.042 ± 0.004 | 0.046 ± 0.001 | 0.047 ± 0.001 | 0.047 ± 0.001 | 0.046 ± 0.001
 | p = 20 | 0.047 ± 0.006 | 0.046 ± 0.001 | 0.047 ± 0.002 | 0.047 ± 0.001 | 0.046 ± 0.002
Convex | p = 5 | 0.039 ± 0.007 | 0.042 ± 0.002 | 0.040 ± 0.001 | 0.040 ± 0.001 | 0.040 ± 0.001
 | p = 10 | 0.038 ± 0.006 | 0.040 ± 0.001 | 0.039 ± 0.001 | 0.039 ± 0.001 | 0.041 ± 0.001
 | p = 15 | 0.047 ± 0.010 | 0.040 ± 0.001 | 0.040 ± 0.002 | 0.040 ± 0.001 | 0.039 ± 0.001
 | p = 20 | 0.042 ± 0.008 | 0.042 ± 0.004 | 0.039 ± 0.001 | 0.040 ± 0.002 | 0.040 ± 0.001
USPS | p = 5 | 0.008 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.001 | 0.004 ± 0.001
 | p = 10 | 0.008 ± 0.002 | 0.005 ± 0.002 | 0.005 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.002
 | p = 15 | 0.006 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.002 | 0.004 ± 0.001 | 0.004 ± 0.001
 | p = 20 | 0.007 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.001 | 0.005 ± 0.002 | 0.005 ± 0.001
MNIST | p = 5 | 0.017 ± 0.002 | 0.013 ± 0.003 | 0.012 ± 0.004 | 0.014 ± 0.001 | 0.012 ± 0.002
 | p = 10 | 0.014 ± 0.002 | 0.014 ± 0.001 | 0.015 ± 0.007 | 0.012 ± 0.004 | 0.012 ± 0.001
 | p = 15 | 0.016 ± 0.002 | 0.013 ± 0.003 | 0.012 ± 0.005 | 0.013 ± 0.003 | 0.013 ± 0.002
 | p = 20 | 0.016 ± 0.001 | 0.014 ± 0.001 | 0.014 ± 0.001 | 0.013 ± 0.003 | 0.015 ± 0.003
Fashion-MNIST | p = 5 | 0.014 ± 0.001 | 0.014 ± 0.002 | 0.013 ± 0.003 | 0.014 ± 0.001 | 0.012 ± 0.001
 | p = 10 | 0.013 ± 0.003 | 0.014 ± 0.001 | 0.012 ± 0.004 | 0.014 ± 0.001 | 0.014 ± 0.002
 | p = 15 | 0.014 ± 0.001 | 0.014 ± 0.001 | 0.014 ± 0.004 | 0.014 ± 0.001 | 0.014 ± 0.001
 | p = 20 | 0.013 ± 0.004 | 0.013 ± 0.003 | 0.012 ± 0.006 | 0.014 ± 0.002 | 0.014 ± 0.003
Lai, J.; Wang, X.; Xiang, Q.; Quan, W.; Song, Y. A Semi-Supervised Stacked Autoencoder Using the Pseudo Label for Classification Tasks. Entropy 2023, 25, 1274. https://doi.org/10.3390/e25091274