Word Embedding Distribution Propagation Graph Network for Few-Shot Learning
Figure 1. A 3-way 1-shot task: samples 1, 2, and 3 form the support set, sample 4 is the query, and the red line segment marks the minimum similarity between 4 and the support set samples. After L layers of cyclic computation over the Point Graph and Word Distribution Graph, the class whose support samples lie closest to 4 in the Word Distribution Graph is output as the prediction.
Figure 2. Few-shot and traditional image classification tasks on the CUB-200-2011 dataset: (a) traditional classification; (b) few-shot classification.
Figure 3. ECAResNet-12 network architecture.
Figure 4. Cycle calculation process of the WPGN.
Figure 5. Details of the W2P and P2W strategies in the WPGN: (a) W2P; (b) P2W.
Figure 6. Four different classes of birds in CUB-200-2011.
Figure 7. Effects of different layer numbers on classification accuracy.
Figure 8. Impact of different layer numbers in the WPGN on classification accuracy: (a) layer 0; (b) layer 1; (c) layer 2; (d) layer 3; (e) layer 5; (f) ground truth.
Figure 9. Experimental results on MiniImageNet and CIFAR-FS: (a) MiniImageNet; (b) CIFAR-FS.
Figure 10. Training loss and test accuracy comparison.
Figure 11. Practical application example of rare bird classification.
Abstract
1. Introduction
2. Related Work
2.1. Graph Neural Network
2.2. Transfer Learning
2.3. Semantic Information
2.4. Attention and Application
3. Method
3.1. Problem Definition
3.2. Feature Extraction with ECA-Net
3.3. Word Embedding Distribution Propagation Graph Network
3.3.1. Point Graph
3.3.2. Word Embedding Distribution Graph
3.3.3. Loop Computation
Updating the Point Graph
Updating the Word Embedding Distribution Graph
FReLU Activation Function
3.3.4. Loss Function
1. Calculate the point graph loss.
2. Calculate the word embedding distribution graph loss.
3. Calculate the model loss.
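Writing L_P for the point graph loss and L_D for the word embedding distribution graph loss, a plausible form of the total model loss is the weighted sum below. The weight λ matches the hyperparameter swept in the experiment tables, but the exact combination is a sketch, not quoted from the paper:

```latex
\mathcal{L} = \mathcal{L}_{P} + \lambda\,\mathcal{L}_{D}
```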
4. Experiment
4.1. Experimental Environment and Datasets
4.2. Experimental Settings
4.3. Evaluation
4.4. Experimental Results
4.5. Ablation Studies
4.6. Practical Application Example
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv 2017, arXiv:1703.03400. [Google Scholar]
- Jamal, M.A.; Qi, G.J. Task agnostic meta-learning for few-shot learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 11719–11727. [Google Scholar]
- Santoro, A.; Bartunov, S.; Botvinick, M.; Wierstra, D.; Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proceedings of the International Conference on Machine Learning (ICML), New York, NY, USA, 19–24 June 2016; pp. 1842–1850. [Google Scholar]
- Bertinetto, L.; Henriques, J.F.; Torr, P.H.; Vedaldi, A. Meta-learning with differentiable closed-form solvers. arXiv 2019, arXiv:1805.08136. [Google Scholar]
- Ohkuma, T.; Nakayama, H. Belonging network. In Proceedings of the IEICE Technical Report (PRMU), Online, 2–4 December 2020. [Google Scholar]
- Higashi, R.; Wada, T. Regularization using knowledge distillation in learning small datasets. IEICE Tech. Rep. 2020, 120, 133–138. [Google Scholar]
- Wang, X.; Yu, F.; Wang, R.; Darrell, T.; Gonzalez, J.E. TAFE-Net: Task-aware feature embeddings for low shot learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 1831–1840. [Google Scholar]
- Koch, G.; Zemel, R.; Salakhutdinov, R. Siamese neural networks for one-shot image recognition. In Proceedings of the ICML Deep Learning Workshop, Lille, France, 6–11 July 2015; Volume 2. [Google Scholar]
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D. Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3630–3638. [Google Scholar]
- Zhang, C.; Cai, Y.; Lin, G.; Shen, C. DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover’s Distance and Structured Classifiers. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 12203–12213. [Google Scholar]
- Li, A.; Huang, W.; Lan, X.; Feng, J.; Li, Z.; Wang, L. Boosting few-shot learning with adaptive margin loss. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 12576–12584. [Google Scholar]
- Bateni, P.; Goyal, R.; Masrani, V.; Wood, F.; Sigal, L. Improved few-shot visual classification. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 14493–14502. [Google Scholar]
- Ma, N.; Zhang, X.; Sun, J. Funnel activation for visual recognition. arXiv 2020, arXiv:2007.11824. [Google Scholar]
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Garcia, V.; Bruna, J. Few-shot learning with graph neural networks. arXiv 2017, arXiv:1711.04043. [Google Scholar]
- Kim, J.; Kim, T.; Kim, S.; Yoo, C.D. Edge-labeling graph neural network for few-shot learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 11–20. [Google Scholar]
- Yang, L.; Li, L.; Zhang, Z.; Zhou, X.; Zhou, E.; Liu, Y. DPGN: Distribution propagation graph network for few-shot learning. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 13390–13399. [Google Scholar]
- Yu, Z.; Raschka, S. Looking back to lower-level information in few-shot learning. Information 2020, 11, 345. [Google Scholar] [CrossRef]
- Gidaris, S.; Komodakis, N. Generating classification weights with gnn denoising autoencoders for few-shot learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 21–30. [Google Scholar]
- Yu, Z.; Chen, L.; Cheng, Z.; Luo, J. Transmatch: A transfer-learning scheme for semi-supervised few-shot learning. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 12856–12864. [Google Scholar]
- Schwartz, E.; Karlinsky, L.; Feris, R.; Giryes, R.; Bronstein, A.M. Baby steps towards few-shot learning with multiple semantics. arXiv 2020, arXiv:1906.01905. [Google Scholar]
- Schonfeld, E.; Ebrahimi, S.; Sinha, S.; Darrell, T.; Akata, Z. Generalized zero and few-shot learning via aligned variational autoencoders. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8247–8255. [Google Scholar]
- Tokmakov, P.; Wang, Y.-X.; Hebert, M. Learning compositional representations for few-shot recognition. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 6372–6381. [Google Scholar]
- Li, A.; Luo, T.; Lu, Z.; Xiang, T.; Wang, L. Large-Scale few-shot learning: Knowledge transfer with class hierarchy. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7212–7220. [Google Scholar]
- Chen, Z.; Fu, Y.; Zhang, Y.; Jiang, Y.-G.; Xue, X.; Sigal, L. Multi-level semantic feature augmentation for one-shot learning. IEEE Trans. Image Process. 2019, 28, 4594–4605. [Google Scholar] [CrossRef] [PubMed]
- Chen, T.; Wu, W.; Gao, Y.; Dong, L.; Luo, X.; Lin, L. Fine-Grained Representation Learning and Recognition by Exploiting Semantic Embedding. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Korea, 22–26 October 2018; pp. 2023–2031. [Google Scholar]
- Jiang, Z.; Kang, B.; Zhou, K.; Feng, J. Few-shot classification via adaptive attention. arXiv 2020, arXiv:2008.02465. [Google Scholar]
- Abayomi-Alli, O.; Damaševičius, R.; Maskeliūnas, R.; Misra, S. Few-shot learning with a novel Voronoi tessellation-based image augmentation method for facial palsy detection. Electronics 2021, 10, 978. [Google Scholar] [CrossRef]
- Moon, J.; Le, N.; Minaya, N.; Choi, S.-I. Multimodal few-shot learning for gait recognition. Appl. Sci. 2020, 10, 7619. [Google Scholar] [CrossRef]
- Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014. [Google Scholar]
- Miller, G.A. WordNet: A lexical database for English. Commun. ACM 1995, 38, 39–41. [Google Scholar] [CrossRef]
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset; California Institute of Technology: Pasadena, CA, USA, 2011. [Google Scholar]
- Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.F.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232. [Google Scholar]
Dataset \ Distance | Manhattan | Euclidean | Mahalanobis
---|---|---|---
CUB-200-2011 | 83.51 | 83.81 | 84.34
MiniImageNet | 69.92 | 70.34 | 70.69
CIFAR-FS | 79.6 | 79.9 | 80.4
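The table above compares distance metrics for the edge similarity, with the Mahalanobis distance performing best. A minimal NumPy sketch of the Mahalanobis distance between two feature vectors (how the covariance is estimated in the paper is not specified here, so the identity-covariance example below is purely illustrative):

```python
import numpy as np

def mahalanobis(u: np.ndarray, v: np.ndarray, cov: np.ndarray) -> float:
    """Mahalanobis distance between feature vectors u and v under covariance cov.
    With cov = I this reduces to the Euclidean distance."""
    diff = u - v
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Example: identity covariance recovers the Euclidean distance.
u = np.array([1.0, 2.0])
v = np.array([4.0, 6.0])
print(mahalanobis(u, v, np.eye(2)))  # 5.0
```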
Function | ReLU | PReLU | Swish | LeakyReLU | FReLU
---|---|---|---|---|---
Accuracy (%) | 82.51 | 82.92 | 83.55 | 83.95 | 84.34
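FReLU, the best performer above, computes y = max(x, T(x)), where T(x) is a learned spatial "funnel" condition (a depthwise convolution in Ma et al., 2020). The sketch below substitutes a fixed 3x3 mean filter for the learned condition, so it illustrates only the shape of the operation, not the trained activation:

```python
import numpy as np

def frelu(x: np.ndarray, kernel: int = 3) -> np.ndarray:
    """Simplified FReLU: y = max(x, T(x)). Here T(x) is a 3x3 mean filter
    standing in for the learned depthwise convolution; x has shape (H, W)."""
    pad = kernel // 2
    xp = np.pad(x, pad, mode="edge")
    h, w = x.shape
    t = np.empty_like(x, dtype=float)
    for i in range(h):
        for j in range(w):
            t[i, j] = xp[i:i + kernel, j:j + kernel].mean()
    return np.maximum(x, t)

x = np.array([[-1.0, 2.0], [3.0, -4.0]])
y = frelu(x)
```

Unlike ReLU, negative inputs are not clamped to zero but compared against their spatial context, which is what lets FReLU keep weak activations that their neighbourhood supports.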
λ | 0.1 | 0.2 | 0.3 | 0.4 | 0.5
---|---|---|---|---|---
Accuracy (%) | 81.86 | 82.32 | 82.59 | 82.85 | 83.15
λ | 0.6 | 0.7 | 0.8 | 0.9 | 1.0
Accuracy (%) | 83.32 | 83.51 | 83.82 | 84.34 | 83.88
GPU | Python | Torch | CUDA | Torchvision | Torchnet |
---|---|---|---|---|---|
1080Ti | 3.5.2 | 1.1 | 10.1 | 0.3.0 | 0.0.4 |
Dataset | Images | Classes | Train/Val/Test | Resolution |
---|---|---|---|---|
MiniImageNet | 60 k | 100 | 64/16/20 [9] | 84 × 84 |
CUB-200 | 11.7 k | 200 | 100/50/50 [33] | 84 × 84 |
CIFAR-FS | 60 k | 100 | 64/16/20 [4] | 32 × 32 |
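The datasets above are consumed as N-way K-shot episodes: each task draws N classes, K labelled support samples per class, and a set of query samples to classify. A minimal episode sampler (the function and variable names are illustrative, not the paper's code):

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, n_query=15):
    """Sample an N-way K-shot episode.

    dataset: dict mapping class label -> list of sample ids.
    Returns (support, query), each a list of (sample_id, label) pairs.
    """
    classes = random.sample(sorted(dataset), n_way)
    support, query = [], []
    for c in classes:
        picks = random.sample(dataset[c], k_shot + n_query)
        support += [(s, c) for s in picks[:k_shot]]
        query += [(s, c) for s in picks[k_shot:]]
    return support, query

# Toy dataset: 10 classes with 20 samples each.
toy = {f"class_{i}": list(range(20)) for i in range(10)}
support, query = sample_episode(toy)
print(len(support), len(query))  # 5 75
```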
Parameter | Value
---|---
Adam learning rate | 10^−3
Learning rate decay factor | 10^−1
Decay iterations | 12,000
Weight decay | 10^−5
Layer number | 5
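Assuming the decay entries describe a step schedule (multiply the learning rate by the decay factor every 12,000 iterations, which is a common reading of such tables but an assumption here), the schedule can be sketched as:

```python
def learning_rate(step: int, base_lr: float = 1e-3,
                  decay: float = 0.1, decay_every: int = 12_000) -> float:
    """Step-decay schedule: start at 10^-3 and multiply by 10^-1
    every 12,000 training iterations."""
    return base_lr * decay ** (step // decay_every)

print(learning_rate(0))       # 0.001
print(learning_rate(12_000))  # decayed once, ~1e-4
```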
Model | Backbone | 1 Shot | 2 Shot | 5 Shot |
---|---|---|---|---|
MAML | ConvNet | 55.92 ± 0.87 | / | 72.09 ± 0.76 |
MatchingNet | ConvNet | 61.16 ± 0.95 | / | 72.86 ± 0.69 |
RelationNet | ConvNet | 62.45 ± 0.89 | / | 76.11 ± 0.66 |
CloserLook | ConvNet | 60.53 ± 0.87 | / | 79.34 ± 0.69 |
DPGN (Batch size: 30) | ConvNet | 75.52 ± 0.59 | 85.65 ± 0.52 | 89.31 ± 0.51 |
DPGN (Batch size: 40) | ConvNet | 76.05 ± 0.51 | / | 89.08 ± 0.38 |
WPGN (Batch size: 30) | ConvNet | 81.25 ± 0.46 | 88.62 ± 0.38 | 92.65 ± 0.42 |
FEAT | ResNet-12 | 68.87 ± 0.22 | / | 82.90 ± 0.15 |
DPGN (Batch size: 30) | ResNet-12 | 75.31 ± 0.31 | 87.72 ± 0.41 | 90.26 ± 0.30 |
DPGN (Batch size: 40) | ResNet-12 | 75.71 ± 0.47 | / | 91.48 ± 0.33 |
WPGN (Batch size: 30) | ResNet-12 | 83.05 ± 0.45 | 91.31 ± 0.34 | 93.91 ± 0.33 |
WPGN (Batch size: 30) | ECAResNet-12 | 84.34 ± 0.66 | 92.28 ± 0.41 | 94.41 ± 0.28
Model | WPGN | DPGN
---|---|---
Steps | 40,000 | 40,000
Time (minutes) | 416 | 724
Dataset | Word Embedding | Mahalanobis Distance | FReLU | ECA Attention | Accuracy (%)
---|---|---|---|---|---
CUB-200-2011 | × | × | × | × | 75.31
CUB-200-2011 | √ | × | × | × | 82.54
CUB-200-2011 | √ | √ | × | × | 82.91
CUB-200-2011 | √ | √ | √ | × | 83.10
CUB-200-2011 | √ | √ | √ | √ | 84.34
CIFAR-FS | × | × | × | × | 76.8
CIFAR-FS | √ | × | × | × | 78.9
CIFAR-FS | √ | √ | × | × | 79.2
CIFAR-FS | √ | √ | √ | × | 79.5
CIFAR-FS | √ | √ | √ | √ | 80.4
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhu, C.; Wang, L.; Han, C. Word Embedding Distribution Propagation Graph Network for Few-Shot Learning. Sensors 2022, 22, 2648. https://doi.org/10.3390/s22072648