Multi-Head Self-Attention-Enhanced Prototype Network with Contrastive–Center Loss for Few-Shot Relation Extraction
Figure 1. A demonstration of the 3-way 1-shot scenario. Underlined words denote entity mentions. The model is trained on the support set instances to predict the relation between the two marked entities in the query instance.
Figure 2. An illustration of the impact of prototype interaction information on query instances. Gray spheres represent prototypes, while spheres of other colors represent the representations of support instances. The green spheres with question marks represent the representations of query instances. (a) Originally, the representation of the query instance closely resembles the blue prototype. (b) After interacting with information from different prototypes, the position of the query instance representation changes, thereby modifying the prototypes.
Figure 3. The architecture of our proposed SACT for the FSRE task. SACT first feeds the relation information, the support set, and the query set into a BERT encoder, obtaining the relation-information representation in the upper part of the sentence-encoder module and the sentence representations in the lower part. The prototype network is then enhanced through a multi-head self-attention mechanism and optimized with the contrastive–center loss. In the diagram, relation information is represented by triangles, the support set by circles, the query set by circles with question marks, and prototype representations by pentagrams.
Figure 4. An example of a sentence representation generated by the sentence encoder, illustrating how an input sentence is transformed into a numerical representation that can be used for further processing (an illustrative encoding sketch follows the figure list).
Figure 5. Schematic diagram of the prototype enhancement process. The initial prototypes of the N categories are fed into a multi-head self-attention mechanism to obtain enhanced prototypes, which are then combined with the initial prototypes to form the final prototypes (a code sketch of this step follows the figure list).
Figure 6. Diagram of the contrastive–center loss. The upper-left corner shows the basic prototype representations. Under the center loss, the distance between positive samples and their class center is significantly reduced; under the contrastive loss, the distance between class centers and negative samples increases somewhat. The contrastive–center loss used by SACT both reduces the distance between positive samples and increases the distance between class centers and negative samples (a code sketch of this loss follows the figure list).
Figure 7. Comparison of three CNN encoder-based models with SACT on the FewRel 1.0 dataset.
Figure 8. Comparison between HCRP and SACT in the 1-shot setting.
Figure 9. Comparison of SACT with other BERT-based models on the FewRel 1.0 dataset.
Figure 10. Comparison of SACT with other CP-based models on the FewRel 1.0 dataset.
Figure 11. Comparison of SACT with other prototype network models on the FewRel 1.0 dataset.
Figure 12. Comparison of SACT with other models on the FewRel 2.0 dataset.
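The captions above summarize the SACT pipeline, but the body text of the sentence-encoder section is not reproduced in this excerpt. As a purely illustrative aid, the following PyTorch sketch shows one common way to build entity-aware sentence representations with BERT by concatenating the hidden states at the head- and tail-entity positions; the class name, marker handling, and concatenation scheme are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import torch
from transformers import BertModel

# Hypothetical sentence encoder: concatenate the BERT hidden states at the
# head- and tail-entity positions (a common choice in FSRE models).
class SentenceEncoder(torch.nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)

    def forward(self, input_ids, attention_mask, head_pos, tail_pos):
        # hidden: [batch, seq_len, 768]
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        batch_idx = torch.arange(hidden.size(0), device=hidden.device)
        head_state = hidden[batch_idx, head_pos]             # [batch, 768]
        tail_state = hidden[batch_idx, tail_pos]             # [batch, 768]
        return torch.cat([head_state, tail_state], dim=-1)   # [batch, 1536]
```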
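Figure 5 describes the prototype enhancement step: the N basic prototypes (class means over the K support representations) attend to one another through multi-head self-attention, and the output is combined with the basic prototypes to form the final prototypes. Below is a minimal sketch of that idea using torch.nn.MultiheadAttention; the residual combination, head count, and dot-product query scoring are assumptions, since the exact equations are not included in this excerpt.

```python
import torch

def basic_prototypes(support):            # support: [batch, N, K, dim]
    """Basic prototype: mean of the K support representations per class."""
    return support.mean(dim=2)            # [batch, N, dim]

class PrototypeEnhancer(torch.nn.Module):
    """Let the N class prototypes exchange information via self-attention."""
    def __init__(self, dim=1536, num_heads=8):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(embed_dim=dim,
                                                num_heads=num_heads,
                                                batch_first=True)

    def forward(self, prototypes):        # prototypes: [batch, N, dim]
        enhanced, _ = self.attn(prototypes, prototypes, prototypes)
        return prototypes + enhanced      # combine with the initial prototypes

def query_logits(queries, prototypes):    # queries: [batch, Q, dim]
    """Score each query against each final prototype."""
    return torch.einsum("bqd,bnd->bqn", queries, prototypes)
```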
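Figure 6 describes the intended effect of the contrastive–center loss: pull each sample toward its own class center while pushing it away from the other centers. One common way to express such a loss is as a ratio of squared distances, sketched below; the exact form, weighting, and the small delta term are assumptions and may differ from the paper's definition.

```python
import torch

def contrastive_center_loss(features, centers, labels, delta=1e-6):
    """
    features: [M, dim] sample (query) representations
    centers:  [N, dim] class centers (e.g., the prototypes)
    labels:   [M]      class index of each sample
    Pulls samples toward their own center and away from the other centers.
    """
    dists = torch.cdist(features, centers, p=2) ** 2        # squared distances, [M, N]
    pos = dists.gather(1, labels.unsqueeze(1)).squeeze(1)   # distance to own center
    neg = dists.sum(dim=1) - pos                            # distance to the other centers
    return 0.5 * (pos / (neg + delta)).mean()
```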
Abstract
1. Introduction
2. Related Work
2.1. Relation Extraction
2.2. Few-Shot Learning
2.3. Few-Shot Relation Extraction
3. Problem Formulation
4. Methodology
4.1. Framework
4.2. Sentence Encoder
4.2.1. Sentence Representations
4.2.2. Relation Representations
4.3. Prototype Enhancement Module
4.3.1. Basic Prototype
4.3.2. Enhanced Prototype
4.3.3. Final Prototype
4.4. Contrastive–Center Loss
5. Experimental Settings
5.1. Dataset
5.2. Baselines
5.3. Implementation Details
5.4. Main Results
5.5. Domain Adaptation Results
5.6. Ablation Study
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lauriola, I.; Lavelli, A.; Aiolli, F. An introduction to deep learning in natural language processing: Models, techniques, and tools. Neurocomputing 2022, 470, 443–456. [Google Scholar] [CrossRef]
- Xiao, G.; Corman, J. Ontology-Mediated SPARQL Query Answering over Knowledge Graphs. Big Data Res. 2021, 23, 100177. [Google Scholar] [CrossRef]
- Garcia, X.; Bansal, Y.; Cherry, C.; Foster, G.; Krikun, M.; Johnson, M.; Firat, O. The Unreasonable Effectiveness of Few-shot Learning for Machine Translation. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; Volume 202, pp. 10867–10878. [Google Scholar]
- Lawrie, D.; Yang, E.; Oard, D.W.; Mayfield, J. Neural Approaches to Multilingual Information Retrieval. In Advances in Information Retrieval; Kamps, J., Goeuriot, L., Crestani, F., Maistro, M., Joho, H., Davis, B., Gurrin, C., Kruschwitz, U., Caputo, A., Eds.; Springer: Cham, Switzerland, 2023; pp. 521–536. [Google Scholar]
- Wang, Y.; Ma, W.; Zhang, M.; Liu, Y.; Ma, S. A Survey on the Fairness of Recommender Systems. ACM Trans. Inf. Syst. 2023, 41, 1–43. [Google Scholar] [CrossRef]
- Mintz, M.; Bills, S.; Snow, R.; Jurafsky, D. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Suntec, Singapore, 2–7 August 2009; pp. 1003–1011. [Google Scholar]
- Ye, Q.; Liu, L.; Zhang, M.; Ren, X. Looking Beyond Label Noise: Shifted Label Distribution Matters in Distantly Supervised Relation Extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3841–3850. [Google Scholar] [CrossRef]
- Zhang, N.; Deng, S.; Sun, Z.; Wang, G.; Chen, X.; Zhang, W.; Chen, H. Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 3016–3025. [Google Scholar] [CrossRef]
- Luo, X.; Zhou, W.; Wang, W.; Zhu, Y.; Deng, J. Attention-Based Relation Extraction With Bidirectional Gated Recurrent Unit and Highway Network in the Analysis of Geological Data. IEEE Access 2018, 6, 5705–5715. [Google Scholar] [CrossRef]
- Li, Y.; Long, G.; Shen, T.; Zhou, T.; Yao, L.; Huo, H.; Jiang, J. Self-attention enhanced selective gate with entity-aware embedding for distantly supervised relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 8269–8276. [Google Scholar]
- Lin, X.; Liu, T.; Jia, W.; Gong, Z. Distantly Supervised Relation Extraction using Multi-Layer Revision Network and Confidence-based Multi-Instance Learning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 165–174. [Google Scholar] [CrossRef]
- Augenstein, I.; Maynard, D.; Ciravegna, F. Relation Extraction from the Web Using Distant Supervision. In Knowledge Engineering and Knowledge Management; Janowicz, K., Schlobach, S., Lambrix, P., Hyvönen, E., Eds.; Springer: Cham, Switzerland, 2014; pp. 26–41. [Google Scholar]
- Sun, Q.; Liu, Y.; Chua, T.S.; Schiele, B. Meta-Transfer Learning for Few-Shot Learning. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 403–412. [Google Scholar] [CrossRef]
- Lee, H.-Y.; Li, S.-W.; Vu, T. Meta Learning for Natural Language Processing: A Survey. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA, 10–15 July 2022; pp. 666–684. [Google Scholar] [CrossRef]
- Mettes, P.; van der Pol, E.; Snoek, C.G.M. Hyperspherical Prototype Networks. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019. [Google Scholar]
- Yang, K.; Zheng, N.; Dai, X.; He, L.; Huang, S.; Chen, J. Enhance prototypical network with text descriptions for few-shot relation classification. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Shanghai, China, 19–23 October 2020; pp. 2273–2276. [Google Scholar]
- Han, J.; Cheng, B.; Lu, W. Exploring Task Difficulty for Few-Shot Relation Extraction. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 2605–2616. [Google Scholar] [CrossRef]
- Liu, Y.; Hu, J.; Wan, X.; Chang, T.H. A Simple yet Effective Relation Information Guided Approach for Few-Shot Relation Extraction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 757–763. [Google Scholar] [CrossRef]
- Liu, Y.; Hu, J.; Wan, X.; Chang, T.H. Learn from Relation Information: Towards Prototype Representation Rectification for Few-Shot Relation Extraction. In Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, WA, USA, 10–15 July 2022; pp. 1822–1831. [Google Scholar] [CrossRef]
- Wen, M.; Xia, T.; Liao, B.; Tian, Y. Few-shot relation classification using clustering-based prototype modification. Knowl.-Based Syst. 2023, 268, 110477. [Google Scholar] [CrossRef]
- Zelenko, D.; Aone, C.; Richardella, A. Kernel Methods for Relation Extraction. J. Mach. Learn. Res. 2003, 3, 1083–1106. [Google Scholar]
- Deng, B.; Fan, X.; Yang, L. Entity relation extraction method using semantic pattern. Jisuanji Gongcheng/ Comput. Eng. 2007, 33, 212–214. [Google Scholar]
- Shlezinger, N.; Whang, J.; Eldar, Y.C.; Dimakis, A.G. Model-Based Deep Learning. Proc. IEEE 2023, 111, 465–499. [Google Scholar] [CrossRef]
- Shen, Y.; Huang, X. Attention-Based Convolutional Neural Network for Semantic Relation Extraction. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 2526–2536. [Google Scholar]
- Wang, L.; Cao, Z.; de Melo, G.; Liu, Z. Relation Classification via Multi-Level Attention CNNs. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, 7–12 August 2016. [Google Scholar] [CrossRef]
- Ebrahimi, J.; Dou, D. Chain based RNN for relation classification. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA, 31 May–5 June 2015; pp. 1244–1249. [Google Scholar]
- Nguyen, T.H.; Grishman, R. Combining Neural Networks and Log-linear Models to Improve Relation Extraction. arXiv 2015, arXiv:1511.05926. [Google Scholar]
- Li, F.; Zhang, M.; Fu, G.; Qian, T.; Ji, D.H. A Bi-LSTM-RNN Model for Relation Classification Using Low-Cost Sequence Features. arXiv 2016, arXiv:1608.07720. [Google Scholar]
- Zhou, P.; Shi, W.; Tian, J.; Qi, Z.; Li, B.; Hao, H.; Xu, B. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, 7–12 August 2016. [Google Scholar] [CrossRef]
- Huang, Y.Y.; Wang, W.Y. Deep Residual Learning for Weakly-Supervised Relation Extraction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 1803–1807. [Google Scholar] [CrossRef]
- Zeng, D.; Dai, Y.; Li, F.; Sherratt, R.S.; Wang, J. Adversarial learning for distant supervised relation extraction. Comput. Mater. Contin. 2018, 55, 121–136. [Google Scholar]
- Qin, P.; Xu, W.; Wang, W.Y. Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 2137–2147. [Google Scholar] [CrossRef]
- Qin, P.; Xu, W.; Wang, W.Y. DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 496–505. [Google Scholar] [CrossRef]
- Santoro, A.; Bartunov, S.; Botvinick, M.; Wierstra, D.; Lillicrap, T. Meta-Learning with Memory-Augmented Neural Networks. In Proceedings of the 33rd International Conference on International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; Volume 48, pp. 1842–1850. [Google Scholar]
- Mishra, N.; Rohaninejad, M.; Chen, X.; Abbeel, P. A Simple Neural Attentive Meta-Learner. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Ren, M.; Liao, R.; Fetaya, E.; Zemel, R. Incremental few-shot learning with attention attractor networks. Adv. Neural Inf. Process. Syst. 2019, 32, 5275–5285. [Google Scholar]
- Finn, C.; Abbeel, P.; Levine, S. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, Sydney, Australia, 6–11 August 2017; pp. 1126–1135. [Google Scholar]
- Elsken, T.; Staffler, B.; Metzen, J.; Hutter, F. Meta-Learning of Neural Architectures for Few-Shot Learning. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Los Alamitos, CA, USA, 13–19 June 2020; pp. 12362–12372. [Google Scholar] [CrossRef]
- Koch, G.; Zemel, R.; Salakhutdinov, R. Siamese neural networks for one-shot image recognition. In Proceedings of the ICML Deep Learning Workshop, Lille, France, 6–11 July 2015; Volume 2. [Google Scholar]
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching Networks for One Shot Learning. In Advances in Neural Information Processing Systems; Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2016; Volume 29. [Google Scholar]
- Snell, J.; Swersky, K.; Zemel, R. Prototypical Networks for Few-shot Learning. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30. [Google Scholar]
- Han, X.; Zhu, H.; Yu, P.; Wang, Z.; Yao, Y.; Liu, Z.; Sun, M. FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018. [Google Scholar] [CrossRef]
- Gao, T.; Han, X.; Zhu, H.; Liu, Z.; Li, P.; Sun, M.; Zhou, J. FewRel 2.0: Towards More Challenging Few-Shot Relation Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 6250–6255. [Google Scholar]
- Ye, Z.X.; Ling, Z.H. Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 2872–2881. [Google Scholar] [CrossRef]
- Gao, T.; Han, X.; Liu, Z.; Sun, M. Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19, Honolulu, HI, USA, 27 January–1 February 2019; AAAI Press: Washington, DC, USA, 2019. [Google Scholar] [CrossRef]
- Wang, M.; Zheng, J.; Cai, F.; Shao, T.; Chen, H. DRK: Discriminative Rule-based Knowledge for Relieving Prediction Confusions in Few-shot Relation Extraction. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 2129–2140. [Google Scholar]
- Yang, S.; Zhang, Y.; Niu, G.; Zhao, Q.; Pu, S. Entity Concept-enhanced Few-shot Relation Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Online, 1–6 August 2021; pp. 987–991. [Google Scholar] [CrossRef]
- Peng, H.; Gao, T.; Han, X.; Lin, Y.; Li, P.; Liu, Z.; Sun, M.; Zhou, J. Learning from Context or Names? An Empirical Study on Neural Relation Extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 3661–3672. [Google Scholar] [CrossRef]
- Dong, M.; Pan, C.; Luo, Z. MapRE: An Effective Semantic Mapping Approach for Low-resource Relation Extraction. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 2694–2704. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
- Liu, S.; Huang, D.; Wang, Y. Learning Spatial Fusion for Single-Shot Object Detection. arXiv 2019, arXiv:1911.09516. [Google Scholar]
- Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A Discriminative Feature Learning Approach for Deep Face Recognition. In Computer Vision–ECCV 2016, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 499–515. [Google Scholar]
- Yu, T.; Yang, M.; Zhao, X. Dependency-aware Prototype Learning for Few-shot Relation Classification. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 2339–2345. [Google Scholar]
- Zhang, P.; Lu, W. Better Few-Shot Relation Extraction with Label Prompt Dropout. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 6996–7006. [Google Scholar] [CrossRef]
Corpus | Task | #Relation | #Entity | #Sentences | #Test
---|---|---|---|---|---
FewRel 1.0 | Train | 64 | 89,600 | 44,800 | –
FewRel 1.0 | Validation | 16 | 22,400 | 11,200 | –
FewRel 1.0 | Test (unpublished) | 20 | 28,000 | 14,000 | 10,000
FewRel 2.0 | Validation | 10 | 2000 | 1000 | –
FewRel 2.0 | Test (unpublished) | 15 | 3000 | 1500 | 10,000
Parameter | Value
---|---
Encoder | BERT
Backend model | BERT/CP
Learning_rate | 1 × 10/5 × 10
Max_length | 128
Hidden_size | 768
Batch_size | 4
Optimizer | AdamW
Validation_step | 1000
Max training iterations | 30,000
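To make the configuration concrete, here is a hedged sketch of how these settings might be wired together with PyTorch and the Hugging Face transformers library; the learning-rate exponents are not recoverable from the table above, so LEARNING_RATE is a placeholder to be replaced with the values reported in the paper, and the model name is assumed.

```python
import torch
from transformers import BertModel, BertTokenizerFast

# Hyperparameters from the table above.
MAX_LENGTH = 128
HIDDEN_SIZE = 768
BATCH_SIZE = 4
VALIDATION_STEP = 1000
MAX_TRAIN_ITERS = 30_000
LEARNING_RATE = 1e-5  # placeholder: the exponents are garbled in the source table

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
assert encoder.config.hidden_size == HIDDEN_SIZE

optimizer = torch.optim.AdamW(encoder.parameters(), lr=LEARNING_RATE)
```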
Encoder | Model | 5-Way-1-Shot | 5-Way-5-Shot | 10-Way-1-Shot | 10-Way-5-Shot
---|---|---|---|---|---
CNN | Proto-CNN | 72.65/74.52 | 86.15/88.40 | 60.13/62.38 | 76.20/80.45
CNN | Proto-HATT | 75.01/– | 87.09/90.12 | 62.48/– | 77.50/83.05
CNN | MLMAN | 79.01/– | 88.86/92.66 | 67.37/75.59 | 80.07/87.29
BERT | Proto-BERT | 84.77/89.33 | 89.54/94.13 | 76.85/83.41 | 83.42/90.25
BERT | TD-proto | –/84.76 | –/92.38 | –/74.32 | –/85.92
BERT | ConceptFERE | –/89.21 | –/90.34 | –/75.72 | –/81.82
BERT | DAPL | –/85.94 | –/94.28 | –/77.59 | –/89.26
BERT | HCRP (BERT) | 90.90/93.76 | 93.22/95.66 | 84.11/89.95 | 87.79/92.10
BERT | DRK | –/89.94 | –/92.42 | –/81.94 | –/85.23
BERT | SimpleFSRE | 91.29/94.42 | 94.05/96.37 | 86.09/90.73 | 89.68/93.47
BERT | Ours (BERT) | 92.31/94.83 | 94.05/97.07 | 86.92/90.46 | 89.36/93.65
CP | CP | –/95.10 | –/97.10 | –/91.20 | –/94.70
CP | MapRE | –/95.73 | –/97.84 | –/93.18 | –/95.64
CP | HCRP (CP) | 94.10/96.42 | 96.05/97.96 | 89.13/93.97 | 93.10/96.46
CP | LPD | 93.51/95.12 | 94.33/95.79 | 87.77/90.73 | 89.19/92.15
CP | CBPM | –/90.89 | –/94.68 | –/82.54 | –/89.67
CP | Ours (CP) | 96.48/97.14 | 97.93/97.98 | 93.88/95.24 | 95.61/96.27
Model | 5-Way-1-Shot | 5-Way-5-Shot | 10-Way-1-Shot | 10-Way-5-Shot
---|---|---|---|---
Proto-CNN * | 35.09 | 49.37 | 22.98 | 35.22
Proto-BERT * | 40.12 | 51.50 | 26.45 | 36.93
BERT-PAIR * | 56.25 | 67.44 | 43.64 | 53.17
Proto-CNN-ADV * | 42.21 | 58.71 | 28.91 | 44.35
Proto-BERT-ADV * | 41.90 | 54.74 | 27.36 | 37.40
HCRP | 76.34 | 83.03 | 63.77 | 72.94
Ours (CP) | 81.28 | 88.92 | 68.18 | 79.03
Model | 5-Way-1-Shot | 10-Way-1-Shot
---|---|---
SACT | 96.48 | 93.88
w/o prototype modification | 94.89 | 87.07
w/o contrastive–center loss | 94.86 | 87.47
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ma, J.; Cheng, J.; Chen, Y.; Li, K.; Zhang, F.; Shang, Z. Multi-Head Self-Attention-Enhanced Prototype Network with Contrastive–Center Loss for Few-Shot Relation Extraction. Appl. Sci. 2024, 14, 103. https://doi.org/10.3390/app14010103