research-article

HEProto: A Hierarchical Enhancing ProtoNet based on Multi-Task Learning for Few-shot Named Entity Recognition

Authors:

Enhong Chen

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 296 - 305

https://doi.org/10.1145/3583780.3614908

Published: 21 October 2023

Abstract

Few-shot Named Entity Recognition (NER), which aims to identify and classify entities from different domains with limited training samples, has long been treated as a basic step in knowledge graph (KG) construction. Considerable effort has gone into this task and has yielded competitive performance. However, existing methods usually treat its two subtasks, span detection and type classification, as mutually independent, and largely ignore the integrity of and correlation between them. Moreover, prior work may fail to absorb the coarse-grained features of entities, resulting in semantically insufficient representations of entity types. To that end, we propose a Hierarchical Enhancing ProtoNet (HEProto) based on multi-task learning, which jointly learns the two subtasks and models their correlation. Specifically, we adopt contrastive learning to enhance the span boundary information and the type semantic representations in the two subtasks. A hierarchical prototypical network then leverages the coarse-grained information of entities in the type classification stage, which helps the model learn better fine-grained semantic representations. Along this line, we construct a similarity margin loss to reduce the similarity between fine-grained entities and irrelevant coarse-grained prototypes. Finally, extensive experiments on the Few-NERD dataset show that our solution outperforms competitive baseline methods. The source code of HEProto is available at https://github.com/fanshu6hao/HEProto.
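The hierarchical idea in the abstract can be made concrete with a small sketch. This is not the authors' implementation (see the linked repository for that) and the abstract does not give the formulas, so everything below is an assumption made for illustration: the function names, the toy 2-d embeddings, the use of cosine similarity, the "coarse-fine" label strings, the hinge form of the loss, and the margin value of 0.2. Fine-grained prototypes are taken as the mean of support embeddings, coarse-grained prototypes as the mean of the fine-grained prototypes beneath them, and the margin loss penalizes an entity embedding whenever it is too similar to a coarse-grained prototype other than its own.

```python
import math
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coarse_of(fine_label):
    """Assumed label convention: 'coarse-fine', e.g. 'person-actor'."""
    return fine_label.split("-")[0]


def build_prototypes(support):
    """support: list of (embedding, fine_label) pairs.

    Returns (fine, coarse) dicts mapping a label to its prototype vector.
    """
    by_fine = defaultdict(list)
    for emb, label in support:
        by_fine[label].append(emb)
    fine = {label: mean(embs) for label, embs in by_fine.items()}

    by_coarse = defaultdict(list)
    for label, proto in fine.items():
        by_coarse[coarse_of(label)].append(proto)
    coarse = {c: mean(protos) for c, protos in by_coarse.items()}
    return fine, coarse


def similarity_margin_loss(emb, fine_label, coarse, margin=0.2):
    """Hinge penalty on similarity to every coarse prototype except the
    entity's own (illustrative form, not taken from the paper)."""
    own = coarse_of(fine_label)
    return sum(
        max(0.0, cosine(emb, proto) - margin)
        for c, proto in coarse.items()
        if c != own
    )


def classify(emb, fine):
    """Assign the fine-grained type whose prototype is most similar."""
    return max(fine, key=lambda label: cosine(emb, fine[label]))


# Toy support set with hypothetical 2-d span embeddings.
support = [
    ([1.0, 0.0], "person-actor"),
    ([0.0, 1.0], "location-island"),
]
fine, coarse = build_prototypes(support)
predicted = classify([0.9, 0.1], fine)
loss_far = similarity_margin_loss([0.9, 0.1], "person-actor", coarse)
loss_near = similarity_margin_loss([1.0, 1.0], "person-actor", coarse)
```

In this toy setup an embedding close to its own coarse-grained prototype incurs no penalty, while one that drifts toward the "location" prototype does. A real system would compute the loss over batches of learned span representations rather than hand-written vectors.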


Information & Contributors

Information

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023

5508 pages

ISBN: 9798400701245

DOI: 10.1145/3583780

General Chairs:
Ingo Frommholz, University of Wolverhampton, UK
Frank Hopfgartner, University of Koblenz, Germany
Mark Lee, University of Birmingham, UK
Michael Oakes, University of Birmingham, UK

Program Chairs:
Mounia Lalmas, Spotify, UK
Min Zhang, Tsinghua University, China
Rodrygo Santos, Federal University of Minas Gerais, Brazil

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China

Conference

CIKM '23

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Article Metrics

Total Citations: 5
Total Downloads: 342
Downloads (Last 12 months): 200
Downloads (Last 6 weeks): 12

Reflects downloads up to 03 Mar 2025

Cited By

Li Y, Tan Z, Xiao W. (2025). LLM for Uniform Information Extraction Using Multi-task Learning Optimization. Web and Big Data. APWeb-WAIM 2024 International Workshops, 17-29. Online publication date: 31-Jan-2025.
https://doi.org/10.1007/978-981-96-0055-7_2

Yang Z, Liu Y, Ouyang C, Zhao S, Zhu C. (2024). Improving Few-Shot Named Entity Recognition with Causal Interventions. Big Data Mining and Analytics 7:4, 1375-1395. Online publication date: Dec-2024.
https://doi.org/10.26599/BDMA.2024.9020052

Wang Y, Zhu Y, Xiong W, Cai C. (2024). A few-shot word-structure embedded model for bridge inspection reports learning. Advanced Engineering Informatics 62, 102664. Online publication date: Oct-2024.
https://doi.org/10.1016/j.aei.2024.102664

Xu D, Chen W, Peng W, Zhang C, Xu T, Zhao X, Wu X, Zheng Y, Wang Y, Chen E. (2024). Large language models for generative information extraction: a survey. Frontiers of Computer Science: Selected Publications from Chinese Universities 18:6. Online publication date: 1-Dec-2024.
https://dl.acm.org/doi/10.1007/s11704-024-40555-y

Hou S, Qian Y, Chen J, Zhao J, Leng H. (2024). MBA-NER: Multi-Granularity Entity Boundary-Aware Contrastive Enhanced for Two-Stage Few-Shot Named Entity Recognition. Pattern Recognition and Computer Vision, 17-30. Online publication date: 7-Nov-2024.
https://doi.org/10.1007/978-981-97-8490-5_2
