DOI: 10.1145/3583780.3614908
research-article

HEProto: A Hierarchical Enhancing ProtoNet based on Multi-Task Learning for Few-shot Named Entity Recognition

Published: 21 October 2023

Abstract

The few-shot Named Entity Recognition (NER) task, which aims to identify and classify entities from different domains with limited training samples, has long been treated as a basic step for knowledge graph (KG) construction. Considerable effort has been devoted to this task, with competitive results. However, existing methods usually treat its two subtasks, span detection and type classification, as mutually independent, and largely ignore the integrity of and correlation between them. Moreover, prior work may fail to absorb the coarse-grained features of entities, resulting in semantically insufficient representations of entity types. To address these issues, we propose a Hierarchical Enhancing ProtoNet (HEProto) based on multi-task learning, which jointly learns the two subtasks and models their correlation. Specifically, we adopt contrastive learning to enhance the span boundary information and the type semantic representations in the two subtasks. Then, a hierarchical prototypical network is designed to leverage the coarse-grained information of entities in the type classification stage, which helps the model learn better fine-grained semantic representations. Along this line, we construct a similarity margin loss to reduce the similarity between fine-grained entities and irrelevant coarse-grained prototypes. Finally, extensive experiments on the Few-NERD dataset show that our solution outperforms competitive baseline methods. The source code of HEProto is available at https://github.com/fanshu6hao/HEProto.
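
The prototype-based ideas named in the abstract can be illustrated with a small, self-contained sketch. It is not the HEProto implementation (see the repository above for that). It only shows, under stated assumptions, how class prototypes can be built from a support set at two granularities and how a hinge-style margin penalty could discourage similarity to irrelevant coarse-grained prototypes. The function names, the cosine similarity, the `coarse-fine` label format, the exact hinge form, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_prototypes(support):
    """Build fine- and coarse-grained prototypes from (embedding, label) pairs.

    Assumption: labels look like 'person-actor', i.e. 'coarse-fine'.
    A prototype is the mean embedding of its class, as in prototypical networks.
    """
    fine, coarse = defaultdict(list), defaultdict(list)
    for emb, label in support:
        fine[label].append(emb)
        coarse[label.split("-")[0]].append(emb)
    fine_protos = {k: mean(v) for k, v in fine.items()}
    coarse_protos = {k: mean(v) for k, v in coarse.items()}
    return fine_protos, coarse_protos


def classify(query, fine_protos):
    """Assign the query embedding to the most similar fine-grained prototype."""
    return max(fine_protos, key=lambda k: cosine(query, fine_protos[k]))


def coarse_margin_loss(emb, fine_label, coarse_protos, margin=0.5):
    """Hypothetical hinge penalty: an entity should be closer to its own
    coarse prototype than to every other coarse prototype by at least `margin`."""
    own = fine_label.split("-")[0]
    pos = cosine(emb, coarse_protos[own])
    return sum(
        max(0.0, margin + cosine(emb, proto) - pos)
        for name, proto in coarse_protos.items()
        if name != own
    )


# Toy support set with made-up 2-D embeddings.
support = [
    ([1.0, 0.0], "person-actor"),
    ([0.8, 0.2], "person-artist"),
    ([0.0, 1.0], "location-city"),
    ([0.2, 0.8], "location-island"),
]
fine_protos, coarse_protos = build_prototypes(support)
print(classify([0.95, 0.05], fine_protos))
print(coarse_margin_loss([1.0, 0.0], "person-actor", coarse_protos))
```

In this sketch the penalty is zero once an entity is clearly nearer to its own coarse-grained prototype, so it only affects entities that sit ambiguously between coarse types.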

Cited By

  • (2024) A few-shot word-structure embedded model for bridge inspection reports learning. Advanced Engineering Informatics 62 (102664). DOI: 10.1016/j.aei.2024.102664. Online publication date: Oct-2024
  • (2024) Large language models for generative information extraction: a survey. Frontiers of Computer Science 18:6. DOI: 10.1007/s11704-024-40555-y. Online publication date: 11-Nov-2024
  • (2024) MBA-NER: Multi-Granularity Entity Boundary-Aware Contrastive Enhanced for Two-Stage Few-Shot Named Entity Recognition. Pattern Recognition and Computer Vision (17-30). DOI: 10.1007/978-981-97-8490-5_2. Online publication date: 7-Nov-2024

      Information & Contributors

      Information

      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN:9798400701245
      DOI:10.1145/3583780
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. few-shot NER
      2. multi-task learning
      3. prototypical network

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China

      Conference

      CIKM '23

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 261
      • Downloads (Last 6 weeks): 19
      Reflects downloads up to 16 Nov 2024