
CALM: Context Augmentation with Large Language Model for Named Entity Recognition

Published: 26 September 2024

Abstract

Prior research on Named Entity Recognition (NER) has focused on challenges arising from data scarcity and overfitting, particularly in the context of increasingly complex transformer-based architectures. A framework based on information retrieval (IR), which uses a search engine to augment input samples and mitigate overfitting, has been proposed. However, the effectiveness of such systems is limited, as search engines were not designed for this specific application. While this approach is a solid foundation, we maintain that LLMs offer capabilities surpassing those of search engines, with greater flexibility in semantic analysis and generation. To overcome these challenges, we propose CALM, a context augmentation method designed for adaptability through prompting. In our study, a prompt is defined as a pair comprising a specific task and its corresponding response strategy; this careful definition of prompts is pivotal to achieving optimal performance. Our findings show that the resulting context improves robustness and performance on NER datasets, and we achieve state-of-the-art F1 scores on WNUT17 and CoNLL++. We also examine the qualitative impact of prompting.
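
The abstract describes prompts as (task, response strategy) pairs whose LLM output is appended to each input sample as extra context before NER tagging. The following is a minimal Python sketch of that idea under stated assumptions: the `PromptPair` structure, the prompt wording, the `generate` stub, and the `[SEP]` concatenation are illustrative choices, not the authors' actual implementation.

```python
from dataclasses import dataclass


@dataclass
class PromptPair:
    """A prompt as the abstract describes it: a task paired with a response strategy."""
    task: str       # what the LLM should do with the sentence
    strategy: str   # how the LLM should shape its answer


def build_prompt(pair: PromptPair, sentence: str) -> str:
    # Assemble the augmentation prompt sent to the LLM (wording is hypothetical).
    return (
        f"Task: {pair.task}\n"
        f"Response strategy: {pair.strategy}\n"
        f"Sentence: {sentence}"
    )


def generate(prompt: str) -> str:
    # Placeholder for any LLM completion call; swap in a real client here.
    raise NotImplementedError("plug in an LLM client")


def augment(sentence: str, pair: PromptPair) -> str:
    # Generate auxiliary context and append it to the input so the NER
    # encoder can attend to it; only the original tokens keep gold labels.
    context = generate(build_prompt(pair, sentence))
    return f"{sentence} [SEP] {context}"


if __name__ == "__main__":
    pair = PromptPair(
        task="Briefly explain who or what the named entities in the sentence are.",
        strategy="Answer in one or two factual sentences; do not tag or rewrite the input.",
    )
    print(build_prompt(pair, "TPDL 2024 was held in Ljubljana in September."))
```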

    Published In

    Linking Theory and Practice of Digital Libraries: 28th International Conference on Theory and Practice of Digital Libraries, TPDL 2024, Ljubljana, Slovenia, September 24–27, 2024, Proceedings, Part I
    Sep 2024
    450 pages
    ISBN: 978-3-031-72436-7
    DOI: 10.1007/978-3-031-72437-4

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. Named Entity Recognition
    2. Data Augmentation
    3. LLM

    Qualifiers

    • Article
