RoPDA: Robust Prompt-Based Data Augmentation for Low-Resource Named Entity Recognition

Authors

  • Sihan Song, National Key Laboratory for Novel Software Technology, Nanjing University, China; Department of Computer Science and Technology, Nanjing University, China
  • Furao Shen, National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China
  • Jian Zhao, National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Electronic Science and Engineering, Nanjing University, China

DOI:

https://doi.org/10.1609/aaai.v38i17.29868

Keywords:

NLP: Information Extraction, NLP: Applications

Abstract

Data augmentation has been widely used in low-resource NER tasks to tackle the problem of data sparsity. However, previous data augmentation methods suffer from disrupted syntactic structures, token-label mismatch, and a requirement for external knowledge or manual effort. To address these issues, we propose Robust Prompt-based Data Augmentation (RoPDA) for low-resource NER. Based on pre-trained language models (PLMs) with continuous prompts, RoPDA performs entity augmentation and context augmentation through five fundamental augmentation operations to generate label-flipping and label-preserving examples. To optimize the utilization of the augmented samples, we present two techniques: self-consistency filtering and mixup. The former eliminates low-quality samples with a bidirectional mask, while the latter prevents the performance degradation that arises from using label-flipping samples directly. Extensive experiments on three popular benchmarks from different domains demonstrate that RoPDA significantly improves upon strong baselines, and also outperforms state-of-the-art semi-supervised learning methods when unlabeled data is included.
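
The abstract names mixup as the technique used to soften the effect of label-flipping samples but does not give its formulation. As a rough illustration only, the following is a minimal sketch of generic mixup (linear interpolation of two examples and their labels with a Beta-distributed weight). The function name `mixup`, the parameter `alpha`, and the use of plain Python lists as feature and label vectors are assumptions made for this sketch, not details taken from RoPDA.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.5, rng=None):
    """Generic mixup sketch (hypothetical helper, not the RoPDA implementation).

    x_a, x_b: feature vectors of equal length (e.g. token embeddings).
    y_a, y_b: one-hot or soft label vectors of equal length.
    alpha:    shape parameter of the Beta(alpha, alpha) distribution.
    """
    rng = rng or random.Random()
    # Sample the interpolation weight lambda from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Interpolate features and labels with the same weight.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Usage: blend an original token (label class 0) with a label-flipped
# counterpart (label class 1); the result carries a soft label.
x_mix, y_mix, lam = mixup([1.0, 0.0], [1.0, 0.0],
                          [0.0, 1.0], [0.0, 1.0],
                          alpha=0.5, rng=random.Random(0))
```

Because the labels are interpolated with the same weight as the features, the mixed label stays a valid distribution (its entries still sum to 1), so a label-flipped sample contributes a soft target rather than a hard one.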

Published

2024-03-24

How to Cite

Song, S., Shen, F., & Zhao, J. (2024). RoPDA: Robust Prompt-Based Data Augmentation for Low-Resource Named Entity Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19017-19025. https://doi.org/10.1609/aaai.v38i17.29868

Section

AAAI Technical Track on Natural Language Processing II