Abstract
The task of named entity recognition (NER) is fundamental to natural language processing (NLP), as it forms the basis for downstream applications such as question answering, text summarization, and machine translation. The Transformer architecture has gained popularity in NLP because it can model distant contextual dependencies in parallel. Although positional encoding is crucial in Transformer-based NER models for capturing the sequential nature of natural language and improving accuracy, most approaches are hard-coded: a fixed mathematical formula assigns a unique vector to each position. To address this issue, a self-adapted positional encoding module called the self-adapter is proposed for the Transformer model. The self-adapter incorporates two information fusers that enhance the representational ability of the embeddings. The first fuser integrates information across different positions, improving the representation over different ranges; the second fuser integrates information across dimensions within a single position, yielding a richer per-position representation. In addition, the calculation of the attention score is modified to make better use of the self-adapter. A mathematical analysis based on Fourier series is presented to demonstrate the effectiveness of the proposed method. This approach allows the positional encoding to be adjusted dynamically, adapting to varying contextual inputs and capturing word relationships more flexibly. The model is evaluated on four NER datasets, one English and three Chinese. The results show that the self-adapter substantially improves the Transformer's performance on the NER task.
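The abstract does not spell out the internal structure of the self-adapter, so the following PyTorch sketch is only illustrative. It assumes the cross-position fuser is a depthwise 1-D convolution over positions and the cross-dimension fuser is a position-wise linear layer, with the adapted encoding added back to the token embeddings; the modified attention-score calculation mentioned in the abstract is not shown, and all names here are hypothetical rather than the authors' implementation.

    # Minimal sketch of a self-adapted positional encoding ("self-adapter").
    # Assumptions (not specified in the abstract): fuser 1 is a depthwise 1-D
    # convolution across positions, fuser 2 is a position-wise linear layer, and
    # the adapted encoding is simply added to the token embeddings.
    import math
    import torch
    import torch.nn as nn

    def sinusoidal_encoding(seq_len: int, d_model: int) -> torch.Tensor:
        """Fixed sinusoidal positional encoding (Vaswani et al., 2017)."""
        position = torch.arange(seq_len).unsqueeze(1).float()
        div_term = torch.exp(torch.arange(0, d_model, 2).float()
                             * (-math.log(10000.0) / d_model))
        pe = torch.zeros(seq_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        return pe

    class SelfAdapter(nn.Module):
        """Adapts a fixed positional encoding to the current input sequence."""

        def __init__(self, d_model: int, kernel_size: int = 3):
            super().__init__()
            # Fuser 1: mixes information across neighbouring positions.
            self.position_fuser = nn.Conv1d(d_model, d_model, kernel_size,
                                            padding=kernel_size // 2,
                                            groups=d_model)
            # Fuser 2: mixes information across embedding dimensions per position.
            self.dimension_fuser = nn.Linear(d_model, d_model)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model) token embeddings.
            pe = sinusoidal_encoding(x.size(1), x.size(2)).to(x.device)
            h = pe.unsqueeze(0) + x          # condition the encoding on the input
            h = self.position_fuser(h.transpose(1, 2)).transpose(1, 2)
            h = self.dimension_fuser(h)
            adapted_pe = self.norm(h)
            return x + adapted_pe            # position-aware embeddings

    if __name__ == "__main__":
        adapter = SelfAdapter(d_model=128)
        tokens = torch.randn(2, 20, 128)     # (batch, seq_len, d_model)
        print(adapter(tokens).shape)         # torch.Size([2, 20, 128])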
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H. (2023). Self-adapted Positional Encoding in the Transformer Encoder for Named Entity Recognition. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14259. Springer, Cham. https://doi.org/10.1007/978-3-031-44223-0_43
DOI: https://doi.org/10.1007/978-3-031-44223-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44222-3
Online ISBN: 978-3-031-44223-0
eBook Packages: Computer Science, Computer Science (R0)