A network security entity recognition method based on feature template and CNN-BiLSTM-CRF

Ya Qin^1,2 (ORCID: orcid.org/0000-0002-2685-3445),
Guo-wei Shen^1,2,
Wen-bo Zhao^1,2,
Yan-ping Chen^1,2,
Miao Yu^3 &
Xin Jin^4

Abstract

Network security threat intelligence analysis based on a security knowledge graph (SKG) allows multi-source threat intelligence data to be analyzed in a fine-grained manner, and has received extensive attention. Traditional named entity recognition methods have difficulty identifying security entities that mix Chinese and English in the network security domain, and they extract too few features to identify network security entities accurately. In this paper, we propose FT-CNN-BiLSTM-CRF, a novel security entity recognition method that combines a CNN-BiLSTM-CRF neural network model with a feature template (FT). The feature template extracts local context features, and the neural network model automatically extracts character features and global text features. Experimental results show that our method achieves an F-score of 86% on a large-scale network security dataset and outperforms other methods.

Author information

Authors and Affiliations

College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China
Ya Qin, Guo-wei Shen, Wen-bo Zhao & Yan-ping Chen
Guizhou Provincial Key Laboratory of Public Big Data, Guiyang, 550025, China
Ya Qin, Guo-wei Shen, Wen-bo Zhao & Yan-ping Chen
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Miao Yu
National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing, 100029, China
Xin Jin

Corresponding author

Correspondence to Guo-wei Shen.

Additional information

Project supported by the National Natural Science Foundation of China (No. 61802081), the Guizhou Provincial Natural Science Foundation, China (No. 20161052), the Guizhou Provincial Public Big Data Key Laboratory Open Project, China (No. 2017BDKFJJ024), the Guizhou University Doctoral Fund, China (No. 201526), and the Major Scientific and Technological Special Project of Guizhou Province, China (No. 20183001)

About this article

Cite this article

Qin, Y., Shen, Gw., Zhao, Wb. et al. A network security entity recognition method based on feature template and CNN-BiLSTM-CRF. Frontiers Inf Technol Electronic Eng 20, 872–884 (2019). https://doi.org/10.1631/FITEE.1800520

Received: 31 August 2018
Accepted: 11 March 2019
Published: 09 July 2019
Issue Date: June 2019
DOI: https://doi.org/10.1631/FITEE.1800520

Key words

CLC number

TP393.08
