Abstract
Symptom entities are widely distributed in Chinese electronic medical records (EMRs). Previous approaches to symptom entity extraction usually extract continuous strings as symptom entities and require massive human effort for corpus annotation. We describe a symptom entity as a two-tuple of <subject, lesion> and design a soft pattern matching method to locate such tuples in EMR sentences. Our bootstrapping approach requires only a few annotated symptom tuples and allows iterative extraction from large electronic medical record databases without human supervision. Furthermore, the described method annotates symptom entities in EMRs using the extracted tuples. Starting with 60 annotated entities, our approach reached an F value of 81.40% in the task of extracting 3,150 entities from 992 sets of electronic medical records.
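The bootstrapping idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the context representation (one token to the left, the tokens in between, one token to the right), the slot-counting similarity, the `min_shared` threshold, and the English-gloss toy sentences are all assumptions made for illustration. The loop alternates between inducing context patterns from known <subject, lesion> tuples and softly matching those patterns to find new tuples, stopping when a round finds nothing new.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]                       # (subject, lesion)
Context = Tuple[str, Tuple[str, ...], str]   # (left token, middle tokens, right token)


def context_of(tokens: List[str], i: int, j: int) -> Context:
    """Context of a candidate tuple whose subject is at i and lesion at j."""
    left = tokens[i - 1] if i > 0 else "<s>"
    right = tokens[j + 1] if j + 1 < len(tokens) else "</s>"
    return (left, tuple(tokens[i + 1:j]), right)


def shared_slots(a: Context, b: Context) -> int:
    """Soft match score: number of context slots (0 to 3) that agree."""
    return sum(x == y for x, y in zip(a, b))


def induce_patterns(sentences: List[List[str]], pairs: Set[Pair]) -> Set[Context]:
    """Collect the contexts in which known tuples occur."""
    patterns: Set[Context] = set()
    for tokens in sentences:
        for i in range(len(tokens)):
            for j in range(i + 1, len(tokens)):
                if (tokens[i], tokens[j]) in pairs:
                    patterns.add(context_of(tokens, i, j))
    return patterns


def extract_pairs(sentences: List[List[str]], patterns: Set[Context],
                  min_shared: int = 2, max_gap: int = 2) -> Set[Pair]:
    """Return token pairs whose context softly matches any pattern."""
    found: Set[Pair] = set()
    for tokens in sentences:
        for i in range(len(tokens)):
            # Allow at most max_gap tokens between subject and lesion.
            for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
                ctx = context_of(tokens, i, j)
                if any(shared_slots(ctx, p) >= min_shared for p in patterns):
                    found.add((tokens[i], tokens[j]))
    return found


def bootstrap(sentences: List[List[str]], seeds: Set[Pair],
              min_shared: int = 2, max_rounds: int = 5) -> Set[Pair]:
    """Alternate pattern induction and extraction until no new tuples appear."""
    pairs = set(seeds)
    for _ in range(max_rounds):
        patterns = induce_patterns(sentences, pairs)
        new_pairs = extract_pairs(sentences, patterns, min_shared) - pairs
        if not new_pairs:
            break
        pairs |= new_pairs
    return pairs


# Hypothetical toy data: English glosses standing in for segmented EMR text.
sentences = [
    ["patient", "reports", "abdomen", "pain", "today"],
    ["patient", "reports", "chest", "tightness", "again"],
    ["patient", "denies", "head", "dizziness", "again"],
]
seeds = {("abdomen", "pain")}
result = bootstrap(sentences, seeds)
```

In this toy run, a single round of matching reaches only the tuple in the second sentence. The tuple in the third sentence is found one round later, through the pattern induced from that newly extracted tuple, which is the effect that iteration is meant to provide.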
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Qin, T., Guan, Y. (2016). A Bootstrapping Approach to Symptom Entity Extraction on Chinese Electronic Medical Records. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_34
DOI: https://doi.org/10.1007/978-3-319-47674-2_34
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science, Computer Science (R0)