Abstract
Biomarkers are critical in cancer diagnosis, prognosis, and treatment planning. However, this information is often buried in unstructured text. In this paper, we draw an analogy between Biomarker Information Extraction and Aspect-Based Sentiment Analysis, and we propose a system, the Biomarker and Result Extraction Model (BioReX). BioReX employs BERT post-training methods to augment the BioBERT model with domain-specific and task-specific knowledge for biomarker extraction. It uses syntactic-based and semantic-based attention to associate results with their corresponding biomarkers. Evaluation demonstrates the effectiveness of the proposed approach.
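To make the analogy concrete, the following is a minimal sketch and not the BioReX implementation: biomarker mentions play the role of aspects, result words play the role of sentiment terms, and each result is linked to the biomarker that receives the highest attention weight. Token distance is used here only as a crude, assumed stand-in for syntactic distance; the function name `associate_results`, the `temperature` parameter, and the example sentence are all hypothetical.

```python
import math


def associate_results(tokens, biomarkers, results, temperature=1.0):
    """Link each result token to the biomarker with the highest attention weight.

    Attention scores are negative token distances (an assumed proxy for
    syntactic distance), normalised with a softmax.
    """
    marker_positions = [i for i, tok in enumerate(tokens) if tok in biomarkers]
    pairs = {}
    if not marker_positions:
        return pairs  # no biomarker mentions, nothing to associate

    for j, tok in enumerate(tokens):
        if tok not in results:
            continue
        # Closer biomarkers get higher (less negative) scores.
        scores = [-abs(j - i) / temperature for i in marker_positions]
        # Numerically stable softmax over the candidate biomarkers.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Pick the biomarker with the largest weight.
        best = max(range(len(weights)), key=weights.__getitem__)
        pairs.setdefault(tokens[marker_positions[best]], []).append(tok)
    return pairs


# Hypothetical pathology-style sentence, whitespace-tokenised.
tokens = "ER positive and HER2 negative".split()
pairs = associate_results(tokens, {"ER", "HER2"}, {"positive", "negative"})
print(pairs)
```

In this toy setting the nearest biomarker always wins. A learned model would instead derive the scores from contextual embeddings and parse structure, which is where syntactic and semantic signals can disagree with plain proximity.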
Acknowledgment
The work is partially supported by a grant from the National Institutes of Health (UL1TR003017), the Martin Tuchman’62 Chair Endowment, the Leir Foundation, and the National Science Foundation (CNS 2237328). We gratefully acknowledge Nancy Sazo, Huiqi Chu, and Evita Sadimin for medical knowledge support.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gao, W., Gao, X., Chen, W., Foran, D.J., Chen, Y. (2024). BioReX: Biomarker Information Extraction Inspired by Aspect-Based Sentiment Analysis. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14648. Springer, Singapore. https://doi.org/10.1007/978-981-97-2238-9_10
DOI: https://doi.org/10.1007/978-981-97-2238-9_10
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2240-2
Online ISBN: 978-981-97-2238-9
eBook Packages: Computer Science, Computer Science (R0)