EMSI-BERT: Asymmetrical Entity-Mask Strategy and Symbol-Insert Structure for Drug–Drug Interaction Extraction Based on BERT
Figure 1. Framework of the asymmetrical Entity-Mask strategy and the Symbol-Insert structure for drug–drug interaction extraction based on BERT (FNN represents Feed-forward Neural Network).
Figure 2. Framework of pre-training BERT ($X$ represents the input sequence; $E_i^T$, $E_A^S$, and $E_i^P$ represent the token embedding, segment embedding, and position embedding, respectively; $E_i$ and $h_i$ represent the embedding vector and the latent semantics of the $i$-th word, respectively).
Figure 3. Asymmetrical Entity-Mask strategy for pre-training BERT (FNN represents Feed-forward Neural Network; $X$ and $X_M$ represent the input sequence and the masked sequence, respectively; $E_{X_M}^T$, $E_{X_M}^P$, and $E_{X_M}$ represent the token embedding sequence, the position embedding sequence, and the embedding representation of $X_M$, respectively; $c_{MASK}$ and $V_{MASK}$ represent the unnormalized and normalized category probability vectors, respectively).
Figure 4. Symbol-inserting sequence for different entity combinations (S1 and E1 mark the position of the first entity; S2 and E2 mark the position of the second entity).
Figure 5. Construction of the Symbol-Insert-BERT structure (FNN represents Feed-forward Neural Network; $X$ represents the input sequence, where $e_1$, $e_{10}$, and $e_{12}$ denote different drug entities, and $x_2$, $x_3$, $x_{11}$, and $x_N$ denote non-drug tokens; $E_{X_M}^T$, $E_{X_M}^P$, and $E_{X_M}$ represent the token embedding sequence, the position embedding sequence, and the embedding representation, respectively; $O$ represents the unnormalized category probability).
Figure 6. Impact of epochs on the performance of EMSI-BERT.
Figure 7. Impact of different features of Transformer layers as input on the performance of EMSI-BERT.
Figure 8. Attention weight visualization between the [CLS] node and other words.
Abstract
1. Introduction
- The pre-training strategy of random masking is improved. In pre-training BERT, an asymmetrical Entity-Mask strategy is proposed to compensate for the lack of entity orientation in the random masking strategy. Based on prior knowledge, the mask probability of drug entities is increased to better retain entities’ co-occurrence information. Ablation experiments confirm that pre-training BERT with the asymmetrical Entity-Mask strategy effectively improves the performance of downstream DDI classification.
- The fine-tuning structure for adapting to downstream tasks is investigated. In fine-tuning BERT, a Symbol-Insert structure is proposed to preserve most of the structural information of the pre-trained BERT and to overcome the problem of different entity combinations sharing the same input sequence. The same input sequence is given a different form in the input layer for each entity combination by inserting four symbols around the two entities (see the sketch after this list), thereby allowing DDI extraction without destroying the structure of the pre-trained BERT. The experimental results show that the proposed structure adapts effectively to the DDI extraction task.
- A migration scheme combining pre-training and fine-tuning is proposed. The EMSI-BERT method, which incorporates the asymmetrical Entity-Mask strategy into the pre-training of BERT and the Symbol-Insert structure into its fine-tuning, realizes DDI extraction with limited labeled data. Compared with related methods, EMSI-BERT is insensitive to data preprocessing and demonstrates comprehensive improvement on both the binary classification task of DDI detection and the multi-class task of DDI extraction over the categories Advice, Effect, Mechanism, and Int.
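To make the Symbol-Insert idea concrete, the following is a minimal sketch of building one marked input per candidate entity pair. It is not the authors' released implementation; the marker spellings [S1]/[E1]/[S2]/[E2] follow Figure 4, but the helper itself and the character-offset span format are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): wrap the two candidate drug
# entities of one pair with boundary symbols, so that different pairs drawn
# from the same sentence produce distinct input sequences.

from typing import Tuple

S1, E1, S2, E2 = "[S1]", "[E1]", "[S2]", "[E2]"

def insert_symbols(text: str, span_a: Tuple[int, int], span_b: Tuple[int, int]) -> str:
    """span_a/span_b are (start, end) character offsets of the two entities;
    the earlier mention gets [S1]...[E1], the later one [S2]...[E2]."""
    (s1, e1), (s2, e2) = sorted([span_a, span_b])
    # Rebuild the string piecewise so no offset adjustment is needed.
    return (text[:s1] + S1 + text[s1:e1] + E1 +
            text[e1:s2] + S2 + text[s2:e2] + E2 + text[e2:])

sentence = "Terbinafine increases the clearance of cyclosporine by 15%"
print(insert_symbols(sentence, (0, 11), (39, 51)))
# -> [S1]Terbinafine[E1] increases the clearance of [S2]cyclosporine[E2] by 15%
```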
2. Related Work
Taxonomy | Method | Advantages | Disadvantages
---|---|---|---
Rule-based methods | Bunescu et al. [20], Fundel et al. [21], Segura-Bedmar et al. [22], An et al. [23] | These methods with customized templates are highly accurate for DDI extraction. | (1) The design of patterns or rules is sophisticated; (2) These methods suffer from low recall because of limited templates.
Traditional machine learning-based methods | Cui et al. [24], Segura-Bedmar et al. [25], Kim et al. [26], FBK-irst [27], WBI [28], UTurku [29], RBF-Linear [30] | These methods offer a balance between precision and recall through various classification models. | (1) The design of hand-crafted features is sophisticated; (2) The cascading strategy for combining different features requires elaborate design.
Deep learning-based methods | CNN [10,35,36,37,38], RNN [9,11,12,13,14,15,16,17,31,34,40,41,42,43,44], BERT [15,45] | These end-to-end methods reduce the complexity of manual feature extraction and avoid the error accumulation of cascading external models. | (1) These methods require a large amount of external information to achieve a better understanding for DDI extraction; (2) These methods are poorly suited to representing co-occurring entities and adapting to downstream tasks.
3. Materials and Methods
4. EMSI-BERT for DDI Extraction
4.1. An Asymmetrical Entity-Mask Strategy for Pre-Training BERT
4.1.1. Entity-Mask-BERT Model Construction
4.1.2. Pre-Training of Entity-Mask-BERT
4.2. A Symbol-Insert Structure for Fine-Tuning BERT
4.2.1. Symbol-Insert-BERT Model Construction
4.2.2. Fine-Tuning of Symbol-Insert-BERT
5. Results and Discussion
5.1. Biomedical Corpus for Pre-Training BERT with Asymmetrical Entity-Mask Strategy
5.2. Domain-Labeled Dataset for Fine-Tuning BERT with Symbol-Insert Structure
1. Advice: describes a relevant opinion about the simultaneous use of two drugs, e.g., "interaction may be expected, and UROXATRAL should not be used in combination with other alpha-blockers";
2. Effect: describes the interaction of drug effects, e.g., "methionine may protect against the ototoxic effects of gentamicin";
3. Mechanism: describes a pharmacokinetic mechanism, e.g., "Grepafloxacin, like other quinolones, may inhibit the metabolism of caffeine and theobromine";
4. Int: describes a DDI that is stated without any further information, e.g., "the interaction of omeprazole and ketoconazole has been established";
5. Other: describes co-occurrence of two entities with no interaction between them, e.g., "concomitantly given thiazide diuretics did not interfere with the absorption of a tablet of digoxin".
5.3. Experimental Results and Analysis
5.3.1. Performance Evaluation of the Proposed Method
5.3.2. Comparison of DDI Classification with Related Methods
5.3.3. Ablation Experiment
5.3.4. Model Visualization
1. Compared with traditional machine learning-based methods, which measure semantics in discrete space and rely on handcrafted features, the proposed EMSI-BERT method introduces probability embedding to measure semantics in continuous space and uses an end-to-end approach for DDI extraction, thus reducing the complexity of manual feature extraction and the error accumulated across multiple steps.
2. Compared with deep learning-based methods, such as BiLSTM-, CNN-, and BERT-based models, which are limited by the quality of the dataset and the amount of labeled data, the improved asymmetrical Entity-Mask strategy can compensate for the lack of entity orientation and retain entities’ co-occurrence information, building on the idea of distant supervision. Ablation experiments show that the asymmetrical Entity-Mask strategy alleviates the problem of data sparsity and effectively improves downstream DDI classification.
3. The Symbol-Insert structure, designed for fine-tuning BERT, overcomes the problem of different entity combinations sharing the same input sequence and achieves end-to-end DDI extraction without destroying the structure of Entity-Mask-BERT. The experimental results show that the designed structure adapts effectively to the DDI extraction task. Moreover, the visualization in Section 5.3.4 illustrates that Symbol-Insert-BERT extracts entity-level, syntactic, and semantic features for DDI extraction from shallow to deep layers; a sketch of reading such attention weights follows this list.
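As a companion to the visualization described in item (3) and Figure 8, here is a usage sketch of reading [CLS]-to-token attention out of a Hugging Face BERT model. The bert-base-uncased checkpoint and the example sentence are placeholders; the paper's fine-tuned EMSI-BERT weights are not reproduced here.

```python
# Sketch: extract attention weights from the [CLS] token to every other token.
# "bert-base-uncased" is a placeholder checkpoint, not the fine-tuned EMSI-BERT.

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

text = "Grepafloxacin may inhibit the metabolism of caffeine"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq_len, seq_len).
last_layer = outputs.attentions[-1][0]      # (heads, seq_len, seq_len)
cls_row = last_layer.mean(dim=0)[0]         # head-averaged attention from [CLS]

for token, w in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), cls_row):
    print(f"{token:15s} {w.item():.3f}")
```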
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Qiu, Y.; Zhang, Y.; Deng, Y.; Liu, S.; Zhang, W. A Comprehensive Review of Computational Methods for Drug-drug Interaction Detection. IEEE/ACM Trans. Comput. Biol. Bioinf. 2022, 19, 1968–1985. [Google Scholar] [CrossRef]
- Vo, T.H.; Nguyen, N.T.K.; Kha, Q.H.; Le, N.Q.K. On the Road to Explainable AI in Drug-drug Interactions Prediction: A Systematic Review. Comput. Struct. Biotechnol. J. 2022, 20, 2112–2123. [Google Scholar] [CrossRef]
- Quan, C.; Wang, M.; Ren, F. An Unsupervised Text Mining Method for Relation Extraction from Biomedical Literature. PLoS ONE 2014, 9, e102039. [Google Scholar] [CrossRef]
- Chen, J.; Hu, B.; Peng, W.; Chen, Q.; Tang, B. Biomedical Relation Extraction via Knowledge-enhanced Reading Comprehension. BMC Bioinform. 2022, 23, 20. [Google Scholar] [CrossRef] [PubMed]
- Ibrahim, H.; Abdo, A.; El Kerdawy, A.M.; Eldin, A.S. Signal Detection in Pharmacovigilance: A Review of Informatics-driven Approaches for the Discovery of Drug-Drug Interaction Signals in Different Data Sources. Artif. Intell. Life Sci. 2021, 1, 100005. [Google Scholar] [CrossRef]
- Feng, Y.-H.; Zhang, S.-W.; Zhang, Q.-Q.; Zhang, C.-H.; Shi, J.-Y. DeepMDDI: A Deep Graph Convolutional Network Framework for Multi-label Prediction of Drug-drug Interactions. Anal. Biochem. 2022, 646, 114631. [Google Scholar] [CrossRef] [PubMed]
- Wishart, D.S.; Feunang, Y.D.; Guo, A.C.; Lo, E.J.; Marcu, A.; Grant, J.R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z. DrugBank 5.0: A Major Update to the DrugBank Database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082. [Google Scholar] [CrossRef]
- Zanzoni, A.; Montecchi-Palazzi, L.; Quondam, M.; Ausiello, G.; Helmer-Citterich, M.; Cesareni, G. MINT: A Molecular INTeraction Database. FEBS Lett. 2002, 513, 135–140. [Google Scholar] [CrossRef]
- Kerrien, S.; Aranda, B.; Breuza, L.; Bridge, A.; Broackes-Carter, F.; Chen, C.; Duesbury, M.; Dumousseau, M.; Feuermann, M.; Hinz, U.; et al. The IntAct Molecular Interaction Database in 2012. Nucleic Acids Res. 2011, 40, D841–D846. [Google Scholar] [CrossRef]
- Park, C.; Park, J.; Park, S. AGCN: Attention-based Graph Convolutional Networks for Drug-drug Interaction Extraction. Expert Syst. Appl. 2020, 159, 113538–113550. [Google Scholar] [CrossRef]
- Tran, T.; Kavuluru, R.; Kilicoglu, H. Attention-Gated Graph Convolutions for Extracting Drug Interaction Information from Drug Labels. ACM Trans. Comput. Health 2021, 2, 1–19. [Google Scholar] [CrossRef]
- Zhu, J.; Liu, Y.; Wen, C.; Wu, X. DGDFS: Dependence Guided Discriminative Feature Selection for Predicting Adverse Drug-Drug Interaction. IEEE Trans. Knowl. Data Eng. 2022, 34, 271–285. [Google Scholar] [CrossRef]
- Zhu, J.; Liu, Y.; Zhang, Y.; Chen, Z.; She, K.; Tong, R.S. DAEM: Deep Attributed Embedding based Multi-task Learning for Predicting Adverse Drug–drug Interaction. Expert Syst. Appl. 2023, 215, 119312. [Google Scholar] [CrossRef]
- Fatehifar, M.; Karshenas, H. Drug-Drug Interaction Extraction Using a Position and Similarity Fusion-based Attention Mechanism. J. Biomed. Inform. 2021, 115, 103707. [Google Scholar] [CrossRef]
- Zhu, Y.; Li, L.; Lu, H.; Zhou, A.; Qin, X. Extracting Drug-drug Interactions from Texts with BioBERT and Multiple Entity-aware Attentions. J. Biomed. Inform. 2020, 106, 103451. [Google Scholar] [CrossRef]
- Zaikis, D.; Vlahavas, I. TP-DDI: Transformer-based Pipeline for the Extraction of Drug-Drug Interactions. Artif. Intell. Med. 2021, 119, 102153. [Google Scholar] [CrossRef] [PubMed]
- Wu, H.; Xing, Y.; Ge, W.; Liu, X. Drug-drug Interaction Extraction via Hybrid Neural Networks on Biomedical Literature. J. Biomed. Inform. 2020, 106, 103432. [Google Scholar] [CrossRef]
- Zhang, Y.; Zheng, W.; Lin, H.; Wang, J.; Yang, Z.; Dumontier, M. Drug-drug Interaction Extraction via Hierarchical RNNs on Sequence and Shortest Dependency Paths. Bioinformatics 2018, 34, 828–835. [Google Scholar] [CrossRef] [PubMed]
- Zhou, D.; Miao, L.; He, Y. Position-aware Deep Multi-task Learning for Drug–Drug Interaction Extraction. Artif. Intell. Med. 2018, 87, 1–8. [Google Scholar] [CrossRef] [PubMed]
- Bunescu, R.; Mooney, R. Integrating Co-occurrence Statistics with Information Extraction for Robust Retrieval of Protein Interactions from Medline. In Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language Processing and Biology; Association for Computational Linguistics: New York, NY, USA, 2006; pp. 49–56. [Google Scholar] [CrossRef]
- Fundel, K.; Küffner, R.; Zimmer, R. RelEx-Relation Extraction Using Dependency Parse Trees. Bioinformatics 2007, 23, 365–371. [Google Scholar] [CrossRef] [PubMed]
- Segura-Bedmar, I.; Martínez, P.; de Pablo-Sánchez, C. A Linguistic Rule-based Approach to Extract Drug-drug Interactions from Pharmacological Documents. BMC Bioinform. 2011, 12, S1. [Google Scholar] [CrossRef]
- An, N.; Xiao, Y.; Yuan, J.; Yang, J.; Alterovitz, G. Extracting Causal Relations from the Literature with Word Vector Mapping. Comput. Biol. Med. 2019, 115, 103524. [Google Scholar] [CrossRef]
- Cui, B.; Lin, H.; Yang, Z. SVM-based Protein-Protein Interaction Extraction from Medline abstracts. In Proceedings of the 2007 Second International Conference on Bio-Inspired Computing: Theories and Applications, Zhengzhou, China, 14–17 September 2007; pp. 182–185. [Google Scholar] [CrossRef]
- Segura-Bedmar, I.; Martínez, P.; de Pablo-Sánchez, C. Using a Shallow Linguistic Kernel for Drug–drug Interaction Extraction. J. Biomed. Inform. 2011, 44, 789–804. [Google Scholar] [CrossRef]
- Kim, S.; Liu, H.; Yeganova, L.; Wilbur, W.J. Extracting Drug-drug Interactions from Literature Using a Rich Feature-based Linear Kernel Approach. J. Biomed. Inform. 2015, 55, 23–30. [Google Scholar] [CrossRef]
- Chowdhury, M.F.M.; Lavelli, A. FBK-irst: A Multi-Phase Kernel Based Approach for Drug-Drug Interaction Detection and Classification that Exploits Linguistic Information. In Proceedings of the 7th International Workshop Semantic Evaluation, Atlanta, GA, USA, 14–15 June 2013; Volume 2, pp. 351–355. [Google Scholar]
- Thomas, P.; Neves, M.; Rocktäschel, T.; Leser, U. WBI-DDI: Drug-Drug Interaction Extraction using Majority Voting. In Proceedings of the 2nd Joint Conference Lexical Computational Semantics, Atlanta, GA, USA, 13–14 June 2013; Volume 2, pp. 628–635. [Google Scholar]
- Björne, J.; Kaewphan, S.; Salakoski, T. UTurku: Drug Named Entity Recognition and Drug-Drug Interaction Extraction Using SVM Classification and Domain Knowledge. In Proceedings of the 2nd Joint Conference Lexical Computational Semantics, Atlanta, GA, USA, 13–14 June 2013; Volume 2, pp. 651–659. [Google Scholar]
- Raihani, A.; Laachfoubi, N. Extracting Drug-drug Interactions from Biomedical Text Using a Feature-based Kernel Approach. J. Theor. Appl. Inf. Technol. 2016, 92, 109–120. [Google Scholar]
- Shi, Y.; Quan, P.; Zhang, T.; Niu, L.F. DREAM: Drug-drug Interaction Extraction with Enhanced Dependency Graph and Attention Mechanism. Methods 2022, 203, 152–159. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Sun, X.; Jin, X.; Sutcliffe, R. Extracting Drug–drug Interactions from No-blinding Texts using Key Semantic Sentences and GHM Loss. J. Biomed. Inform. 2022, 135, 104192. [Google Scholar] [CrossRef] [PubMed]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781v3. [Google Scholar]
- Sahu, S.; Anand, A. Drug-drug Interaction Extraction from Biomedical Texts Using Long Short-term Memory Network. J. Biomed. Inform. 2018, 86, 15–24. [Google Scholar] [CrossRef]
- Liu, S.; Tang, B.; Chen, Q.; Wang, X. Drug-Drug Interaction Extraction via Convolutional Neural Networks. Comput. Math. Methods Med. 2016, 2016, 6918381. [Google Scholar] [CrossRef] [PubMed]
- Quan, C.; Hua, L.; Sun, X.; Bai, W. Multichannel Convolutional Neural Network for Biological Relation Extraction. Biomed. Res. Int. 2016, 2016, 1850404. [Google Scholar] [CrossRef] [PubMed]
- Zhao, Z.; Yang, Z.; Luo, L.; Lin, H.; Wang, J. Drug Drug Interaction Extraction from Biomedical Literature using Syntax Convolutional Neural Network. Bioinformatics 2016, 32, 3444–3453. [Google Scholar] [CrossRef] [PubMed]
- Sun, X.; Dong, K.; Ma, L.; Sutcliffe, R.; He, F.; Chen, S.; Feng, J. Drug-Drug Interaction Extraction via Recurrent Hybrid Convolutional Neural Networks with an Improved Focal Loss. Entropy 2019, 21, 37. [Google Scholar] [CrossRef] [PubMed]
- Lim, S.; Lee, K.; Kang, J. Drug Drug Interaction Extraction from the Literature using a Recursive Neural Network. PLoS ONE 2018, 13, e0190926. [Google Scholar] [CrossRef] [PubMed]
- Liu, J.; Huang, Z.; Ren, F.; Hua, L. Drug-Drug Interaction Extraction based on Transfer Weight Matrix and Memory Network. IEEE Access 2019, 7, 101260–101268. [Google Scholar] [CrossRef]
- Zheng, W.; Lin, H.; Luo, L.; Zhao, Z.; Li, Z.; Zhang, Y.; Yang, Z.; Wang, J. An Attention-based Effective Neural Model for Drug-drug Interactions Extraction. BMC Bioinform. 2017, 18, 445. [Google Scholar] [CrossRef] [PubMed]
- Huang, D.; Jiang, Z.; Li, Z.; Li, L. Drug–drug Interaction Extraction from Biomedical Literature Using Support Vector Machine and Long Short Term Memory Networks. Inform. Sci. 2017, 415–416, 100–109. [Google Scholar] [CrossRef]
- Yi, Z.; Li, S.; Yu, J.; Wu, Q. Drug-drug Interaction Extraction via Recurrent Neural Network with Multiple Attention Layers. Adv. Data Min. Appl. 2017, 10604, 554–566. [Google Scholar] [CrossRef]
- Xu, B.; Shi, X.; Yin, Y.; Zhao, Z.; Zheng, W.; Lin, H.; Yang, Z.; Wang, J.; Xia, F. Incorporating User Generated Content for Drug Drug Interaction Extraction Based on Full Attention Mechanism. IEEE Trans. Nanobiosci. 2019, 18, 360–367. [Google Scholar] [CrossRef]
- Peng, Y.; Yan, S.; Lu, Z. Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. In Proceedings of the 18th BioNLP Workshop and Shared Task, Florence, Italy, 1 August 2019; pp. 58–65. [Google Scholar] [CrossRef]
- Peng, S.; Vijay-Shanker, K. Investigation of Improving the Pre-training and Fine-tuning of BERT model for Biomedical Relation Extraction. BMC Bioinform. 2022, 23, 120. [Google Scholar] [CrossRef]
- Peters, M.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; pp. 2227–2237. Available online: https://aclanthology.org/N18-1202 (accessed on 10 May 2021).
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805v2. [Google Scholar]
- Shah, S.M.A.; Ou, Y.-Y. TRP-BERT: Discrimination of Transient Receptor Potential (TRP) Channels using Contextual Representations from Deep Bidirectional Transformer based on BERT. Comput. Biol. Med. 2021, 137, 104821. [Google Scholar] [CrossRef]
- Luzuriaga, J.; Munoz, E.; Rosales-Mendez, H.; Hogan, A. Merging Web Tables for Relation Extraction with Knowledge Graphs. IEEE Trans. Knowl. Data Eng. 2023, 35, 1803–1816. [Google Scholar] [CrossRef]
- Zhao, Q.; Xu, D.Z.; Li, J.Q.; Zhao, L.; Rajput, F.A. Knowledge Guided Distance Supervision for Biomedical Relation Extraction in Chinese Electronic Medical Records. Expert Syst. Appl. 2022, 204, 117606. [Google Scholar] [CrossRef]
- GitHub. Available online: https://github.com/huangzhong3315/InsertBERT (accessed on 1 December 2022).
- PubMed Dataset. Available online: https://www.nlm.nih.gov/databases/download/pubmed_medline.html (accessed on 23 March 2021).
- PharmNet. Available online: http://www.pharmnet.com.cn/search/ (accessed on 2 May 2022).
- Hugging Face. Available online: https://huggingface.co/bert-base-uncased (accessed on 1 May 2021).
- Tang, Y.; Zhang, L.; Bao, G.; Ren, F.J.; Pedrycz, W. Symmetric Implicational Algorithm Derived from Intuitionistic Fuzzy Entropy. Iran. J. Fuzzy Syst. 2022, 19, 27–44. [Google Scholar] [CrossRef]
- Tang, Y.; Pan, Z.; Pedrycz, W.; Ren, F.; Song, X. Viewpoint-based Kernel Fuzzy Clustering with Weight Information Granules. IEEE Trans. Emerg. Top. Comput. Intell. 2022, in press. [Google Scholar] [CrossRef]
- Yang, J.Q.; Chen, C.H.; Li, J.Y.; Liu, D.; Li, T.; Zhan, Z.H. Compressed-Encoding Particle Swarm Optimization with Fuzzy Learning for Large-Scale Feature Selection. Symmetry 2022, 14, 1142. [Google Scholar] [CrossRef]
Drug Entity A | Drug Entity B | Example |
---|---|---|
Replaced with [MASK] | Reserved | [MASK] increases the clearance of cyclosporine by 15% |
Reserved | Replaced with [MASK] | Terbinafine increases the clearance of [MASK] by 15% |
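The two rows of the table can be generated mechanically. Below is a minimal sketch, assuming whitespace tokenization and already-located entity token indices (both simplifications; the actual pipeline presumably operates on BERT's subword tokens and locates drug entities beforehand):

```python
# Minimal sketch (not the authors' code) of the asymmetrical Entity-Mask
# strategy: each sentence containing a drug pair yields two pre-training
# samples, each masking exactly one entity while keeping the other, so the
# co-occurrence context of the pair is preserved.
# Whitespace tokenization and known entity indices are simplifying assumptions.

from typing import List

MASK = "[MASK]"

def entity_mask_variants(tokens: List[str], ent_a: int, ent_b: int) -> List[List[str]]:
    """ent_a/ent_b are token indices of the two drug mentions."""
    variants = []
    for masked in (ent_a, ent_b):
        copy = list(tokens)
        copy[masked] = MASK      # mask one entity; the other stays intact
        variants.append(copy)
    return variants

tokens = "Terbinafine increases the clearance of cyclosporine by 15%".split()
for v in entity_mask_variants(tokens, ent_a=0, ent_b=5):
    print(" ".join(v))
# [MASK] increases the clearance of cyclosporine by 15%
# Terbinafine increases the clearance of [MASK] by 15%
```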
Hyperparameter | Value |
---|---|
Optimizer | Adam |
Learning rate | 1 × 10⁻⁵
Warm-up rate | 0.1 |
Batch-size | 256 |
Sentence length m | Dynamic padding |
Dimension of word embedding d | 768 |
Number of Transformer blocks | 12 |
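Read as a training configuration, the table maps onto a standard PyTorch/Transformers setup roughly as follows. The epoch count and steps per epoch are placeholders not fixed by the table, and "dynamic padding" is interpreted as padding each batch to its longest sequence:

```python
# Sketch (not the authors' training script) of the hyperparameters in the table,
# expressed in PyTorch + Hugging Face terms.

import torch
from transformers import (BertModel, BertTokenizerFast,
                          DataCollatorWithPadding, get_linear_schedule_with_warmup)

model = BertModel.from_pretrained("bert-base-uncased")     # 12 Transformer blocks, hidden size 768
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
collator = DataCollatorWithPadding(tokenizer)              # dynamic padding per batch

batch_size = 256
num_epochs, steps_per_epoch = 10, 100                      # placeholders, not fixed by the table
total_steps = num_epochs * steps_per_epoch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),               # warm-up rate 0.1
    num_training_steps=total_steps,
)
```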
Relation | Train (DrugBank) | Train (MedLine) | Train (Overall) | Test (DrugBank) | Test (MedLine) | Test (Overall)
---|---|---|---|---|---|---
Advice | 818 | 8 | 826 | 214 | 7 | 221
Effect | 1535 | 152 | 1687 | 298 | 62 | 360
Mechanism | 1257 | 62 | 1319 | 278 | 24 | 302
Int | 178 | 10 | 188 | 94 | 2 | 96
Other | 22,118 | 1547 | 23,665 | 4367 | 345 | 4712
Relation | Train (DrugBank) | Train (MedLine) | Train (Overall) | Test (DrugBank) | Test (MedLine) | Test (Overall)
---|---|---|---|---|---|---
Advice | 815 | 7 | 822 | 214 | 7 | 221
Effect | 1517 | 152 | 1669 | 298 | 62 | 360
Mechanism | 1257 | 62 | 1319 | 278 | 21 | 299
Int | 178 | 10 | 188 | 94 | 2 | 96
Other | 14,445 | 1179 | 15,624 | 2819 | 243 | 3062
Per-relation results of Basic-BERT initialization + Symbol-Insert-BERT vs. Entity-Mask-BERT initialization + Symbol-Insert-BERT (EMSI-BERT); the Overall row is the micro average:

Relation | P (Basic-BERT init.) | R (Basic-BERT init.) | F1 (Basic-BERT init.) | P (EMSI-BERT) | R (EMSI-BERT) | F1 (EMSI-BERT)
---|---|---|---|---|---|---
Advice | 87.85 | 85.84 | 86.83 | 85.52 | 88.20 | 86.86
Effect | 77.89 | 81.18 | 79.50 | 79.56 | 82.20 | 80.77
Mechanism | 82.97 | 76.84 | 79.79 | 88.46 | 84.89 | 86.64
Int | 66.17 | 46.80 | 54.87 | 72.13 | 45.83 | 56.05
Overall | 80.83 | 77.50 | 79.13 | 83.22 | 80.74 | 81.96
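For reference, the per-class P, R, and F1 values above, and the micro-averaged F1 used in the comparison tables below, follow the standard definitions (in the DDIExtraction 2013 evaluation, micro-averaging is conventionally pooled over the four positive classes, i.e., excluding Other):

```latex
% Per-class precision, recall, and F1, with TP/FP/FN counted for class c:
P_c = \frac{TP_c}{TP_c + FP_c}, \qquad
R_c = \frac{TP_c}{TP_c + FN_c}, \qquad
F1_c = \frac{2\,P_c R_c}{P_c + R_c}

% Micro-averaging pools the counts across classes before forming the ratios:
P_{micro} = \frac{\sum_c TP_c}{\sum_c (TP_c + FP_c)}, \qquad
R_{micro} = \frac{\sum_c TP_c}{\sum_c (TP_c + FN_c)}, \qquad
F1_{micro} = \frac{2\,P_{micro}\,R_{micro}}{P_{micro} + R_{micro}}
```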
Confusion matrix of Basic-BERT initialization + Symbol-Insert-BERT (rows are true relations, values in %; diagonal entries correspond to the recall values above):

True \ Predicted | Advice | Effect | Mechanism | Int | Other
---|---|---|---|---|---
Advice | 85.8 | 0.4 | 0 | 0.9 | 12.7
Effect | 1.6 | 81.1 | 2.8 | 0.2 | 14.0
Mechanism | 2.0 | 1.3 | 76.8 | 3.3 | 16.4
Int | 0 | 40.6 | 2.0 | 46.8 | 10.4
Other | 0.4 | 1.3 | 1.2 | 0.3 | 96.6
Confusion matrix of EMSI-BERT (rows are true relations, values in %; diagonal entries correspond to the recall values above):

True \ Predicted | Advice | Effect | Mechanism | Int | Other
---|---|---|---|---|---
Advice | 88.2 | 0.4 | 0 | 0.9 | 10.4
Effect | 3.3 | 82.0 | 0.8 | 0 | 13.7
Mechanism | 2.3 | 1.6 | 84.8 | 0 | 11.0
Int | 0 | 38.5 | 2.0 | 45.8 | 13.5
Other | 0.2 | 0.6 | 0.5 | 0.3 | 98.1
Method | Advice | Effect | Mechanism | Int | Dec | Micro-Averaged F1-Score |
---|---|---|---|---|---|---|
Kim et al. [26] | 72.5 | 66.2 | 69.3 | 48.3 | 77.5 | 67.0 |
FBK-irst [27] | 69.2 | 62.8 | 67.9 | 54.0 | 80.0 | 65.1 |
WBI [28] | 63.2 | 61.0 | 61.8 | 51.0 | 75.9 | 60.9 |
UTurku [29] | 63.0 | 60.0 | 58.2 | 50.7 | 69.6 | 59.4 |
RBF-Linear [30] | 77.4 | 69.6 | 73.6 | 52.4 | 81.5 | 71.1 |
EMSI-BERT | 86.8 | 80.7 | 86.6 | 56.0 | 88.0 | 82.0 |
Model | Method | Advice | Effect | Mechanism | Int | Dec | Micro-Averaged F1-Score
---|---|---|---|---|---|---|---
CNN | CNN [35] | 77.7 | 69.3 | 70.2 | 46.3 | - | 69.8
CNN | SCNN [37] | - * | - | - | - | 77.2 | 68.4
CNN | MCNN [36] | 78.0 | 68.2 | 72.2 | 51.0 | 79.0 | 70.2
CNN | RHCNN [38] | 80.5 | 73.5 | 78.3 | 58.9 | - | 75.5
CNN | AGCN [10] | 86.2 | 74.2 | 78.7 | 52.6 | - | 76.9
RNN | Hierarchical RNN [18] | 80.3 | 71.8 | 74.0 | 54.3 | - | 72.9
RNN | TM-RNN [40] | 76.5 | 70.6 | 76.4 | 52.3 | - | 72.4
RNN | DREAM [31] | 84.8 | 76.1 | 81.6 | 55.1 | - | 78.3
RNN | Joint-LSTM [34] | 79.4 | 67.6 | 76.3 | 43.1 | - | 71.5
RNN | M-BLSTM [19] | 80.1 | 70.4 | 73.0 | 48.0 | 78.5 | 71.8
RNN | PM-BLSTM [19] | 81.6 | 71.3 | 74.4 | 48.6 | 78.9 | 73.0
RNN | Att-BLSTM [41] | 85.1 | 76.6 | 77.5 | 57.7 | 84.0 | 77.3
RNN | BLSTM-SVM [42] | 71.4 | 69.9 | 72.8 | 52.8 | - | 69.0
RNN | Hierarchical BLSTMs [14] | 81.9 | 77.4 | 78.0 | 58.4 | - | 78.5
RNN | GRU [43] | - | - | - | - | - | 72.2
RNN | SGRU-CNN [17] | 82.8 | 72.2 | 78.0 | 50.4 | - | 74.7
RNN | UGC-DDI [44] | 76.4 | 68.5 | 76.5 | 45.5 | - | 71.2
BERT | Basic-BERT [45] | - | - | - | - | - | 79.9
BERT | BioBERT [15] | 86.1 | 80.1 | 84.6 | 56.6 | - | 80.9
BERT | EMSI-BERT | 86.8 | 80.7 | 86.6 | 56.0 | 88.0 | 82.0

* "-" indicates a result not reported by the corresponding work.
Model Structure | Micro-Averaged F1-Score
---|---
Basic-BERT initialization + Symbol-Insert structure | 79.0
Entity-Mask-BERT + Symbol-Insert structure (EMSI-BERT) | 82.0