Document-Level Event Argument Extraction with Sparse Representation Attention
Figure 1. Example of DEAE. The event trigger word is marked in **bold** text, and the arguments are marked in red text with underlines.
Figure 2. Overview of our APSR. Given a document as input, we first encode it with an inter-sentential encoder and an intra-sentential encoder equipped with a mask matrix *M*; the sparse argument representation mask matrix *M* is detailed in the dashed box below. The AMR parser module then constructs semantic graphs to facilitate semantic interaction. Finally, we fuse the argument representations from the two encoders and predict which argument role each candidate span plays.
Figure 3. Case study on an instance from the RAMS test set. The event trigger word is marked in red **bold** text, and the arguments are marked in **bold** text with underlines.
Abstract
1. Introduction
- We propose a span-based model for DEAE with a sparse argument representation encoder, which consists of inter- and intra-sentential encoders with a well-designed sparse argument attention mechanism to encode the document from different perspectives.
- We propose three types of sparse argument attention masks (i.e., sequential, flashback, and banded), which are capable of introducing useful linguistic bias about how documents are written.
- Experimental results on two widely used benchmark datasets, i.e., RAMS and WikiEvents, validate APSR’s superiority over the state-of-the-art baselines.
2. Related Work
2.1. Generation-Based DEAE Method
2.2. Span-Based DEAE Method
3. Approach
3.1. Task Formulation
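The formal definition is not reproduced in this version of the text. As a hedged illustration of the span-based setup implied by Figure 2 (a document, an event trigger, and candidate spans to be labeled with argument roles or rejected), the inputs and outputs could be typed as follows; all names are ours, not the paper's:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DEAEInstance:
    """One document-level event argument extraction instance (illustrative)."""
    tokens: List[str]                   # the whole document, tokenized
    sent_bounds: List[Tuple[int, int]]  # [start, end) token offsets per sentence
    trigger: Tuple[int, int]            # token span of the event trigger
    event_type: str                     # e.g., "conflict.attack"

@dataclass
class ArgumentPrediction:
    """A candidate span and the role the model assigns to it."""
    span: Tuple[int, int]               # [start, end) token offsets
    role: str                           # an argument role, or "none"
```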
3.2. Sparse Argument Representation Encoder
- Sequential. A sequential narrative presents events or information in the order in which they occur, chronologically or logically. We assume the events are described in sequential order; that is, tokens in a former sentence can attend to tokens in a latter one.
- Flashback. A flashback inserts scenes or events from the past into the current timeline of a story (e.g., in historical documentaries and literature), so tokens in a latter sentence can attend to tokens in a former one.
- Banded. Since the arguments of an event are mostly scattered across neighboring sentences, tokens can only attend to tokens in sentences within a hop distance of 3. A sketch of all three masks follows this list.
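The mask equations themselves are not reproduced here, so the following is a minimal sketch, under the assumption that each mask is realized as a token-by-token boolean matrix derived from sentence indices; the function and variable names are ours, not the paper's:

```python
import torch

def sparse_argument_masks(sent_ids: torch.Tensor, hop: int = 3):
    """Build the sequential, flashback, and banded attention masks.

    sent_ids: LongTensor of shape [seq_len], the sentence index of each token.
    Returns three bool tensors of shape [seq_len, seq_len], where entry (i, j)
    is True if token i is allowed to attend to token j.
    """
    si = sent_ids.unsqueeze(1)       # sentence index of the attending token i
    sj = sent_ids.unsqueeze(0)       # sentence index of the attended token j
    sequential = si <= sj            # former sentences see latter ones
    flashback = si >= sj             # latter sentences see former ones
    banded = (si - sj).abs() <= hop  # only neighbors within 3 sentence hops
    return sequential, flashback, banded
```

The chosen mask *M* would then block the disallowed positions before the attention softmax, e.g., `scores.masked_fill(~mask, float("-inf"))`.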
3.3. AMR Parser Module
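The body of this subsection is not reproduced above. As a rough, hedged sketch of the graph-encoding step described in the caption of Figure 2 (an AMR graph, e.g., produced by a transition-based parser [48], encoded with graph convolutions in the spirit of [49]), message passing over an AMR-derived adjacency matrix could look like this; it is our illustration, not the authors' implementation:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer over an AMR semantic graph (illustrative)."""

    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   [num_nodes, dim] node states (e.g., spans aligned to AMR nodes)
        # adj: [num_nodes, num_nodes] adjacency matrix with self-loops
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)  # degree normalizer
        return torch.relu(self.linear(adj @ h) / deg)
```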
3.4. Fusion and Classification Module
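Per the caption of Figure 2, the argument representations from the two encoders are fused before the model predicts which role a candidate span plays. A minimal sketch follows; the gated fusion and the extra "no role" class are our assumptions, not necessarily the authors' exact design:

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Fuse inter-/intra-sentential span representations and predict roles."""

    def __init__(self, dim: int, num_roles: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)
        self.classifier = nn.Linear(dim, num_roles + 1)  # +1 for "no role"

    def forward(self, h_inter: torch.Tensor, h_intra: torch.Tensor):
        # h_inter, h_intra: [num_spans, dim] candidate-span representations
        g = torch.sigmoid(self.gate(torch.cat([h_inter, h_intra], dim=-1)))
        fused = g * h_inter + (1.0 - g) * h_intra   # element-wise gated mix
        return self.classifier(fused)               # [num_spans, num_roles + 1]
```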
4. Experiments
4.1. Research Questions
- RQ1: Can our proposed APSR model enhance the performance of DEAE compared with state-of-the-art baselines?
- RQ2: Which part of APSR contributes the most to the extraction accuracy?
- RQ3: How effective is the sparse argument attention mechanism for DEAE?
- RQ4: Does APSR solve the issues caused by the long-range dependency and distracting context?
4.2. Datasets and Evaluation Metrics
- Head F1: focuses exclusively on the accuracy of the head word within the event argument span. This metric evaluates the model's ability to identify the core word of the argument.
- Span F1: evaluates whether the predicted argument spans align exactly with the gold ones (a computation sketch follows this list). This metric assesses both the recognition of the argument and the precision of its boundaries.
- Coref F1: measures the agreement between the extracted argument and the gold-standard argument [51] in terms of coreference. This metric emphasizes the model's ability to maintain contextual consistency.
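As a worked illustration of the exact-match criterion behind Span F1 (a minimal sketch; the official RAMS and WikiEvents scorers may differ in details such as head-word matching and coreference handling):

```python
def span_f1(pred: set, gold: set) -> float:
    """Exact-match Span F1 over (doc_id, start, end, role) tuples."""
    tp = len(pred & gold)                         # spans matching gold exactly
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g., span_f1({("d1", 3, 5, "victim")}, {("d1", 3, 5, "victim")}) == 1.0
```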
4.3. Compared Baselines
- BERT-CRF [52], the first model to use a BERT-based BIO tagging scheme for semantic role labeling;
- A variant of BERT-CRF [27] that applies greedy decoding and a type-constrained decoding mechanism;
- Two-Step [23], the first approach to identify the head words of event arguments;
- A variant of the Two-Step model [27], a span-based method that applies the type-constrained decoding mechanism;
- TSAR [30], a two-stream span-based model with an AMR-guided interaction mechanism.
4.4. Experimental Settings
5. Results and Discussion
5.1. Overall Performance
5.2. Ablation Study
5.3. Error Analysis
5.4. Case Study
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sankepally, R. Event information retrieval from text. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; p. 1447.
- Fincke, S.; Agarwal, S.; Miller, S.; Boschee, E. Language model priming for cross-lingual event extraction. Proc. AAAI Conf. Artif. Intell. 2022, 36, 10627–10635.
- Bosselut, A.; Choi, Y. Dynamic knowledge graph construction for zero-shot commonsense question answering. arXiv 2019, arXiv:1911.03876.
- Guan, S.; Cheng, X.; Bai, L.; Zhang, F.; Li, Z.; Zeng, Y.; Jin, X.; Guo, J. What is event knowledge graph: A survey. IEEE Trans. Knowl. Data Eng. 2022, 35, 7569–7589.
- Liu, C.Y.; Zhou, C.; Wu, J.; Xie, H.; Hu, Y.; Guo, L. CPMF: A collective pairwise matrix factorization model for upcoming event recommendation. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1532–1539.
- Horowitz, D.; Contreras, D.; Salamó, M. EventAware: A mobile recommender system for events. Pattern Recognit. Lett. 2018, 105, 121–134.
- Li, M.; Zareian, A.; Lin, Y.; Pan, X.; Whitehead, S.; Chen, B.; Wu, B.; Ji, H.; Chang, S.F.; Voss, C.; et al. Gaia: A fine-grained multimedia knowledge extraction system. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Online, 5–10 July 2020; pp. 77–86.
- Souza Costa, T.; Gottschalk, S.; Demidova, E. Event-QA: A dataset for event-centric question answering over knowledge graphs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, 19–23 October 2020; pp. 3157–3164.
- Wang, J.; Jatowt, A.; Färber, M.; Yoshikawa, M. Improving question answering for event-focused questions in temporal collections of news articles. Inf. Retr. J. 2021, 24, 29–54.
- Nguyen, T.H.; Cho, K.; Grishman, R. Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 300–309.
- Liu, X.; Luo, Z.; Huang, H. Jointly multiple events extraction via attention-based graph information aggregation. arXiv 2018, arXiv:1809.09078.
- Yang, S.; Feng, D.; Qiao, L.; Kan, Z.; Li, D. Exploring pre-trained language models for event extraction and generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 5284–5294.
- Du, X.; Cardie, C. Event extraction by answering (almost) natural questions. arXiv 2020, arXiv:2004.13625.
- Wei, K.; Sun, X.; Zhang, Z.; Zhang, J.; Zhi, G.; Jin, L. Trigger is not sufficient: Exploiting frame-aware knowledge for implicit event argument extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Virtual Event, 1–6 August 2021; pp. 4672–4682.
- Paolini, G.; Athiwaratkun, B.; Krone, J.; Ma, J.; Achille, A.; Anubhai, R.; Santos, C.N.d.; Xiang, B.; Soatto, S. Structured prediction as translation between augmented natural languages. arXiv 2021, arXiv:2101.05779.
- Hsu, I.H.; Huang, K.H.; Boschee, E.; Miller, S.; Natarajan, P.; Chang, K.W.; Peng, N. DEGREE: A data-efficient generation-based event extraction model. arXiv 2021, arXiv:2108.12724.
- Lu, Y.; Liu, Q.; Dai, D.; Xiao, X.; Lin, H.; Han, X.; Sun, L.; Wu, H. Unified structure generation for universal information extraction. arXiv 2022, arXiv:2203.12277.
- Lu, Y.; Lin, H.; Xu, J.; Han, X.; Tang, J.; Li, A.; Sun, L.; Liao, M.; Chen, S. Text2Event: Controllable sequence-to-structure generation for end-to-end event extraction. arXiv 2021, arXiv:2106.09232.
- Li, S.; Ji, H.; Han, J. Document-level event argument extraction by conditional generation. arXiv 2021, arXiv:2104.05919.
- Liu, X.; Huang, H.; Shi, G.; Wang, B. Dynamic prefix-tuning for generative template-based event extraction. arXiv 2022, arXiv:2205.06166.
- Du, X.; Ji, H. Retrieval-augmented generative question answering for event argument extraction. arXiv 2022, arXiv:2211.07067.
- Liu, J.; Chen, Y.; Xu, J. Machine reading comprehension as data augmentation: A case study on implicit event argument extraction. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Virtual Event, 7–11 November 2021; pp. 2716–2725.
- Zhang, Z.; Kong, X.; Liu, Z.; Ma, X.; Hovy, E. A two-step approach for implicit event argument detection. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 7479–7485.
- Dai, L.; Wang, B.; Xiang, W.; Mo, Y. Bi-directional iterative prompt-tuning for event argument extraction. arXiv 2022, arXiv:2210.15843.
- Yang, X.; Lu, Y.; Petzold, L. Few-shot document-level event argument extraction. arXiv 2022, arXiv:2209.02203.
- He, Y.; Hu, J.; Tang, B. Revisiting event argument extraction: Can EAE models learn better when being aware of event co-occurrences? arXiv 2023, arXiv:2306.00502.
- Ebner, S.; Xia, P.; Culkin, R.; Rawlins, K.; Van Durme, B. Multi-sentence argument linking. arXiv 2019, arXiv:1911.03766.
- Lin, J.; Chen, Q.; Zhou, J.; Jin, J.; He, L. CUP: Curriculum learning based prompt tuning for implicit event argument extraction. arXiv 2022, arXiv:2205.00498.
- Fan, S.; Wang, Y.; Li, J.; Zhang, Z.; Shang, S.; Han, P. Interactive information extraction by semantic information graph. In Proceedings of the IJCAI, Vienna, Austria, 23–29 July 2022; pp. 4100–4106.
- Xu, R.; Wang, P.; Liu, T.; Zeng, S.; Chang, B.; Sui, Z. A two-stream AMR-enhanced model for document-level event argument extraction. arXiv 2022, arXiv:2205.00241.
- Zhang, Z.; Ji, H. Abstract meaning representation guided graph encoding and decoding for joint information extraction. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), Online, 6–11 June 2021.
- Hsu, I.; Xie, Z.; Huang, K.H.; Natarajan, P.; Peng, N. AMPERE: AMR-aware prefix for generation-based event argument extraction model. arXiv 2023, arXiv:2305.16734.
- He, K.; Chen, X.; Xie, S.; Li, Y.; Dollár, P.; Girshick, R. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 16000–16009.
- Yuan, C.; Huang, H.; Cao, Y.; Wen, Y. Discriminative reasoning with sparse event representation for document-level event-event relation extraction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Toronto, ON, Canada, 9–14 July 2023.
- Grishman, R.; Sundheim, B.M. Message Understanding Conference-6: A brief history. In Proceedings of the 16th Conference on Computational Linguistics—Volume 1 (COLING 1996), Copenhagen, Denmark, 5–9 August 1996; pp. 466–471.
- Zhou, J.; Shuang, K.; Wang, Q.; Yao, X. EACE: A document-level event argument extraction model with argument constraint enhancement. Inf. Process. Manag. 2024, 61, 103559.
- Zeng, Q.; Zhan, Q.; Ji, H. EA2E: Improving consistency with event awareness for document-level argument extraction. arXiv 2022, arXiv:2205.14847.
- Zhang, K.; Shuang, K.; Yang, X.; Yao, X.; Guo, J. What is overlap knowledge in event argument extraction? APE: A cross-datasets transfer learning model for EAE. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 393–409.
- Lin, Z.; Zhang, H.; Song, Y. Global constraints with prompting for zero-shot event argument classification. arXiv 2023, arXiv:2302.04459.
- Cao, P.; Jin, Z.; Chen, Y.; Liu, K.; Zhao, J. Zero-shot cross-lingual event argument extraction with language-oriented prefix-tuning. Proc. AAAI Conf. Artif. Intell. 2023, 37, 12589–12597.
- Liu, W.; Cheng, S.; Zeng, D.; Qu, H. Enhancing document-level event argument extraction with contextual clues and role relevance. arXiv 2023, arXiv:2310.05991.
- Li, F.; Peng, W.; Chen, Y.; Wang, Q.; Pan, L.; Lyu, Y.; Zhu, Y. Event extraction as multi-turn question answering. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16–20 November 2020; pp. 829–838.
- Zhou, Y.; Chen, Y.; Zhao, J.; Wu, Y.; Xu, J.; Li, J. What the role is vs. what plays the role: Semi-supervised event argument extraction via dual question answering. Proc. AAAI Conf. Artif. Intell. 2021, 35, 14638–14646.
- Banarescu, L.; Bonial, C.; Cai, S.; Georgescu, M.; Griffitt, K.; Hermjakob, U.; Knight, K.; Koehn, P.; Palmer, M.; Schneider, N. Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Sofia, Bulgaria, 8–9 August 2013; pp. 178–186.
- Yang, Y.; Guo, Q.; Hu, X.; Zhang, Y.; Qiu, X.; Zhang, Z. An AMR-based link prediction approach for document-level event argument extraction. arXiv 2023, arXiv:2305.19162.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762.
- Astudillo, R.F.; Ballesteros, M.; Naseem, T.; Blodgett, A.; Florian, R. Transition-based parsing with stack-transformers. arXiv 2020, arXiv:2010.10669.
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
- Zeng, S.; Xu, R.; Chang, B.; Li, L. Double graph based reasoning for document-level relation extraction. arXiv 2020, arXiv:2009.13752.
- Ji, H.; Grishman, R. Refining event extraction through cross-document inference. In Proceedings of ACL-08: HLT, the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Columbus, OH, USA, 15–20 June 2008; pp. 254–262.
- Shi, P.; Lin, J. Simple BERT models for relation extraction and semantic role labeling. arXiv 2019, arXiv:1904.05255.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
| Angle | Generation-Based | Span-Based |
|---|---|---|
| Strength | Effectively solves the same-role argument assignment problem | Effectively captures cross-sentence and multi-hop structures |
| Weakness | Exhibits limitations when dealing with long-distance arguments | Mainly takes the graph structure as additional features to enrich span representations, ignoring the written pattern of the document |
| Dataset | Split | #Docs | #Events | #Arguments |
|---|---|---|---|---|
| RAMS * | Train | 3194 | 7329 | 17,026 |
|  | Dev | 399 | 924 | 2188 |
|  | Test | 400 | 871 | 2023 |
| WikiEvents ** | Train | 206 | 3241 | 4542 |
|  | Dev | 20 | 345 | 428 |
|  | Test | 20 | 365 | 566 |
| Method | Dev Span F1 | Dev Head F1 | Test Span F1 | Test Head F1 |
|---|---|---|---|---|
| BERT-CRF | 38.1 | 45.7 | 39.3 | 47.1 |
| + Type-constrained decoding [27] | 39.2 | 46.7 | 40.5 | 48.0 |
| Two-Step | 38.9 | 46.4 | 40.1 | 47.7 |
| + Type-constrained decoding [27] | 40.3 | 48.0 | 41.8 | 49.7 |
| TSAR | 45.50 | 51.66 | 47.13 | 53.75 |
| APSR (sequential) | 45.85 | 51.98 | 47.28 | 55.02 |
| APSR (flashback) | 45.56 | 51.70 | 47.16 | 54.18 |
| APSR (banded) | 44.88 | 52.26 | 46.86 | 53.63 |
| Method | Arg Identification Head F1 | Arg Identification Coref F1 | Arg Classification Head F1 | Arg Classification Coref F1 |
|---|---|---|---|---|
| BERT-CRF | 69.83 | 72.24 | 54.48 | 56.72 |
| BERT-QA | 61.05 | 64.59 | 56.16 | 59.36 |
| BERT-QA-Doc | 39.15 | 51.25 | 34.77 | 45.96 |
| TSAR | 74.44 | 72.37 | 67.10 | 65.79 |
| APSR (sequential) | 75.20 | 73.05 | 67.14 | 65.53 |
| APSR (flashback) | 76.60 | 75.49 | 69.57 | 68.83 |
| APSR (banded) | 73.08 | 71.43 | 65.93 | 64.65 |
| Method | Arg Identification Head F1 | Arg Identification Coref F1 | Arg Classification Head F1 | Arg Classification Coref F1 |
|---|---|---|---|---|
| APSR | 76.60 | 75.49 | 69.57 | 68.83 |
| – Intra-sentential Encoder | 76.13 | 74.06 | 69.55 | 67.86 |
| – Inter-sentential Encoder | 73.94 | 72.52 | 66.67 | 65.78 |
| Model | Missing Head | Wrong Span | Wrong Role | Over-Extract |
|---|---|---|---|---|
| TSAR | 54 | 61 | 15 | 30 |
| APSR | 50 | 54 | 11 | 23 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).