arXiv:2306.06918 (cs)

[Submitted on 12 Jun 2023 (v1), last revised 15 Jun 2023 (this version, v2)]

Title: The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation

Authors: Hao Peng, Xiaozhi Wang, Feng Yao, Kaisheng Zeng, Lei Hou, Juanzi Li, Zhiyuan Liu, Weixing Shen

Abstract: Event extraction (EE) is a crucial task that aims to extract events from text and comprises two subtasks: event detection (ED) and event argument extraction (EAE). In this paper, we check the reliability of EE evaluations and identify three major pitfalls: (1) Data preprocessing discrepancies make evaluation results on the same dataset not directly comparable, yet preprocessing details are rarely noted or specified in papers. (2) Output space discrepancies between model paradigms leave EE models of different paradigms without grounds for comparison and also lead to unclear mappings between predictions and annotations. (3) The absence of pipeline evaluation in many EAE-only works makes them hard to compare directly with EE works, and their reported results may not reflect model performance in real-world pipeline scenarios. We demonstrate the significant influence of these pitfalls through comprehensive meta-analyses of recent papers and empirical experiments. To avoid these pitfalls, we suggest a series of remedies, including specifying data preprocessing, standardizing outputs, and providing pipeline evaluation results. To help implement these remedies, we develop a consistent evaluation framework, OMNIEVENT, which can be obtained from this https URL.
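
To make pitfall (3) concrete, the following is a minimal sketch of how an EAE score can differ between a gold-trigger setting and a pipeline setting. It is an illustration under stated assumptions, not the OMNIEVENT API or the authors' evaluation code: the tuple layouts, the `micro_f1` helper, and the toy data are all hypothetical. An argument is counted as correct only when its trigger, span, and role all match the annotation.

```python
from typing import Set, Tuple

# Hypothetical record layouts, chosen only for this illustration.
Trigger = Tuple[int, int, str]             # (start, end, event_type)
Argument = Tuple[Trigger, int, int, str]   # (trigger, start, end, role)


def micro_f1(predicted: Set[Argument], gold: Set[Argument]) -> float:
    """Micro-averaged F1 over exact-match argument tuples."""
    if not predicted or not gold:
        return 0.0
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy annotations: two gold events with three gold arguments in total.
t1: Trigger = (2, 3, "Attack")
t2: Trigger = (10, 11, "Transport")
gold_args: Set[Argument] = {
    (t1, 0, 1, "Attacker"),
    (t1, 5, 6, "Target"),
    (t2, 8, 9, "Destination"),
}

# Gold-trigger setting: the EAE model is handed t1 and t2 directly
# and, in this toy case, recovers every argument.
gold_setting_preds: Set[Argument] = set(gold_args)

# Pipeline setting: an upstream ED model finds t1, misses t2, and
# emits a spurious trigger t3. The same EAE model then runs on the
# predicted triggers, so it cannot recover t2's argument and it
# attaches one argument to the spurious event.
t3: Trigger = (14, 15, "Meet")
pipeline_preds: Set[Argument] = {
    (t1, 0, 1, "Attacker"),
    (t1, 5, 6, "Target"),
    (t3, 12, 13, "Place"),
}

gold_trigger_f1 = micro_f1(gold_setting_preds, gold_args)
pipeline_f1 = micro_f1(pipeline_preds, gold_args)
print(f"gold-trigger EAE F1: {gold_trigger_f1:.3f}")
print(f"pipeline EAE F1:     {pipeline_f1:.3f}")
```

In this toy case the same EAE behaviour scores lower in the pipeline setting purely because of upstream detection errors, which is why a number reported only under gold triggers is hard to compare with an end-to-end EE result.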

Comments:	Accepted at Findings of ACL 2023
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.06918 [cs.CL]
	(or arXiv:2306.06918v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.06918

Submission history

From: Hao Peng
[v1] Mon, 12 Jun 2023 07:38:31 UTC (776 KB)
[v2] Thu, 15 Jun 2023 07:23:57 UTC (776 KB)
