Abstract
Entity recognition and disambiguation (ERD) is a crucial technique for knowledge base population and information extraction. In recent years, numerous papers have been published on this subject, and various ERD systems have been developed. However, there is still confusion in the ERD field over how to compare these systems fairly and completely. There is therefore growing interest in a unified evaluation framework. In this paper, we present an easy-to-use evaluation framework (EUEF), which aims at facilitating the evaluation process and providing a fair comparison of ERD systems. EUEF is carefully designed and released to the public as open source, and thus can be easily extended with novel ERD systems, datasets, and evaluation metrics. EUEF makes it easy to discover the advantages and disadvantages of a specific ERD system and its components. We compare several popular and publicly available ERD systems using EUEF, and draw some interesting conclusions after a detailed analysis.
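The extensibility the abstract describes (plugging in new systems, datasets, and metrics) is typically achieved through a small set of plugin interfaces. The sketch below is purely illustrative and does not reflect EUEF's actual API; all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical plugin interfaces for an extensible ERD benchmarking framework.

@dataclass(frozen=True)
class Annotation:
    start: int    # character offset where the mention begins
    end: int      # character offset where the mention ends
    entity: str   # knowledge-base identifier, e.g. a Wikipedia title

class Dataset:
    """A benchmark dataset: documents paired with gold-standard annotations."""
    def __init__(self, docs: List[str], gold: List[List[Annotation]]):
        self.docs, self.gold = docs, gold

class ERDSystem:
    """Plugin interface: an ERD system annotates raw text."""
    def annotate(self, text: str) -> List[Annotation]:
        raise NotImplementedError

class Metric:
    """Plugin interface: a metric scores predictions against gold annotations."""
    def score(self, pred: List[List[Annotation]],
              gold: List[List[Annotation]]) -> float:
        raise NotImplementedError

class MicroF1(Metric):
    """Strict matching: a prediction counts only if offsets and entity all agree."""
    def score(self, pred, gold):
        tp = sum(len(set(p) & set(g)) for p, g in zip(pred, gold))
        n_pred = sum(len(p) for p in pred)
        n_gold = sum(len(g) for g in gold)
        if n_pred == 0 or n_gold == 0:
            return 0.0
        precision, recall = tp / n_pred, tp / n_gold
        if precision + recall == 0:
            return 0.0
        return 2 * precision * recall / (precision + recall)

def evaluate(system: ERDSystem, dataset: Dataset, metric: Metric) -> float:
    """Run a system over every document and score it with the given metric."""
    pred = [system.annotate(doc) for doc in dataset.docs]
    return metric.score(pred, dataset.gold)
```

Under this design, adding a new system, dataset, or metric means implementing one small interface, which is what allows a framework of this kind to give all systems an identical evaluation pipeline.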
Additional information
Project supported by the National Natural Science Foundation of China (No. 61572434), the China Knowledge Centre for Engineering Sciences and Technology (No. CKC-EST-2015-2-5), and the Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP), China (No. 20130101110-136)
ORCID: Hui CHEN, http://orcid.org/0000-0001-9709-977X
Cite this article
Chen, H., Wei, B.G., Li, Y.M., et al. An easy-to-use evaluation framework for benchmarking entity recognition and disambiguation systems. Frontiers Inf Technol Electronic Eng 18, 195–205 (2017). https://doi.org/10.1631/FITEE.1500473