Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

DBpedia-Based Entity Linking via Greedy Search and Adjusted Monte Carlo Random Walk

Published: 31 August 2017 Publication History

Abstract

Facing a large amount of entities appearing on the web, entity linking has recently become useful. It assigns an entity from a resource to one name mention to help users grasp the meaning of this name mention. Unfortunately, many possible entities can be assigned to one name mention. Apparently, the usually co-occurring name mentions are related and can be considered together to determine their best assignments. This approach is called collective entity linking and is often conducted based on entity graph. However, traditional collective entity linking methods either consume much time due to the large scale of entity graph or obtain low accuracy due to simplifying graph. To improve both accuracy and efficiency, this article proposes a novel collective entity linking algorithm. It first constructs an entity graph by connecting any two related entities, and then a probability-based objective function is proposed on this graph to ensure the high accuracy of the linking result. Via this function, we convert entity linking to the process of finding the nodes with the highest PageRank Values. Greedy search and an adjusted Monte Carlo random walk are proposed to fulfill this work. Experimental results demonstrate that our algorithm performs much better than traditional linking methods.

References

[1]
K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. 2007. Monte Carlo methods in pagerank computation: When one iteration is sufficient. SIAM J. Numer. Anal. 45, 2(2007), 890--904.
[2]
Junil Ahn and Kiseon Kim. 2012. Lower bound on expected complexity of depth-first tree search with multiple radii. IEEE Commun. Lett. 16, 6(2012), 805--808.
[3]
Ayman Alhelbawy and Robert Gaizauskas. 2014. Graph ranking for collective named entity disambiguation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. ACL, 75--80.
[4]
Omar Benjelloun, Hector Garcia-Molina, David Menestrina, Qi Su, Steven Euijong Whang, and Jennifer Widom. 2009. Swoosh: A generic approach to entity resolution. Int. J.VLDB 18, 1(2009), 255--276.
[5]
Roi Blanco, Giuseppe Ottaviano, and Edgar Meij. 2015. Fast and space-efficient entity linking in queries. In Proceedings of the 8th ACM International Conference on Web Search and Data Mining. ACM, 179--188.
[6]
Gianna M. Del Corso, Antonio Gullí, and Francesco Romani. 2005. Pagerank computation via a sparse linear system. Internet Math. 2, 3(2005), 251--273.
[7]
Silviu Cucerzan. 2007. Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. ACL, 708--716.
[8]
Diego Ceccarelli, Claudio Lucchese, Salvatore Orlando, Raffaele Perego, and Salvatore Trani. 2013. Learning relatedness measures for entity linking. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management. ACM, 139--148.
[9]
Sara Cohen. 2013. Indexing for subtree similarity-search using edit distance. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, New York, NY, 49--60.
[10]
Joachim Daiber, Max Jakob, Chris Hokamp, and Pablo N. Mendes. 2009. Improving efficiency and accuracy in multilingual entity extraction. In Proceedings of the 9th International Conference on Semantic Systems. ACM, Graz, Austria, 121--124.
[11]
Ben Hachey, Will Radford, and James R. Curran. 2011. Graph-based named entity linking with Wikipedia. In Proceedings of the 12th International Conference on Web Information System Engineering. ACM, 213--226.
[12]
Xianpei Han and Le Sun. 2011. A generative entity-mention model for linking entities with knowledge base. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. ACL, 945--954.
[13]
Xianpei Han, Le Sun, and Jun Zhao. 2011. Collective entity linking in web text: A graph-based method. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 765--774.
[14]
Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Furstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. 2011. Robust disambiguation of named entities in text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. ACL, 782--792.
[15]
Johannes Hoffart, Stephan Seufert, Dat Ba Nguyen, Martin Theobald, and Gerhard Weikum. 2012. KORE: Keyphrase overlap relatedness for entity disambiguation. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management. ACM, 545--554.
[16]
Ben Hachey, Will Radford, Joel Nothman, Matthew Honnibal, and James R. Curran. 2013. Evaluating entity linking with wikipedia. Artif. Intell. 194(2013), 130--150.
[17]
Zhengyan He, Shujie Liu, Yang Song, Mu Li, Ming Zhou, and Houfeng Wang. 2013. Efficient collective entity linking with stacking. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. ACL, 426--435.
[18]
Zhengyan He, Shujie Liu, Mu Li, Ming Zhou, Longkai Zhang, and Houfeng Wang. 2013. Learning entity representation for entity disambiguation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. ACL, 30--34.
[19]
Johannes Hoffart, Yasemin Altun, and Gerhard Weikum. 2014. Discovering emerging entities with ambiguous names. In Proceedings of the 23rd International Conference on World Wide Web. ACM, 385--396.
[20]
Hongzhao Huang, Yunbo Cao, Xiaojiang Huang, Heng Ji, and Chin-Yew Lin. 2014. Collective tweet Wikification based on semi-supervised graph regularization. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. ACL, 380--390.
[21]
Heng Ji, Joel Nothman, and Ben Hachey. 2014. Overview of TAC-KBP 2014 entity discovery and linking tasks. In Proceedings of the Text Analysis Conference. 1--15.
[22]
Henry M. Kim and Markus Biehl. 2005. Exploiting the small-worlds of the semantic web to connect heterogeneous, local ontologies. Inf. Technol. Manag. 6, 1(2005), 89--96.
[23]
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, and Soumen Chakrabarti. 2009. Collective annotation of wikipedia entities in web text. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 457--465.
[24]
Amy N. Langville and Carl D. Meyer. 2003. Deeper inside pagerank. Internet Math. 1, 3(2003), 335--380.
[25]
Amy N. Langville and Carl D. Meyer. 2006. Updating markov chains with an eye on google's pagerank. SIAM J. Matrix Anal. Appl. 27, 4(2006), 968--987.
[26]
Yuhua Li, David McLean, Zuhair A. Bandar, James D. OShea, and Keeley Crockett. 2006. Sentence similarity based on semantic nets and corpus statistics. IEEE Trans. Knowl. Data Eng. 18, 8(2006), 1138--1150.
[27]
Fabrizio Lamberti, Andrea Sanna, and Claudio Demartini. 2009. A relation-based pagerank algorithm for semantic web search engines. IEEE Trans. Knowl. Data Eng. 21, 1(2009), 123--136.
[28]
Edgar Meij, Krisztian Balog, and Daan Odijk. 2013. Entity linking and retrieval. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1127--1127.
[29]
Galileo Mark Namata, Stanley Kok, and Lise Getoor. 2011. Collective graph identification. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 87--95.
[30]
Axel-Cyrille Ngonga Ngomo and Sören Auer. 2011. LIMES: A time-efficient approach for large-scale link discovery on the web of data. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence. AAAI, 2312--2317.
[31]
Johannes Hoffart, Martin Theobald, and Gerhard Weikum. 2014. AIDA-light: High-throughput named-entity disambiguation. In Proceedings of the Workshop on Linked Data on the Web Co-Located with the 23rd International World Wide Web Conference. Vol. 1184.
[32]
Eric Sven Ristad and Peter N. Yianilos. 1998. Learning String-Edit Distance. IEEE Trans. Pattern Anal. Mach. Intell. 20, 5(1998), 522--532.
[33]
Lev Ratinov, Dan Roth, Doug Downey, and Mike Anderson. 2011. Local and global algorithms for disambiguation to wikipedia. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1374--1384.
[34]
Prithviraj Sen. 2012. Collective context-aware topic models for entity disambiguation. In Proceedings of the 21st International Conference on World Wide Web. ACM, 729--738.
[35]
Wei Shen, Jiawei Han, and Jianyong Wang. 2014. A probabilistic model for linking named entities in web text with heterogeneous information networks. In Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. ACM, 1199--1210.
[36]
Wei Shen, Jianyong Wang, and Jiawei Han. 2015. Entity linking with a knowledge base: issues, techniques, and solutions. IEEE Trans. Knowl. Data Eng. 27, 2(2015), 443--460.
[37]
Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo, Michael Röder, Daniel Gerber, Sandro Athaide Coelho, Sören Auer, and Andreas Both. 2014. AGDISTIS-agnostic disambiguation of named entities using linked open data. In Proceedings of the 21st European Conference on Artificial Intelligence. AAAI, 1113--1114.
[38]
Zhicheng Zheng, Fangtao Li, Minlie Huang, and Xiaoyan Zhu. 2010. Learning to link entities with knowledge base. In Proceedings of the 2010 Annual Conference of the North American Chapter of the ACL. ACL, Los Angeles, 483--491.
[39]
Zhe Zuo, Gjergji Kasneci, Toni Grütze, and Felix Naumann. 2014. BEL: Bagging for entity linking. In Proceedings of 25th International Conference on Computational Linguistics. 2075--2086.

Cited By

View all

Index Terms

  1. DBpedia-Based Entity Linking via Greedy Search and Adjusted Monte Carlo Random Walk

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Information Systems
      ACM Transactions on Information Systems  Volume 36, Issue 2
      April 2018
      371 pages
      ISSN:1046-8188
      EISSN:1558-2868
      DOI:10.1145/3133943
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 31 August 2017
      Accepted: 01 April 2017
      Revised: 01 February 2017
      Received: 01 June 2016
      Published in TOIS Volume 36, Issue 2

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. Collective entity linking
      2. adjusted monte carlo random walk
      3. greedy search
      4. multi-core
      5. pagerank

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • Doctoral Program of Higher Education
      • China Postdoctoral Science Foundation
      • CCF-Tencent Open Fund

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)14
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 22 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)XLORE 3: A Large-Scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge ResourcesACM Transactions on Information Systems10.1145/366052142:6(1-47)Online publication date: 19-Aug-2024
      • (2023)IntroductionE-Commerce Big Data Mining and Analytics10.1007/978-981-99-3588-8_1(1-18)Online publication date: 30-Jul-2023
      • (2022)SurfaceVoronoiACM Transactions on Graphics10.1145/3550454.355545341:6(1-12)Online publication date: 1-Dec-2022
      • (2021)Similarity Analysis of Knowledge Graph-based Company Embedding for Stocks Portfolio2021 IEEE 6th International Conference on Smart Cloud (SmartCloud)10.1109/SmartCloud52277.2021.00022(84-89)Online publication date: Nov-2021
      • (2021)A coarse-to-fine collective entity linking method for heterogeneous information networks▪Knowledge-Based Systems10.1016/j.knosys.2021.107286228:COnline publication date: 23-Aug-2021
      • (2020)Complementary dynamicsACM Transactions on Graphics10.1145/3414685.341781939:6(1-11)Online publication date: 27-Nov-2020
      • (2020)Towards Question-based High-recall Information RetrievalACM Transactions on Information Systems10.1145/338864038:3(1-35)Online publication date: 18-May-2020
      • (2020) CoGCN : Combining co‐attention with graph convolutional network for entity linking with knowledge graphs Expert Systems10.1111/exsy.1260638:1Online publication date: 11-Aug-2020
      • (2020)Named Entity Extraction for Knowledge Graphs: A Literature OverviewIEEE Access10.1109/ACCESS.2020.29739288(32862-32881)Online publication date: 2020
      • (2020)Enriching Context Information for Entity Linking with Web DataJournal of Computer Science and Technology10.1007/s11390-020-0280-135:4(724-738)Online publication date: 27-Jul-2020
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media