Abstract
Natural language question answering over knowledge graphs has received widespread attention. However, existing methods focus on improving each phase of the question answering pipeline and neglect an inherent defect: not every query intention can be identified and mapped to a correct SPARQL statement. In contrast, keyword search relies on the links among multiple keywords, regardless of the exact logical relations in the question. We therefore propose a framework, NLQSK (abbreviated from the title of this paper), that introduces keyword search into natural language question answering to compensate for this defect. First, we translate a natural language question into top-k SPARQL statements using existing methods. Second, we transform the valuable information that cannot be identified and mapped into keywords, and then retrieve the neighboring information in the knowledge graph through a keyword index. Third, we combine the SPARQL block (i.e., a SPARQL statement and its result) with keyword search to produce the answer to the natural language question. Finally, experiments on the benchmark dataset confirm that keyword search can compensate for the defect of natural language question answering and that NLQSK can answer more questions than existing state-of-the-art question answering systems.
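The second and third steps can be illustrated with a toy example. The sketch below is not the NLQSK implementation; it only assumes that a keyword index maps terms to the triples that mention them, and that candidate SPARQL blocks can be re-ranked by how well their results connect to the keywords left unmapped. The triples, the function names (`build_keyword_index`, `keyword_neighbors`, `combine`), the scoring rule, and the placeholder query strings are all hypothetical.

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples.
# All entities and predicates here are illustrative assumptions.
TRIPLES = [
    ("Berlin", "capitalOf", "Germany"),
    ("Berlin", "population", "3645000"),
    ("Paris", "capitalOf", "France"),
    ("Germany", "currency", "Euro"),
]


def build_keyword_index(triples):
    """Map each lowercased term to the set of triples that mention it."""
    index = defaultdict(set)
    for triple in triples:
        for term in triple:
            index[term.lower()].add(triple)
    return index


def keyword_neighbors(index, keyword):
    """Return the triples that mention a keyword (its neighborhood in the graph)."""
    return index.get(keyword.lower(), set())


def combine(sparql_blocks, index, leftover_keywords):
    """Re-rank candidate SPARQL blocks by keyword support (assumed scoring rule).

    sparql_blocks: list of (query_text, result_entities) pairs standing in for
    the top-k SPARQL statements and their results.
    leftover_keywords: terms from the question that no statement captured.
    """
    scored = []
    for query, results in sparql_blocks:
        support = 0
        for entity in results:
            neighborhood = keyword_neighbors(index, entity)
            for kw in leftover_keywords:
                # A keyword supports a result if some triple touching the
                # result entity also mentions the keyword.
                if any(kw.lower() in (t.lower() for t in triple)
                       for triple in neighborhood):
                    support += 1
        scored.append((support, query, results))
    # Highest support first; the sort is stable, so ties keep the top-k order.
    scored.sort(key=lambda item: -item[0])
    return scored


index = build_keyword_index(TRIPLES)
# Placeholder query strings; a real system would hold executable SPARQL.
blocks = [
    ("SELECT ?x WHERE { ?x capitalOf France }", ["Paris"]),
    ("SELECT ?x WHERE { ?x capitalOf Germany }", ["Berlin"]),
]
ranked = combine(blocks, index, ["population"])
```

In this toy setting the second candidate moves to the top because only its result entity has a triple mentioning the unmapped keyword `population`. A real system would need a more careful scoring function and an index over a full RDF store.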
Acknowledgements
This work was supported by the Youth Project of the Science and Technology Research Program of Chongqing Education Commission of China (No. KJQN201901414 and No. KJQN201901408), the Startup Foundation for Introducing Talent of Yangtze Normal University (No. 0107/011160052), the PhD Candidate Talent Development Project (No. BYJS201908), the Project of Chongqing Natural Science Foundation (No. cstc2019jcyj-msxmX0683 and No. cstc2019jcyj-msxm1579), the Major Project of the Science and Technology Research Program of Chongqing Education Commission of China (No. KJZD-M201901401), the National Natural Science Foundation of China (Grant No. 61672102 and No. 61802244), the Program for New Century Excellent Talents in University of the Ministry of Education of China (Grant No. NCET-10-0239), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2019JQ-668).
Cite this article
Hu, X., Duan, J. & Dang, D. Natural language question answering over knowledge graph: the marriage of SPARQL query and keyword search. Knowl Inf Syst 63, 819–844 (2021). https://doi.org/10.1007/s10115-020-01534-4
DOI: https://doi.org/10.1007/s10115-020-01534-4