Query Generation for Answering Complex Questions in Russian Using a Syntax Parser

Scientific and Technical Information Processing

Abstract

This article presents a system that translates natural language questions into SPARQL queries. The question answering system includes a syntax parser that generates a parse tree of the input sentence; a component that generates a SPARQL query template based on the parse tree; and models that identify the entities and relations to be inserted into the SPARQL query template. Entity extraction and relation ranking are performed using BERT. Training BERT for a Russian-language question answering system is hampered by the scarcity of available training data. To address this issue, we investigate whether multilingual BERT, pretrained on the LC-QuAD 2.0 dataset, can be trained to perform entity extraction and relation ranking using a small number of Russian-language samples from the RuBQ dataset. When tested on the RuBQ dataset, the proposed question answering system achieves higher accuracy than previous approaches.
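The template-filling step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the template inventory, the `Candidate` class, the `fill_template` function, and the scores are all hypothetical, and the Wikidata identifiers are used for illustration only. The sketch assumes that a parse-tree component has already chosen a template name and that BERT-based models have already produced scored entity and relation candidates for each slot.

```python
from dataclasses import dataclass

# Hypothetical template inventory keyed by a question type that a
# parse-tree analysis might select. The article's actual templates are
# not given in the abstract. Doubled braces are literal SPARQL braces.
TEMPLATES = {
    "simple": "SELECT ?answer WHERE {{ wd:{e0} wdt:{r0} ?answer . }}",
    "two_hop": (
        "SELECT ?answer WHERE {{ wd:{e0} wdt:{r0} ?x . "
        "?x wdt:{r1} ?answer . }}"
    ),
}


@dataclass
class Candidate:
    """An entity or relation identifier with a model-assigned score."""
    identifier: str
    score: float


def best(candidates):
    """Return the identifier of the highest-scoring candidate."""
    return max(candidates, key=lambda c: c.score).identifier


def fill_template(template_name, entity_slots, relation_slots):
    """Insert the top-ranked entity and relation of each slot into a template.

    entity_slots and relation_slots are lists with one candidate list
    per slot, in the order the slots appear in the template.
    """
    values = {}
    for i, candidates in enumerate(entity_slots):
        values[f"e{i}"] = best(candidates)
    for i, candidates in enumerate(relation_slots):
        values[f"r{i}"] = best(candidates)
    return TEMPLATES[template_name].format(**values)


if __name__ == "__main__":
    # Illustrative candidates for a question such as
    # "What is the capital of Russia?"; the scores are made up.
    entities = [[Candidate("Q159", 0.9), Candidate("Q649", 0.1)]]
    relations = [[Candidate("P36", 0.8), Candidate("P17", 0.2)]]
    print(fill_template("simple", entities, relations))
```

Keeping the templates separate from the ranked candidates mirrors the division of labour in the abstract: the parse tree determines the shape of the query, while the learned models only decide which identifiers fill its slots.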

CONFLICT OF INTEREST

The author declares that he has no conflicts of interest.

Author information

Corresponding author

Correspondence to D. A. Evseev.

Additional information

Translated by A. Ovchinnikova

About this article

Cite this article

Evseev, D.A. Query Generation for Answering Complex Questions in Russian Using a Syntax Parser. Sci. Tech. Inf. Proc. 49, 310–316 (2022). https://doi.org/10.3103/S0147688222050045

  • DOI: https://doi.org/10.3103/S0147688222050045
