Abstract
This paper presents Manócska, a verb frame database for Hungarian. We call it unified because it was built by merging all available verb frame resources. To merge them, we had to reconcile their structural and conceptual differences. We then transformed the merged data into two easy-to-use formats: a TSV file and an XML file. Manócska is open access: the whole resource and the scripts used to create it are available in a GitHub repository. This makes Manócska reproducible and easy to access, version, fix and develop in the future. Several errors came to light during the merging process, and we corrected them as systematically as possible. Thus, by integrating and harmonizing the resources, we produced a Hungarian verb frame database of higher quality.
Notes
- 1.
The resource and a detailed description of its structure can be found at
- 2.
- 3.
- 4.
Mazsola and Tádé are two puppets from a Hungarian animated puppet film that was popular in the early 1970s. Manócska, the eponym of our database, is also a puppet from this film.
- 5.
- 6.
The rank value is computed by dividing the frame frequency of the given record in each resource by the total frame frequency of that resource, and then summing the resulting quotients.
- 7.
For licensing reasons, the original resources could not be included, but they can be requested from the original copyright holders at the given addresses.
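The rank value described in note 6 can be illustrated with a small sketch. This is only one reading of the note, not the authors' script: it assumes that the denominator is the sum of all frame frequencies within a resource, and that a frame missing from a resource contributes nothing. The function name `rank_values`, the resource names and the frame names are all hypothetical.

```python
from collections import defaultdict


def rank_values(freqs):
    """Sum, over all resources, each frame's relative frequency.

    `freqs` maps a resource name to a dict of {frame: frequency}.
    Assumption: the per-resource total is the sum of all frame
    frequencies recorded in that resource.
    """
    ranks = defaultdict(float)
    for frame_freqs in freqs.values():
        total = sum(frame_freqs.values())
        if total == 0:
            # A resource without frequency data cannot contribute.
            continue
        for frame, freq in frame_freqs.items():
            ranks[frame] += freq / total
    return dict(ranks)


# Hypothetical toy data: two resources with overlapping frames.
toy = {
    "resource_a": {"frame_x": 3, "frame_y": 1},
    "resource_b": {"frame_x": 1, "frame_z": 1},
}
ranks = rank_values(toy)
```

Under these assumptions, a frame attested in several resources accumulates one quotient per resource, so it ranks higher than a frame that is equally frequent in only one of them.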
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kalivoda, Á., Vadász, N., Indig, B. (2018). Manócska: A Unified Verb Frame Database for Hungarian. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science, vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_14
DOI: https://doi.org/10.1007/978-3-030-00794-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00793-5
Online ISBN: 978-3-030-00794-2
eBook Packages: Computer Science (R0)