Abstract
The paper describes an open-source set of linguistic tools whose distinctive features are customisability and compatibility with other NLP toolkits: texts in various natural languages and character encodings may be read from a number of popular data formats; all annotation tools may be run with several options that vary the format of input and output; rule lists used by individual tools may be supplemented or replaced by the user; external tools (including NLP tools designed in independent research centres) may be incorporated into the toolkit’s environment.
Notes
- 5. The complete list of annotators can be found at http://psi-toolkit.amu.edu.pl/help/documentation.html.
- 8. Tags that begin with an exclamation mark are so-called plane tags [6].
- 12. https://en.wikipedia.org/wiki/Outline_of_natural_language_processing, accessed 8 August 2015.
References
Bański, P., Przepiórkowski, A.: The TEI and the NCP: the model and its application. In: LREC2010 Workshop on Language Resources: From Storyboard to Sustainability and LR Lifecycle Management. ELRA, Valletta (2010)
Bird, S., Klein, E., Loper, E.: Natural Language Processing with Python, 1st edn. O’Reilly Media Inc., Sebastopol (2009)
Forcada, M.L., Ginestí-Rosell, M., Nordfalk, J., O’Regan, J., Ortiz-Rojas, S., Pérez-Ortiz, J.A., Sánchez-Martínez, F., Ramírez-Sánchez, G., Tyers, F.M.: Apertium: a free/open-source platform for rule-based machine translation. Mach. Transl. 25(2), 127–144 (2011)
Graliński, F.: Some methods of describing discontinuity in Polish and their cost-effectiveness. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2006. LNCS (LNAI), vol. 4188, pp. 69–77. Springer, Heidelberg (2006)
Graliński, F.: Formalizacja nieciągłości zdań przy zastosowaniu rozszerzonej gramatyki bezkontekstowej. Ph.D. thesis, Adam Mickiewicz University in Poznań, The Faculty of Mathematics and Computer Science, Poznań, supervisor: Zygmunt Vetulani (2007)
Graliński, F., Jassem, K., Junczys-Dowmunt, M.: PSI-Toolkit: Natural language processing pipeline. Comput. Linguist. Appl. 458, 27–39 (2012)
Junczys-Dowmunt, M.: It’s all about the trees – towards a hybrid syntax-based MT system. In: 4th International Multiconference on Computer Science and Information Technology, Mrągowo, Poland, pp. 219–226 (2009)
Koehn, P.: Europarl: a parallel corpus for statistical machine translation. In: Conference Proceedings: The Tenth Machine Translation Summit, vol. 5, pp. 79–86 (2005)
Manicki, L.: Płytki parser języka polskiego (A shallow parser for Polish). Supervisor: Krzysztof Jassem (2009)
Obrębski, T., Stolarski, M.: UAM text tools – a text processing toolkit for Polish. In: Proceedings of 2nd Language and Technology Conference, pp. 301–304 (2005)
Przepiórkowski, A., Bańko, M., Górski, R., Lewandowska-Tomaszczyk, B. (eds.): Narodowy Korpus Języka Polskiego. Wydawnictwo Naukowe PWN, Warsaw (2012)
Przepiórkowski, A., Bański, P.: XML text interchange format in the national corpus of Polish. In: Proceedings of Practical Applications in Language and Computers PALC, pp. 55–65 (2009)
Sleator, D.D., Temperley, D.: Parsing English with a link grammar. Technical report CMU-CS-91-196, Carnegie Mellon University Computer Science (1995)
Verspoor, K., Baumgartner Jr., W., Roeder, C., Hunter, L.: Abstracting the types away from a UIMA type system. In: From Form to Meaning: Processing Texts Automatically, pp. 249–256 (2009)
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Jassem, K., Grundkiewicz, R. (2016). An Example of a Compatible NLP Toolkit. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_16
DOI: https://doi.org/10.1007/978-3-319-43808-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43807-8
Online ISBN: 978-3-319-43808-5
eBook Packages: Computer Science (R0)