An Example of a Compatible NLP Toolkit

  • Conference paper
Human Language Technology. Challenges for Computer Science and Linguistics (LTC 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9561)

Abstract

The paper describes an open-source set of linguistic tools, whose distinctive features are its customisability and compatibility with other NLP toolkits: texts in various natural languages and character encodings may be read from a number of popular data formats; all annotation tools may be run with several options to differentiate the format of input and output; rule lists used by individual tools may be supplemented or replaced by the user; external tools (including NLP tools designed in independent research centres) may be incorporated into the toolkit’s environment.

Notes

  1. http://nlp.stanford.edu.

  2. http://uima.apache.org.

  3. http://gate.ac.uk/.

  4. http://www.apertium.org.

  5. The complete list of annotators can be found at http://psi-toolkit.amu.edu.pl/help/documentation.html.

  6. http://utt.amu.edu.pl/files/utt.html.

  7. http://psi-toolkit.wmi.amu.edu.pl/help/psi-format.html.

  8. Tags that begin with an exclamation mark are so-called plane tags [6].

  9. http://aspell.net/.

  10. http://www.gala-global.org/oscarStandards/srx/srx20.html.

  11. http://sourceforge.net/projects/morfologik.

  12. https://en.wikipedia.org/wiki/Outline_of_natural_language_processing, accessed 8 August 2015.

References

  1. Bański, P., Przepiórkowski, A.: The TEI and the NCP: the model and its application. In: LREC2010 Workshop on Language Resources: From Storyboard to Sustainability and LR Lifecycle Management. ELRA, Valletta (2010)

  2. Bird, S., Klein, E., Loper, E.: Natural Language Processing with Python, 1st edn. O’Reilly Media Inc., Sebastopol (2009)

  3. Forcada, M.L., Ginestí-Rosell, M., Nordfalk, J., O’Regan, J., Ortiz-Rojas, S., Pérez-Ortiz, J.A., Sánchez-Martínez, F., Ramírez-Sánchez, G., Tyers, F.M.: Apertium: a free/open-source platform for rule-based machine translation. Mach. Transl. 25(2), 127–144 (2011)

  4. Graliński, F.: Some methods of describing discontinuity in Polish and their cost-effectiveness. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2006. LNCS (LNAI), vol. 4188, pp. 69–77. Springer, Heidelberg (2006)

  5. Graliński, F.: Formalizacja nieciągłości zdań przy zastosowaniu rozszerzonej gramatyki bezkontekstowej. Ph.D. thesis, Adam Mickiewicz University in Poznań, The Faculty of Mathematics and Computer Science, Poznań, supervisor: Zygmunt Vetulani (2007)

  6. Graliński, F., Jassem, K., Junczys-Dowmunt, M.: PSI-Toolkit: Natural language processing pipeline. Comput. Linguist. Appl. 458, 27–39 (2012)

  7. Junczys-Dowmunt, M.: It’s all about the trees – towards a hybrid syntax-based MT system. In: 4th International Multiconference on Computer Science and Information Technology, Mrągowo, Poland, pp. 219–226 (2009)

  8. Koehn, P.: Europarl: a parallel corpus for statistical machine translation. In: Conference Proceedings: The Tenth Machine Translation Summit, vol. 5, pp. 79–86 (2005)

  9. Manicki, L.: Płytki parser języka polskiego (eng: A shallow parser for Polish) (2009). supervisor: Krzysztof Jassem

  10. Obrębski, T., Stolarski, M.: UAM text tools – a text processing toolkit for Polish. In: Proceedings of 2nd Language and Technology Conference, pp. 301–304 (2005)

  11. Przepiórkowski, A., Bańko, M., Górski, R., Lewandowska-Tomaszczyk, B. (eds.): Narodowy Korpus Języka Polskiego. Wydawnictwo Naukowe PWN, Warsaw (2012)

  12. Przepiórkowski, A., Bański, P.: XML text interchange format in the national corpus of Polish. In: Proceedings of Practical Applications in Language and Computers PALC, pp. 55–65 (2009)

  13. Sleator, D.D., Temperley, D.: Parsing English with a link grammar. Technical report, Carnegie Mellon University Computer Science Technical report CMU-CS-91-196 (1995)

  14. Verspoor, K., Baumgartner Jr., W., Roeder, C., Hunter, L.: Abstracting the types away from a UIMA type system. In: From Form to Meaning: Processing Texts Automatically, pp. 249–256 (2009)

Author information

Corresponding authors

Correspondence to Krzysztof Jassem or Roman Grundkiewicz.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Jassem, K., Grundkiewicz, R. (2016). An Example of a Compatible NLP Toolkit. In: Vetulani, Z., Uszkoreit, H., Kubis, M. (eds) Human Language Technology. Challenges for Computer Science and Linguistics. LTC 2013. Lecture Notes in Computer Science, vol 9561. Springer, Cham. https://doi.org/10.1007/978-3-319-43808-5_16

  • DOI: https://doi.org/10.1007/978-3-319-43808-5_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43807-8

  • Online ISBN: 978-3-319-43808-5

  • eBook Packages: Computer Science (R0)
