DOI: 10.5555/1614108.1614135

Tagging Icelandic text using a linguistic and a statistical tagger

Published: 22 April 2007

Abstract

We describe our linguistic rule-based tagger, IceTagger, and compare its tagging accuracy with that of TnT, a state-of-the-art statistical tagger, on Icelandic, a morphologically complex language. In our evaluation, IceTagger obtains an average tagging accuracy of 91.54% and TnT obtains 90.44%. When tag profile gaps in the lexicon used by TnT are filled with tags produced by our morphological analyser, IceMorphy, TnT's tagging accuracy increases to 91.18%.
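
The gap-filling step mentioned in the abstract can be pictured with a minimal sketch. It assumes that a tag profile is the set of tags a lexicon lists for a word, and that a gap is a tag missing from that set. The function `fill_tag_profile_gaps`, the stand-in analyser `toy_analyser`, and the words and tags are all hypothetical placeholders for illustration; they are not the IceMorphy or TnT interfaces, and the paper's actual procedure may differ.

```python
from typing import Callable, Dict, Set

# A lexicon maps each word form to its tag profile (the set of possible tags).
Lexicon = Dict[str, Set[str]]


def fill_tag_profile_gaps(
    lexicon: Lexicon,
    propose_tags: Callable[[str], Set[str]],
) -> Lexicon:
    """Return a new lexicon whose tag profiles also contain the proposed tags.

    `propose_tags` stands in for a morphological analyser (hypothetical
    interface): given a word, it returns tags the word could carry.
    The input lexicon is left unmodified.
    """
    filled: Lexicon = {}
    for word, tags in lexicon.items():
        # Union keeps every original tag and adds any that were missing.
        filled[word] = set(tags) | set(propose_tags(word))
    return filled


def toy_analyser(word: str) -> Set[str]:
    """Hypothetical analyser: suggests a fixed tag set for one known word."""
    suggestions = {"word_a": {"TAG_1", "TAG_2"}}
    return suggestions.get(word, set())


# Placeholder data: "word_a" is missing TAG_2 in the original lexicon.
lexicon: Lexicon = {"word_a": {"TAG_1"}, "word_b": {"TAG_3"}}
filled = fill_tag_profile_gaps(lexicon, toy_analyser)
```

In this sketch the statistical tagger would then be run with `filled` in place of `lexicon`, so that a tag previously absent from a word's profile becomes a candidate during disambiguation.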

Published In

NAACL-Short '07: Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
April 2007
228 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 21 of 29 submissions, 72%
