DOI: 10.3115/1075096.1075134
Article

Self-organizing Markov models and their application to part-of-speech tagging

Published: 07 July 2003

Abstract

This paper presents a method for developing a class of variable memory Markov models that have a higher memory capacity than traditional (uniform memory) Markov models. The structure of the variable memory models is induced from a manually annotated corpus by a decision tree learning algorithm. A series of comparative experiments shows that the resulting models outperform uniform memory Markov models in a part-of-speech tagging task.
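
The abstract does not spell out how a variable memory model chooses its context, so the following is only a minimal sketch of the general idea, not the paper's method. A uniform memory model always conditions on a fixed number of previous tags, whereas a variable memory model conditions on histories of different lengths depending on what was observed. Here the set of contexts is chosen by a simple frequency threshold, which is an assumption made for illustration and stands in for the decision tree induction the paper describes. The function names and the `max_order` and `min_count` parameters are hypothetical.

```python
from collections import Counter, defaultdict


def train_variable_memory(tag_sequences, max_order=2, min_count=2):
    """Collect next-tag counts for contexts of length 0..max_order.

    A context is kept only if it was observed at least min_count times.
    This threshold is an illustrative pruning rule, not the decision tree
    learning used in the paper. The empty context is always kept so that
    lookup can fall back to it.
    """
    counts = defaultdict(Counter)
    for tags in tag_sequences:
        for i, tag in enumerate(tags):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(tags[i - k:i])
                counts[context][tag] += 1
    return {
        ctx: c for ctx, c in counts.items()
        if len(ctx) == 0 or sum(c.values()) >= min_count
    }


def next_tag_distribution(model, history):
    """Condition on the longest stored suffix of the tag history.

    Returns the context that was actually used together with the
    relative-frequency distribution over the next tag.
    """
    history = tuple(history)
    for k in range(len(history), -1, -1):
        context = history[len(history) - k:]
        if context in model:
            c = model[context]
            total = sum(c.values())
            return context, {t: n / total for t, n in c.items()}
    raise KeyError("model has no contexts; was it trained on an empty corpus?")
```

With a toy corpus such as `[["DT", "NN", "VB"], ["DT", "NN", "VB"], ["DT", "JJ", "NN"]]`, the history `["DT", "NN"]` is frequent enough to be used in full, while the rarer history `["DT", "JJ"]` falls back to the empty context. The effective memory length therefore varies with the history.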

Cited By

  • (2006) Self-organizing n-gram model for automatic word spacing. Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pp. 633-640. DOI: 10.3115/1220175.1220255. Online publication date: 17-Jul-2006
  • (2004) Word folding. Proceedings of the First International Joint Conference on Natural Language Processing, pp. 406-415. DOI: 10.1007/978-3-540-30211-7_43. Online publication date: 22-Mar-2004

Published In

ACL '03: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
July 2003
571 pages

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
