DOI: 10.3115/1220575.1220633

Part-of-speech tagging using virtual evidence and negative training

Published: 06 October 2005

Abstract

We present a part-of-speech tagger which introduces two new concepts: virtual evidence in the form of an "observed child" node, and negative training data to learn the conditional probabilities for the observed child. Associated with each word is a flexible feature set which can include binary flags, neighboring words, etc. The conditional probability of Tag given Word + Features is implemented using a factored language model with back-off to avoid data sparsity problems. This model remains within the framework of Dynamic Bayesian Networks (DBNs) and is conditionally structured, but resolves the label bias problem inherent in the conditional Markov model (CMM).
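
The back-off idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it uses plain relative-frequency estimates with no discounting and a single fixed back-off path (Word + Feature, then Word, then the tag prior), whereas the paper relies on a factored language model. The class name `BackoffTagModel`, the `min_count` threshold, and the toy feature strings are all hypothetical.

```python
from collections import Counter


class BackoffTagModel:
    """Toy estimate of P(tag | word, feature) with a fixed back-off path.

    Assumption: unsmoothed relative frequencies, backing off whenever the
    conditioning context was seen fewer than `min_count` times.
    """

    def __init__(self, min_count=1):
        self.min_count = min_count
        self.full = Counter()      # counts of (word, feature, tag)
        self.full_ctx = Counter()  # counts of (word, feature)
        self.word = Counter()      # counts of (word, tag)
        self.word_ctx = Counter()  # counts of word
        self.tag = Counter()       # counts of tag
        self.total = 0             # number of training tokens

    def observe(self, word, feature, tag):
        """Record one labeled training token."""
        self.full[(word, feature, tag)] += 1
        self.full_ctx[(word, feature)] += 1
        self.word[(word, tag)] += 1
        self.word_ctx[word] += 1
        self.tag[tag] += 1
        self.total += 1

    def prob(self, tag, word, feature):
        """Use the most specific context that has enough data."""
        if self.full_ctx[(word, feature)] >= self.min_count:
            return self.full[(word, feature, tag)] / self.full_ctx[(word, feature)]
        if self.word_ctx[word] >= self.min_count:
            return self.word[(word, tag)] / self.word_ctx[word]
        return self.tag[tag] / self.total if self.total else 0.0


# Hypothetical toy data: (word, capitalization feature, tag).
model = BackoffTagModel()
for w, f, t in [("run", "cap=0", "VB"), ("run", "cap=0", "NN"),
                ("run", "cap=1", "NN"), ("the", "cap=0", "DT")]:
    model.observe(w, f, t)

p_full = model.prob("VB", "run", "cap=0")   # full context was seen
p_word = model.prob("NN", "run", "cap=2")   # unseen feature: back off to word
p_prior = model.prob("DT", "dog", "cap=0")  # unseen word: back off to tag prior
```

Each level in this sketch is a properly normalized distribution over tags, so switching levels never produces probabilities that sum to more than one. A real back-off scheme would also discount the higher-order estimates and redistribute the reserved mass.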



Published In

HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
October 2005, 1054 pages

Publisher

Association for Computational Linguistics, United States


Acceptance Rates

HLT '05 Paper Acceptance Rate: 127 of 402 submissions, 32%
Overall Acceptance Rate: 240 of 768 submissions, 31%
