DOI: 10.3115/1073445.1073478

Feature-rich part-of-speech tagging with a cyclic dependency network

Published: 27 May 2003

Abstract

We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
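
The closing figure is easier to interpret with a quick calculation. The sketch below assumes the 4.4% is a relative reduction in error rate (the usual convention for such claims, though the abstract does not spell it out) and works backwards to the baseline accuracy that the two quoted numbers imply. The function names are illustrative and do not come from the paper.

```python
def implied_baseline_accuracy(new_accuracy: float, reduction: float) -> float:
    """Baseline accuracy (percent) implied by a relative error reduction.

    Assumes: new_error = baseline_error * (1 - reduction).
    """
    new_error = 100.0 - new_accuracy
    baseline_error = new_error / (1.0 - reduction)
    return 100.0 - baseline_error


def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Fraction of the baseline error that the new system removes."""
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Figures quoted in the abstract: 97.24% accuracy, 4.4% error reduction.
baseline = implied_baseline_accuracy(97.24, 0.044)
print(f"implied baseline accuracy: {baseline:.2f}%")  # implied baseline accuracy: 97.11%
print(f"round-trip reduction: {relative_error_reduction(baseline, 97.24):.3f}")
```

Under that reading, the error rate falls from roughly 2.89% to 2.76%, which is an absolute gain of about 0.13 percentage points.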


Published In

NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
May 2003
293 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 21 of 29 submissions, 72%
