research-article

Exploring deterministic constraints: from a constrained English POS tagger to an efficient ILP solution to Chinese word segmentation

Authors:

Mitch Marcus

ACL '12: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Pages 1054 - 1062

Published: 08 July 2012

Abstract

We show for both English POS tagging and Chinese word segmentation that, with a proper representation, a large number of deterministic constraints can be learned from training examples, and that these are useful in constraining probabilistic inference. For tagging, the learned constraints are used directly to constrain Viterbi decoding. For segmentation, character-based tagging constraints can be learned with the same templates. However, they are better applied to a word-based model, so an integer linear programming (ILP) formulation is proposed. For both problems, the corresponding constrained solutions have advantages in both efficiency and accuracy.
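
To make the idea of constraining Viterbi decoding concrete, the following is a minimal sketch, not the authors' implementation. It assumes the learned deterministic constraints have already been reduced to a set of permitted tags per position (`allowed`), and that a hypothetical model supplies log-scores `emit` and `trans`. The function name, the toy tag set, and all scores are illustrative assumptions; every `allowed[i]` is assumed to be either `None` or a non-empty set.

```python
def constrained_viterbi(emit, trans, allowed):
    """Return the best tag sequence that respects per-position constraints.

    emit[i][t]  : log-score of tag t at position i (hypothetical model scores)
    trans[s][t] : log-score of moving from tag s to tag t
    allowed[i]  : set of tags permitted at position i, or None if unconstrained
    """
    n = len(emit)
    tags = list(trans)
    # Constraints prune the candidate tags before any scores are combined.
    cand = [[t for t in tags if allowed[i] is None or t in allowed[i]]
            for i in range(n)]
    best = [{t: emit[0][t] for t in cand[0]}]
    back = [{}]
    for i in range(1, n):
        best.append({})
        back.append({})
        for t in cand[i]:
            # Only surviving tags at the previous position are considered.
            prev = max(cand[i - 1], key=lambda s: best[i - 1][s] + trans[s][t])
            best[i][t] = best[i - 1][prev] + trans[prev][t] + emit[i][t]
            back[i][t] = prev
    # Follow back-pointers from the best final tag.
    path = [max(cand[-1], key=lambda t: best[-1][t])]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy example: three positions, three tags, made-up scores.
tags = ["D", "N", "V"]
trans = {s: {t: 0.0 for t in tags} for s in tags}
trans["D"]["V"] = -2.0  # assumed penalty: determiner followed by verb
emit = [
    {"D": -0.1, "N": -3.0, "V": -3.0},
    {"D": -3.0, "N": -1.0, "V": -0.5},
    {"D": -3.0, "N": -0.7, "V": -0.9},
]

unconstrained = constrained_viterbi(emit, trans, [None, None, None])
constrained = constrained_viterbi(emit, trans, [{"D"}, None, {"V"}])
```

In this sketch a constraint both shrinks the inner loops, since fewer candidate tags are scored at each position, and can change the decoded sequence, as with the final position being forced to `V` in the second call. That is one simple way to see how constraints could affect both efficiency and accuracy.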

Published In

ACL '12: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

July 2012, 1100 pages

General Chair: Haizhou Li (Institute for Infocomm Research)

Program Chairs: Chin-Yew Lin (Microsoft Research Asia), Miles Osborne (University of Edinburgh)

Publisher: Association for Computational Linguistics, United States

Overall Acceptance Rate: 85 of 443 submissions, 19%
