Abstract
In this paper, we propose a new approach for acquiring translation templates automatically from unannotated bilingual spoken language corpora. Two basic algorithms are adopted: a grammar induction algorithm, and an alignment algorithm using Bracketing Transduction Grammar. The approach is unsupervised, statistical, data-driven, and employs no parsing procedure. The acquisition procedure consists of two steps. First, semantic groups and phrase structure groups are extracted from both the source language and the target language through a boosting procedure, in which a synonym dictionary is used to generate the seed groups of the semantic groups. Second, an alignment algorithm based on Bracketing Transduction Grammar aligns the phrase structure groups. The aligned phrase structure groups are post-processed, yielding translation templates. Preliminary experimental results show that the algorithm is effective.
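The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The helper names (`label_with_groups`, `btg_align`), the toy groups, and the lexical probability table are all hypothetical. The sketch assumes that seed groups are given as a mapping from a group label to a set of words, that terminals align one-to-one (so both sequences must have the same length), and that unseen word pairs receive a small floor probability. The alignment considers only the two combinations a bracketing grammar allows at each split: straight (same order) and inverted (reversed order).

```python
import math
from functools import lru_cache


def label_with_groups(tokens, groups):
    """Replace each token that belongs to a seed group with the group label.

    `groups` maps a label (e.g. "CITY") to a set of words; in the paper the
    seed groups come from a synonym dictionary, here they are toy data.
    """
    word_to_label = {w: label for label, words in groups.items() for w in words}
    return [word_to_label.get(tok, tok) for tok in tokens]


def btg_align(src, tgt, lex, floor=1e-6):
    """Best one-to-one alignment of src and tgt under straight/inverted splits.

    `lex` maps (source_unit, target_unit) to a probability (assumed given).
    Returns (log_score, links), where links are (src_index, tgt_index) pairs.
    """
    src, tgt = tuple(src), tuple(tgt)

    @lru_cache(maxsize=None)
    def best(i, j, k, l):
        # With one-to-one terminals, the two spans must have equal length.
        if j - i != l - k or j - i == 0:
            return (-math.inf, ())
        if j - i == 1:
            p = lex.get((src[i], tgt[k]), floor)
            return (math.log(p), ((i, k),))
        top = (-math.inf, ())
        for n in range(1, j - i):
            # Straight: left source part aligns with left target part.
            a, b = best(i, i + n, k, k + n), best(i + n, j, k + n, l)
            if a[0] + b[0] > top[0]:
                top = (a[0] + b[0], a[1] + b[1])
            # Inverted: left source part aligns with right target part.
            a, b = best(i, i + n, l - n, l), best(i + n, j, k, l - n)
            if a[0] + b[0] > top[0]:
                top = (a[0] + b[0], a[1] + b[1])
        return top

    score, links = best(0, len(src), 0, len(tgt))
    return score, sorted(links)


if __name__ == "__main__":
    groups = {"CITY": {"Beijing", "Shanghai"}}
    print(label_with_groups(["fly", "to", "Beijing"], groups))
    lex = {("A", "a"): 0.9, ("B", "b"): 0.9, ("C", "c"): 0.9}
    print(btg_align(["A", "B", "C"], ["c", "a", "b"], lex))
```

In this toy setting the labeled sequences from both languages would be passed to `btg_align`, and the resulting links between group labels would be the raw material for a template. The exhaustive span search is only practical for short utterances, which is one reason it is shown here as an illustration rather than as a practical aligner.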
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hu, R., Wang, X. (2006). Automatic Spoken Language Translation Template Acquisition Based on Boosting Structure Extraction and Alignment. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_72
DOI: https://doi.org/10.1007/11939993_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)