Automatic Spoken Language Translation Template Acquisition Based on Boosting Structure Extraction and Alignment

Rile Hu & Xia Wang

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

In this paper, we propose a new approach for acquiring translation templates automatically from unannotated bilingual spoken language corpora. Two basic algorithms are adopted: a grammar induction algorithm, and an alignment algorithm using Bracketing Transduction Grammar. The approach is unsupervised, statistical, data-driven, and employs no parsing procedure. The acquisition procedure consists of two steps. First, semantic groups and phrase structure groups are extracted from both the source language and the target language through a boosting procedure, in which a synonym dictionary is used to generate the seed groups of the semantic groups. Second, an alignment algorithm based on Bracketing Transduction Grammar aligns the phrase structure groups. The aligned phrase structure groups are post-processed, yielding translation templates. Preliminary experimental results show that the algorithm is effective.
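The alignment step can be illustrated with a toy example. The sketch below is not the authors' algorithm: it assumes a one-to-one word alignment between two equal-length token sequences and a hypothetical lexicon of word-pair scores, and it searches only the reorderings a Bracketing Transduction Grammar allows, combining adjacent spans either in the same order (straight) or in swapped order (inverted). The names btg_align and lexicon are illustrative.

```python
from functools import lru_cache


def btg_align(src, tgt, lexicon):
    """Best one-to-one alignment of src and tgt reachable by BTG reordering.

    lexicon maps (source_word, target_word) -> score; missing pairs score 0.
    Returns (total_score, links), where links are (src_index, tgt_index) pairs.
    Assumption: both sentences have the same length (no null alignments).
    """
    if len(src) != len(tgt):
        raise ValueError("this toy version needs equal-length sentences")
    if not src:
        return 0.0, []

    @lru_cache(maxsize=None)
    def best(i, j, k):
        # Align src[i:j] with the target span of the same length starting at k.
        n = j - i
        if n == 1:
            # Lexical rule: one source word paired with one target word.
            return float(lexicon.get((src[i], tgt[k]), 0.0)), ((i, k),)
        winner = None
        for m in range(1, n):
            # Straight rule: left source part goes with left target part.
            left, right = best(i, i + m, k), best(i + m, j, k + m)
            straight = (left[0] + right[0], left[1] + right[1])
            # Inverted rule: left source part goes with right target part.
            left, right = best(i, i + m, k + n - m), best(i + m, j, k)
            inverted = (left[0] + right[0], left[1] + right[1])
            for cand in (straight, inverted):
                if winner is None or cand[0] > winner[0]:
                    winner = cand
        return winner

    score, links = best(0, len(src), 0)
    return score, sorted(links)


# Hypothetical data: the target sentence moves the third source word to the front.
src = ["s1", "s2", "s3"]
tgt = ["t3", "t1", "t2"]
lexicon = {("s1", "t1"): 1.0, ("s2", "t2"): 1.0, ("s3", "t3"): 1.0}
score, links = btg_align(src, tgt, lexicon)
print(score, links)
```

In this example the best derivation aligns "s1 s2" with "t1 t2" by the straight rule and then attaches "s3" by the inverted rule. A system like the one the abstract describes would operate on extracted phrase structure groups and statistically estimated scores rather than on a hand-written lexicon.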

Author information

Authors and Affiliations

Nokia Research Center, Nokia House 1, No. 11, He Ping Li Dong Jie, Beijing, 100013
Rile Hu & Xia Wang

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Hu, R., Wang, X. (2006). Automatic Spoken Language Translation Template Acquisition Based on Boosting Structure Extraction and Alignment. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_72

DOI: https://doi.org/10.1007/11939993_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
