Syntax-Based Post-Ordering for Efficient Japanese-to-English Translation

Published: 01 August 2013

Abstract

This article proposes a novel reordering method for efficient two-step Japanese-to-English statistical machine translation (SMT). The method isolates reordering from lexical translation and solves it afterwards. This reordering problem, called post-ordering, is treated as an SMT problem from Head-Final English (HFE) to English. HFE is English reordered by syntax-based rules into head-final word order, and it has been used very successfully for reordering in English-to-Japanese SMT. The proposed method carries that advantage over to the reverse direction, Japanese-to-English, and solves the post-ordering problem with accurate syntax-based SMT that uses target-language syntax. Syntax-based SMT is accurate but slow; two-step SMT with the proposed post-ordering empirically reduces its decoding time by approximating it well through the intermediate HFE. In Japanese-to-English patent translation experiments, the proposed method makes decoding about six times faster than syntax-based SMT with comparable translation accuracy.
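
The relationship between English and HFE word order can be illustrated with a toy sketch. This is not the article's system: it assumes a hand-built binary tree in which each node records whether its left child is the head, and it stores the swap decisions that a real post-ordering model would have to predict (the article solves that step with syntax-based SMT). The lexical translation step from Japanese to HFE is not modeled, real HFE involves more than child swapping, and the names `Node`, `head_finalize`, and `post_order` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"
    head_is_left: bool = False  # English tree: is the left child the head?
    swapped: bool = False       # HFE tree: were the children swapped?


Tree = Union[str, Node]


def words(tree: Tree) -> List[str]:
    """Read off the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return words(tree.left) + words(tree.right)


def head_finalize(tree: Tree) -> Tree:
    """English order -> head-final order: move every head child to the right."""
    if isinstance(tree, str):
        return tree
    left, right = head_finalize(tree.left), head_finalize(tree.right)
    if tree.head_is_left:
        return Node(right, left, swapped=True)
    return Node(left, right)


def post_order(tree: Tree) -> Tree:
    """Head-final order -> English order: undo the recorded swaps.

    Assumption: the swap flags are given. In a real system they are unknown
    and must be predicted from the HFE word sequence.
    """
    if isinstance(tree, str):
        return tree
    left, right = post_order(tree.left), post_order(tree.right)
    if tree.swapped:
        return Node(right, left, head_is_left=True)
    return Node(left, right)


# "John hit the ball": the verb heads the verb phrase and sits on its left.
english = Node("John", Node("hit", Node("the", "ball"), head_is_left=True))
hfe = head_finalize(english)

print(" ".join(words(hfe)))              # head-final order, verb last
print(" ".join(words(post_order(hfe))))  # English order restored
```

In this toy the head-final sequence places the verb last, as in Japanese, which is why the first step of the two-step approach can translate words with little reordering and leave the word-order change to the second step.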

    Published In

    ACM Transactions on Asian Language Information Processing  Volume 12, Issue 3
    August 2013
    76 pages
    ISSN:1530-0226
    EISSN:1558-3430
    DOI:10.1145/2499955
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 August 2013
    Accepted: 01 December 2012
    Revised: 01 November 2012
    Received: 01 February 2012
    Published in TALIP Volume 12, Issue 3

    Author Tags

    1. Japanese-to-English translation
    2. long-distance reordering
    3. post-ordering
    4. statistical machine translation

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Cited By

    • (2023) How Good are Transformers in Reordering? Multi-disciplinary Trends in Artificial Intelligence, 60-67. DOI: 10.1007/978-3-031-36402-0_5. Online publication date: 24-Jun-2023
    • (2018) A neural reordering model based on phrasal dependency tree for statistical machine translation. Intelligent Data Analysis 22:5, 1163-1183. DOI: 10.3233/IDA-173582. Online publication date: 26-Sep-2018
    • (2018) A preordering model based on phrasal dependency tree. Digital Scholarship in the Humanities 33:4, 748-765. DOI: 10.1093/llc/fqy009. Online publication date: 18-May-2018
    • (2016) A survey of word reordering in statistical machine translation. Computational Linguistics 42:2, 163-205. DOI: 10.1162/COLI_a_00245. Online publication date: 1-Jun-2016
    • (2016) Inter-, Intra-, and Extra-Chunk Pre-Ordering for Statistical Japanese-to-English Machine Translation. ACM Transactions on Asian and Low-Resource Language Information Processing 15:3, 1-28. DOI: 10.1145/2818381. Online publication date: 9-Jan-2016
    • (2015) Improving Statistical Machine Translation using Syntax-based Learning-to-Rank System. Digital Scholarship in the Humanities (fqv032). DOI: 10.1093/llc/fqv032. Online publication date: 12-Aug-2015
