Abstract
The weighted alignment hypergraph of Liu et al. [4] is potentially useful for statistical machine translation because it is the first approach to exploit both a compact representation and a fertility model of word alignment simultaneously. Because estimating the probabilities of rules extracted from hypergraphs is an NP-complete problem, Liu et al. propose a divide-and-conquer strategy that decomposes a hypergraph into a set of independent subhypergraphs. However, they use Bull's algorithm to enumerate all consistent alignments for each rule in each subhypergraph, which is very time-consuming, especially for rules that contain non-terminals. This limits the applicability of the method to syntax-based translation models, whose rules contain many non-terminals (e.g., SCFG rules). To address this problem, we propose an inside-outside algorithm that enumerates the consistent alignments efficiently. Experimental results show that our method is twice as fast as Bull's algorithm. In addition, the efficient dynamic programming algorithm makes our approach applicable to syntax-based translation models.
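The decomposition step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a hypergraph is given as a list of hyperedges (tuples of word identifiers) and that two hyperedges fall into the same subhypergraph whenever they share a word. The function name `split_into_subhypergraphs` and the data layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def split_into_subhypergraphs(hyperedges):
    """Group non-empty hyperedges into components that share no vertex.

    Assumption: each hyperedge is a non-empty tuple of hashable vertex ids.
    Uses a union-find structure over the vertices.
    """
    parent = {}

    def find(v):
        # Register unseen vertices as their own root, then walk to the root.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Every vertex inside one hyperedge belongs to the same component.
    for edge in hyperedges:
        first = edge[0]
        find(first)
        for v in edge[1:]:
            union(first, v)

    # Collect hyperedges by the root of their first vertex.
    groups = defaultdict(list)
    for edge in hyperedges:
        groups[find(edge[0])].append(edge)
    return list(groups.values())


# Hypothetical toy input: source words s1..s3, target words t1..t4.
edges = [("s1", "t1"), ("s1", "t2"), ("s2", "t3"), ("s3", "t3", "t4")]
components = split_into_subhypergraphs(edges)
```

In this toy input the first two hyperedges share `s1` and the last two share `t3`, so the sketch yields two subhypergraphs that can be processed separately. Any enumeration of alignments then only has to consider the hyperedges inside one component at a time.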
References
Brown, P.F., Pietra, S.A.D., Pietra, V.J.D., Mercer, R.L.: The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics 19(2), 263–311 (1993)
Chiang, D.: Hierarchical phrase-based translation. Computational Linguistics 33(2), 201–228 (2007)
Koehn, P., Och, F.J., Marcu, D.: Statistical phrase-based translation. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, vol. 1, pp. 48–54. Association for Computational Linguistics (2003)
Liu, Q., Tu, Z., Lin, S.: A Novel Graph-based Compact Representation of Word Alignment. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (2013)
Liu, Y., Xia, T., Xiao, X., Liu, Q.: Weighted alignment matrices for statistical machine translation. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 1017–1026. Association for Computational Linguistics, Singapore (2009)
Moore, R.C.: A discriminative framework for bilingual word alignment. In: Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pp. 81–88. Association for Computational Linguistics, Vancouver (2005)
Tu, Z., Jiang, W., Liu, Q., Lin, S.: Dependency Forest for Sentiment Analysis. In: Zhou, M., Zhou, G., Zhao, D., Liu, Q., Zou, L. (eds.) NLPCC 2012. CCIS, vol. 333, pp. 69–77. Springer, Heidelberg (2012)
Tu, Z., Liu, Y., He, Y., van Genabith, J., Liu, Q., Lin, S.: Combining Multiple Alignments to Improve Machine Translation. In: Proceedings of the 24th International Conference on Computational Linguistics (2012)
Tu, Z., Liu, Y., Hwang, Y.-S., Liu, Q., Lin, S.: Dependency forest for statistical machine translation. In: Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pp. 1092–1100. International Committee on Computational Linguistics, Beijing (2010)
Tu, Z., Liu, Y., Liu, Q., Lin, S.: Extracting Hierarchical Rules from a Weighted Alignment Matrix. In: Proceedings of 5th International Joint Conference on Natural Language Processing, pp. 1294–1303. Asian Federation of Natural Language Processing, Chiang Mai (2011)
Venugopal, A., Zollmann, A., Smith, N.A., Vogel, S.: Wider pipelines: n-best alignments and parses in MT training. In: Proceedings of AMTA, Honolulu, Hawaii (2008)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tu, Z., Xie, J., Lv, Y., Liu, Q. (2013). A Simple, Fast Strategy for Weighted Alignment Hypergraph. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2013. Communications in Computer and Information Science, vol 400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41644-6_18
DOI: https://doi.org/10.1007/978-3-642-41644-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41643-9
Online ISBN: 978-3-642-41644-6
eBook Packages: Computer Science (R0)