Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models
Figure 1. An example dependency tree for an English source sentence, its translation in Farsi and the word alignments.
Figure 2. BLEU Rank vs. Accuracy Rank for English–Farsi, ρ = 0.85, p-value < 0.01.
Figure 3. BLEU Rank vs. Accuracy Rank for English–Arabic, ρ = −0.75, p-value < 0.01.
Figure 4. BLEU Rank vs. Accuracy Rank for English–Turkish, ρ = 0.8, p-value < 0.01.
Figure 5. TER Rank vs. Accuracy Rank for English–Farsi, ρ = −0.90, p-value < 0.01.
Figure 6. TER Rank vs. Accuracy Rank for English–Arabic, r = 0.75, p-value < 0.01.
Figure 7. TER Rank vs. Accuracy Rank for English–Turkish, r = −0.85, p-value < 0.01.
Figure 8. BLEU Rank vs. Macro-averaged F1 Rank for English–Arabic, r = 0.77, p-value < 0.01.
Figure 9. TER Rank vs. Macro-averaged F1 Rank for English–Arabic, r = −0.8, p-value < 0.01.
Figure 10. The improvement in classification performance vs. the improvement in SMT quality for English–Farsi.
Figure 11. The improvement in classification performance vs. the improvement in SMT quality for English–Arabic.
Figure 12. The improvement in classification performance vs. the improvement in SMT quality for English–Turkish.
Abstract
1. Introduction
2. Related Work
2.1. Discriminative Reordering Models
2.2. Intrinsic vs. Extrinsic Evaluation
3. Discriminative Reordering Models
3.1. Method
3.2. Classifiers
3.3. Integration into HPB-SMT
4. Generating Classifiers with Varying Quality
5. Experiments
6. Results and Analysis
6.1. Relationship between Classification Performance and Translation Quality
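The rank plots in Figures 2–9 report a correlation between each classifier's rank by accuracy (or macro-averaged F1) and its rank by BLEU or TER. The sketch below shows one way such a rank correlation can be computed, assuming it is Spearman's rank correlation, taken here as the Pearson correlation of average ranks. The function names and the scores in the usage lines are made up for illustration and are not results from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks (1 = largest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation between the ranks of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical accuracy (%) and BLEU scores for five classifiers.
accuracy = [70.1, 72.4, 75.0, 78.3, 80.2]
bleu = [10.2, 10.1, 10.6, 10.9, 11.3]
rho = spearman_rho(accuracy, bleu)
```

For TER, where lower is better, a classifier that helps translation would be expected to show a negative correlation with accuracy, which matches the sign pattern in the TER captions for English–Farsi and English–Turkish.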
6.2. The Impact of Classification Improvement on Translation Quality
- When the improvement in classification performance exceeds a certain threshold, SMT quality can be expected to improve as well. The threshold is corpus-specific: 6.4% for En–Fa, 3% for En–Ar and 6.2% for En–Tr.
- The magnitude of the improvement in classification performance is not necessarily proportional to the magnitude of the improvement in SMT quality. That is, a larger improvement in classification performance does not always lead to a larger improvement in SMT quality.
- An improvement of about 0–20% in classification performance leads to an improvement of about 0–3.5% in the BLEU score. Although the BLEU improvement is much smaller than the classification improvement, it is still comparable to the BLEU improvement gained by some recent reordering models (cf. Table 7).
Algorithm 1. Calculating the amount of improvement in classification performance and SMT quality.
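As a rough sketch of the kind of computation Algorithm 1 names, the code below assumes that "improvement" means the relative change, in percent, of the more accurate classifier over the less accurate one, measured for every pair of classifiers. The pairing scheme, the function names and the scores are illustrative assumptions rather than the authors' procedure.

```python
from itertools import combinations


def relative_improvement(old, new):
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("baseline score must be non-zero")
    return (new - old) / old * 100.0


def pairwise_improvements(systems):
    """For each pair, compare the more accurate classifier with the less
    accurate one and return (worse, better, accuracy gain %, BLEU gain %)."""
    rows = []
    for a, b in combinations(sorted(systems), 2):
        worse, better = sorted((a, b), key=lambda name: systems[name]["accuracy"])
        rows.append((
            worse,
            better,
            relative_improvement(systems[worse]["accuracy"], systems[better]["accuracy"]),
            relative_improvement(systems[worse]["bleu"], systems[better]["bleu"]),
        ))
    return rows


# Hypothetical scores; only the classifier names come from the article.
systems = {
    "hd-lex-quarter": {"accuracy": 70.0, "bleu": 10.0},
    "hd-lex-half": {"accuracy": 73.5, "bleu": 9.9},
    "hd-lex": {"accuracy": 77.0, "bleu": 10.3},
}
rows = pairwise_improvements(systems)
```

In this made-up data the pair (hd-lex-quarter, hd-lex-half) gains 5% in accuracy yet loses BLEU, which is the sort of case the threshold observation in Section 6.2 is about: small classification gains need not carry over to translation quality.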
7. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
References
1. Birch, A.; Osborne, M.; Koehn, P. Predicting success in machine translation. In Proceedings of the 8th Conference on Empirical Methods in Natural Language Processing, Honolulu, HI, USA, 25–27 October 2008; pp. 745–754.
2. Zens, R.; Ney, H. Discriminative reordering models for statistical machine translation. In Proceedings of the 2006 Workshop on Statistical Machine Translation, New York, NY, USA, 8–9 June 2006; pp. 55–63.
3. Bisazza, A.; Federico, M. Dynamically shaping the reordering search space of phrase-based statistical machine translation. Trans. Assoc. Comput. Linguist. 2013, 1, 327–340.
4. Chang, P.C.; Tseng, H.; Jurafsky, D.; Manning, C.D. Discriminative Reordering with Chinese Grammatical Relations Features. In Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation, Boulder, CO, USA, 5 June 2009; pp. 51–59.
5. Xiong, D.; Liu, Q.; Lin, S. Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–21 July 2006; pp. 521–528.
6. He, Z.; Meng, Y.; Yu, H. Extending the hierarchical phrase based model with maximum entropy based btg. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA), Denver, CO, USA, 31 October–4 November 2010.
7. Li, J.; Marton, Y.; Resnik, P.; Daumé, H., III. A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, MD, USA, 22–27 June 2014; Association for Computational Linguistics: Baltimore, MD, USA, 2014; Volume 1, pp. 1123–1133.
8. Green, S.; Galley, M.; Manning, C.D. Improved models of distortion cost for statistical machine translation. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Los Angeles, CA, USA, 2–4 June 2010; pp. 867–875.
9. Gao, Y.; Koehn, P.; Birch, A. Soft Dependency Constraints for Reordering in Hierarchical Phrase-based Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, 27–31 July 2011; pp. 857–868.
10. Kazemi, A.; Toral, A.; Way, A.; Monadjemi, A.; Nematbakhsh, M. Dependency-based Reordering Model for Constituent Pairs in Hierarchical SMT. In Proceedings of the 18th Annual Conference of the European Association for Machine Translation, Antalya, Turkey, 11–13 May 2015; pp. 43–50.
11. Kazemi, A.; Toral, A.; Way, A. Using Wordnet to Improve Reordering in Hierarchical Phrase-Based Statistical Machine Translation. In Proceedings of the Eighth Meeting of the Global WordNet Conference, Bucharest, Romania, 8–12 January 2016.
12. Xiong, D.; Zhang, M.; Li, H. Modeling the Translation of Predicate-argument Structure for SMT. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju Island, Korea, 8–14 July 2012; Volume 1, pp. 902–911.
13. Wang, X.; Xiong, D.; Zhang, M.; Hong, Y.; Yao, J. A Topic-Based Reordering Model for Statistical Machine Translation. In Proceedings of the Third CCF Conference, NLPCC 2014, Shenzhen, China, 5–9 December 2014; pp. 414–421.
14. Alrajeh, A.; Niranjan, M. Scalable Reordering Models for SMT based on Multiclass SVM. Prague Bull. Math. Linguist. 2015, 103, 65–84.
15. Kumar, E. Natural Language Processing; I.K. International Pvt. Ltd.: New Delhi, India, 2011.
16. Fraser, A.; Marcu, D. Measuring Word Alignment Quality for Statistical Machine Translation; Technical Report; ISI University of Southern California: Los Angeles, CA, USA, 2006.
17. Fraser, A.; Marcu, D. Measuring Word Alignment Quality for Statistical Machine Translation. Comput. Linguist. 2007, 16, 293–303.
18. Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, PA, USA, 6–13 July 2002; pp. 311–318.
19. Ayan, N.F.; Dorr, B.J. Going Beyond AER: An Extensive Analysis of Word Alignments and Their Impact on MT. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, Sydney, Australia, 17–21 July 2006; pp. 9–16.
20. Davis, P.C.; Xie, Z.; Small, K. All Links are not the Same: Evaluating Word Alignments for Statistical Machine Translation. In Proceedings of the MT Summit XI, Copenhagen, Denmark, 10–14 September 2007; pp. 119–126.
21. Vilar, D.; Popovic, M.; Ney, H. AER: Do we need to “improve” our alignments? In Proceedings of the International Workshop on Spoken Language Translation (IWSLT), Kyoto, Japan, 27–28 November 2006; pp. 205–212.
22. Guzman, F.; Gao, Q.; Vogel, S. Reassessment of the Role of Phrase Extraction in PBSMT. In Proceedings of the Machine Translation Summit XII, Ottawa, ON, Canada, 26–30 August 2009.
23. Tian, L.; Wong, D.F.; Chao, L.S.; Oliveira, F. A Relationship: Word Alignment, Phrase Table, and Translation Quality. Sci. World J. 2014, 2014.
24. Chiang, D. A Hierarchical Phrase-based Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Ann Arbor, MI, USA, 25–30 June 2005; pp. 263–270.
25. Koehn, P. Statistical Machine Translation; Cambridge University Press: Cambridge, UK, 2012.
26. Fellbaum, C. WordNet: An Electronic Lexical Database; MIT Press: Cambridge, MA, USA, 1998.
27. Kazemi, A.; Toral, A.; Way, A.; Monadjemi, A.; Nematbakhsh, M. Syntax- and semantic-based reordering in hierarchical phrase-based statistical machine translation. Expert Syst. Appl. 2017, 84, 186–199.
28. Quirk, C.; Menezes, A.; Cherry, C. Dependency treelet translation: Syntactically informed phrasal SMT. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Ann Arbor, MI, USA, 25–30 June 2005; pp. 271–279.
29. Passban, P.; Way, A.; Liu, Q. Benchmarking SMT Performance for Farsi Using the TEP++ Corpus. In Proceedings of the 18th Annual Conference of the European Association for Machine Translation, Antalya, Turkey, 11–13 May 2015.
30. Oflazer, K. Statistical Machine Translation into a Morphologically Complex Language. In Proceedings of the Conference on Intelligent Text Processing and Computational Linguistics, Haifa, Israel, 17–23 February 2008.
31. News Commentary English–Arabic parallel corpus. Available online: http://www.casmacat.eu/corpus/news-commentary.html (accessed on 29 June 2017).
32. Chen, D.; Manning, C.D. A Fast and Accurate Dependency Parser using Neural Networks. In Proceedings of the Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014.
33. Och, F.J.; Ney, H. A Systematic Comparison of Various Statistical Alignment Models. Comput. Linguist. 2003, 29, 19–51.
34. Manning, C.; Klein, D. Optimization, Maxent Models, and Conditional Estimation without Magic. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Edmonton, AB, Canada, 27 May–1 June 2003; p. 8.
35. Hoang, H.; Koehn, P.; Lopez, A. A unified framework for phrase-based, hierarchical, and syntax-based statistical machine translation. In Proceedings of the International Workshop on Spoken Language Translation, IWSLT, Tokyo, Japan, 1–2 December 2009; pp. 152–159.
36. Cherry, C.; Foster, G. Batch Tuning Strategies for Statistical Machine Translation. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics, Los Angeles, CA, USA, 2–4 June 2012; pp. 427–436.
37. Snover, M.; Dorr, B.; Schwartz, R.; Micciulla, L.; Weischedel, R. A Study of Translation Error Rate with Targeted Human Annotation. In Proceedings of the Association for Machine Translation in the Americas, Cambridge, MA, USA, 8–12 August 2006.
38. Clark, J.H.; Dyer, C.; Lavie, A.; Smith, N.A. Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; Volume 2, pp. 176–181.
39. Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008.
40. Zhang, J.; Utiyama, M.; Sumita, E.; Zhao, H.; Neubig, G. Learning local word reorderings for hierarchical phrase-based statistical machine translation. Mach. Transl. J. 2016, 30, 1–18.
41. Wenniger, G.M.D.B.; Sima’an, K. Bilingual markov reordering labels for hierarchical smt. In Proceedings of the Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8), Doha, Qatar, 25 October 2014; pp. 11–21.
42. Li, P.; Liu, Y.; Sun, M.; Izuha, T.; Zhang, D. A Neural Reordering Model for Phrase-based Translation. In Proceedings of the COLING 2014, the 25th International Conference on Computational Linguistics, Dublin, Ireland, 23–29 August 2014; pp. 1897–1907.
43. Nguyen, T.; Vogel, S. Integrating phrase-based reordering features into a chart-based decoder for machine translation. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, 4–9 August 2013; pp. 1587–1596.
Head | Dependant | Ori |
---|---|---|
fox | the | M |
fox | brown | S |
fox | quick | S |
jumped | fox | S |
jumped | dog | S |
dog | lazy | S |
dog | the | M |

Dep1 | Dep2 | Ori |
---|---|---|
the | brown | M |
the | quick | M |
brown | quick | M |
the | lazy | M |
dog | fox | M |
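To make the orientation labels concrete, here is a small sketch that labels a pair of source words as monotone (M) or swap (S) from word alignments. It assumes that a pair is monotone when the two words keep their relative order on the target side and swap otherwise, and that each word is represented by the mean of its aligned target positions. Both choices, the function name and the toy alignment are assumptions for illustration and may differ from the definition used in the paper.

```python
def orientation(src_i, src_j, alignment):
    """Return "M" (monotone) or "S" (swap) for two source positions.

    alignment maps a source position to the list of target positions it is
    aligned to. If either word is unaligned, None is returned.
    """
    if src_i == src_j:
        raise ValueError("positions must differ")
    ti, tj = alignment.get(src_i), alignment.get(src_j)
    if not ti or not tj:
        return None
    mean_i = sum(ti) / len(ti)
    mean_j = sum(tj) / len(tj)
    # Same sign means the source order is preserved on the target side.
    # A tie on the target side is treated as monotone (an arbitrary choice).
    return "M" if (src_i - src_j) * (mean_i - mean_j) >= 0 else "S"


# Source: the(0) quick(1) brown(2) fox(3).
# Hypothetical target order: the, fox, quick, brown.
toy_alignment = {0: [0], 1: [2], 2: [3], 3: [1]}

fox_the = orientation(3, 0, toy_alignment)      # head "fox", dependant "the"
fox_brown = orientation(3, 2, toy_alignment)    # head "fox", dependant "brown"
brown_quick = orientation(2, 1, toy_alignment)  # two dependants of "fox"
```

The toy target order was chosen only so that the resulting labels agree with the corresponding rows of the two tables above; it is not the alignment from Figure 1.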
Feature | Description |
---|---|
surface form of word w | |
dependency relation between dependent word d and its head | |
synset of word w | |

Feature | Values |
---|---|
, , | jumped, fox, dog |
, , | SID-01945853-V, SID-02097711-N, SID-02064081-N |
, | nsubj, prep-over |
No. | Classifier | Training Data | Features |
---|---|---|---|
1 | hd-lex | whole original data | , , |
2 | hd-lex-half | 1/2 of the original data | |
3 | hd-lex-quarter | 1/4 of the original data | |
4 | hd-syn | whole original data | , , |
5 | hd-syn-half | 1/2 of the original data | |
6 | hd-syn-quarter | 1/4 of the original data | |
7 | hd-both | whole original data | , , , |
8 | hd-both-half | 1/2 of the original data | , |
9 | hd-both-quarter | 1/4 of the original data | |
10 | dd-lex | whole original data | , , , |
11 | dd-lex-half | 1/2 of the original data | , |
12 | dd-lex-quarter | 1/4 of the original data | |
13 | dd-syn | whole original data | , , , |
14 | dd-syn-half | 1/2 of the original data | , |
15 | dd-syn-quarter | 1/4 of the original data | |
16 | dd-both | whole original data | , , , |
17 | dd-both-half | 1/2 of the original data | , , |
18 | dd-both-quarter | 1/4 of the original data | , , |
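The half and quarter variants in the table above are trained on 1/2 and 1/4 of the original data. One simple way to produce such subsets is seeded random sampling, sketched below. Whether the subsets in the study were drawn randomly is not stated here, so the sampling scheme, the helper name `subsample` and the toy examples are assumptions.

```python
import random


def subsample(examples, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the examples."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    size = int(len(examples) * fraction)
    # Sample indices and sort them so the subset keeps the original order.
    keep = sorted(rng.sample(range(len(examples)), size))
    return [examples[i] for i in keep]


# Hypothetical training examples: (features, orientation label) pairs.
examples = [({"id": i}, "M" if i % 3 else "S") for i in range(1000)]
half = subsample(examples, 1 / 2)
quarter = subsample(examples, 1 / 4)
```

Fixing the seed keeps the smaller training sets reproducible, so that differences between classifiers can be attributed to data size and feature set rather than to a different random draw.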
Corpus | Language | Train Sentences | Train Words | Tune Sentences | Tune Words | Test Sentences | Test Words |
---|---|---|---|---|---|---|---|
En–Fa | English | 575,208 | 4,652,389 | 2000 | 16,152 | 1000 | 8136 |
En–Fa | Farsi | 575,208 | 4,421,994 | 2000 | 15,388 | 1000 | 7850 |
En–Ar | English | 222,975 | 5,865,994 | 2000 | 53,552 | 1000 | 26,322 |
En–Ar | Arabic | 222,975 | 5,807,679 | 2000 | 52,708 | 1000 | 26,256 |
En–Tr | English | 100,957 | 1,213,275 | 647 | 13,302 | 644 | 12,371 |
En–Tr | Turkish | 100,957 | 1,151,795 | 647 | 13,969 | 644 | 13,048 |
Orientation | En–Fa Head-Dep | En–Fa Dep-Dep | En–Tr Head-Dep | En–Tr Dep-Dep | En–Ar Head-Dep | En–Ar Dep-Dep |
---|---|---|---|---|---|---|
Monotone | 63.04% | 71.92% | 55.70% | 60.93% | 70.89% | 87.62% |
Swap | 36.96% | 28.08% | 44.30% | 39.07% | 29.11% | 12.38% |
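The skew in this distribution matters for evaluation. On the En–Ar Dep-Dep data, 87.62% of the pairs are monotone, so a classifier that always predicts monotone already reaches about 87.6% accuracy. The sketch below uses the standard definitions of accuracy and macro-averaged F1 on a synthetic label list with roughly that ratio to show how macro-averaged F1 exposes such a degenerate classifier. The data are synthetic and the function names are illustrative.

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes seen in the gold labels."""
    scores = []
    for label in sorted(set(gold)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)


# Synthetic gold labels: 876 monotone (M) and 124 swap (S) pairs.
gold = ["M"] * 876 + ["S"] * 124
always_monotone = ["M"] * len(gold)

acc = accuracy(gold, always_monotone)  # 0.876
f1 = macro_f1(gold, always_monotone)   # below 0.5: the swap class scores F1 = 0
```

Because the swap class contributes an F1 of zero, the macro average falls below 0.5 even though accuracy looks high, which is one reason to report macro-averaged F1 alongside accuracy on imbalanced data such as En–Ar.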
Reordering Model | Translation Task | Relative BLEU Improvement (%) |
---|---|---|
Zhang et al. [40] | Chinese-to-English | 3.5* |
Zhang et al. [40] | Japanese-to-English | 2.8* |
Wenniger and Sima’an [41] | Chinese-to-English | 3.1 |
Wenniger and Sima’an [41] | German-to-English | 0.3 |
Li et al. [42] | Chinese-to-English | 1.9* |
Nguyen and Vogel [43] | Arabic-to-English | 2.4 |
Nguyen and Vogel [43] | German-to-English | 3.4 |
Kazemi et al. [27] | English-to-Farsi | 3.6* |
Gao et al. [9] | Chinese-to-English | 3.6 |
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Kazemi, A.; Toral, A.; Way, A.; Monadjemi, A.; Nematbakhsh, M. Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models. Entropy 2017, 19, 340. https://doi.org/10.3390/e19090340