A $${\mathcal{O}(|G|n^6)}$$ time extension of inversion transduction grammars

Anders Søgaard¹

115 Accesses
Explore all metrics

Abstract

Range concatenation grammars are viewed as a hierarchy of synchronous grammars. It is shown how inversion transduction grammars (ITGs) and extensions thereof, including synchronous tree-adjoining grammars, are captured by the hierarchy, and the expressivity and linguistic relevance of subclasses of the hierarchy are discussed. A ${\mathcal{O}(|G|n^6)}$ time extension of ITGs is proposed. The extension translates cross-serial dependencies into nested ones and handles complex kinds of discontinuous translation units and so-called inside-out alignments. In fact, our ${\mathcal{O}(|G|n^6)}$ time extension generates all possible alignments. It is shown that this additional expressivity comes at the cost of probabilistic parsing.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Artificial Intelligence

References

Aho A, Ullman J (1972) The theory of parsing, translation and compiling. Prentice-Hall, London
Google Scholar
Barton E, Berwick R, Ristad E (1987) Computational complexity and natural language. MIT Press, Cambridge
Google Scholar
Boullier P (1998) Proposal for a natural language processing syntactic backbone. Technical report. INRIA, Le Chesnay
Boullier P (1999) On TAG and multicomponent TAG parsing. In: Proceedings of the 6th Conférence annuelle sur le Traitement Automatique des Langues Naturelles. Cargèse, Corse, pp 75–84
Boullier P (2000) A cubic time extension of context-free grammars. Grammars 3(2–3): 111–131
Article MathSciNet MATH Google Scholar
Brown P, Della Pietra V, Della Pietra S, Mercer R (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19(2): 263–311
Google Scholar
Casacuberta F, de la Higuera C (2000) Computational complexity of problems on probabilistic grammars and transducers. In: Grammatical inference: algorithms and applications, 5th international colloquium, Lisbon, Portugal, pp 15–24
Chi Z (1999) Statistical properties of probabilistic context-free grammars. Comput Linguist 25(1): 131–160
MathSciNet Google Scholar
Chi Z, Geman S (1998) Estimation of probabilistic context-free grammars. Comput Linguist 24(2): 299–305
MathSciNet Google Scholar
Chiang D (2007) Hierarchical phrase-based translation. Comput Linguist 33(2): 201–228
Article Google Scholar
Culik K (1966) Well translatable languages and Algol-like languages. In: Steel T (ed) Formal languages and description languages. North Holland Press, Amsterdam, pp 76–85
Google Scholar
Dreyer M, Hall K, Khudanpur S (2007) Comparing reordering constraints for SMT using efficient BLEU oracle computation. In: Proceedings of SSST, NAACL-HLT 2007/AMTA workshop on syntax and structure in statistical translation, New York, NY, pp 103–110
Eisner J (2003) Learning non-isomorphic tree mappings for machine translation. In: 41st annual meeting of the association for computational linguistics, Sapporo, Japan, pp 205–208
Gazdar G (1988) Applicability of indexed grammars to natural languages. In: Reyle U, Rohrer C (eds) Natural language parsing and linguistic theories. Reidel, Dordrecht, pp 69–94
Chapter Google Scholar
Goutte C, Yamada K, Gaussier E (2004) Aligning words using matrix factorisation. In: Proceedings of the 42nd annual meeting of the association for computational linguistics, ACL-04, Barcelona, Spain, pp 502–509
Graça J, Pardal J, Coheur L, Caseiro D (2008) Building a golden collection of parallel multi-language word alignments. In: Proceedings of the 6th international conference on language resources and evaluation (LREC’08), Marrakech, Morocco, pp 986–993
Guan Y (1992) Klammergrammatiken, Netzgrammatiken und Interpretationen von Netzen. Ph.D thesis, Universität des Saarlands, Saarbrücken, Germany
Harbusch K, Poller P (1996) Structural translation with synchronous tree-adjoining grammars in Verbmobil. Technical report Verbmobil 184. Universität Koblenz-Landau/DFKI GmbH, Koblenz
Huang S, Zhou B (2009) An EM Algorithm for SCFG in Formal Syntax-based Translation. In: Proceedings of the IEEE international conference on acoustic, speech, and signal processing (ICASSP’09), Taiwan, China, pp 4813–4816
Kallmeyer L (2005) Tree-local multicomponent tree-adjoining grammars with shared nodes. Comput Linguist 31(2): 187–225
Article MathSciNet Google Scholar
Kato Y, Seki H, Kasami T (2006) RNA pseudoknotted structure prediction using stochastic multiple context-free grammar. IPSJ Digit Cour 2: 655–664
Article Google Scholar
Maier W, Søgaard A (2008) Treebanks and mild context-sensitivity. In: Proceedings of the 13th conference on formal grammar, Hamburg, Germany, pp 61–76
Melamed D, Satta G, Wellington B (2004) Generalized multitext grammars. In: 42nd annual meeting of the association for computational linguistics, Barcelona, Spain, pp 661–668
Nesson R, Shieber S (2007) Simpler TAG semantics through synchronization. In: Proceedings of the 11th conference on formal grammar. CSLI Publications, Stanford, pp 129–142
Padó S, Lapata M (2006) Optimal constituent alignment with edge covers for semantic projection. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics, COLING · ACL 2006, Sydney, Australia, pp 1161–1168
Rambow O, Satta G (1994) A two-dimensional hierarchy for parallel rewriting systems. Technical report. University of Philadelphia, Philadelphia
Saers M, Nivre J, Wu D (2009) Learning stochastic bracketing inversion transduction grammars with a cubic time biparsing algorithm. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 29–32
Satta G, Peserico E (2005) Some computational complexity results for synchronous context-free grammars. In: Proceedings of human language technology conference and conference for empirical methods in natural language processing 2005, Vancouver, Canada, pp 803–810
Seki H, Matsumura T, Fujii M, Kasami T (1991) On multiple context-free grammars. Theor Comput Sci 88(2): 191–229
Article MathSciNet MATH Google Scholar
Shieber S (2007) Probabilistic synchronous tree-adjoining grammars for machine translation. In: Proceedings of the North American chapter of the association for computational linguistics 2007, workshop on syntax and structure in statistical translation, Rochester, New York, pp 88–95
Shieber S, Schabes Y (1990) Synchronous tree-adjoining grammars. In: COLING-90, papers presented to the 13th international conference on computational linguistics on the occasion of the 25th anniversary of COLING and the 350th anniversary of Helsinki University, Helsinki, Finland, pp 253–258
Simard M, Cancedda N, Cavestro B, Dymetman M, Gaussier E, Goutte C, Yamada K, Langlais P, Mauser A (2005) Translating with non-contiguous phrases. In: HLT/EMNLP 2005: proceedings of the human language technology conference and conference on empirical methods in natural language processing, Vancouver, British Columbia, Canada, pp 755–762
Søgaard A (2010) Can inversion transduction grammars generate hand alignments? In: Proceedings of the 14th annual conference of the european association for machine translation, St. Raphael, France
Søgaard A, Kuhn J (2009a) Empirical lower bounds on alignment error rates in syntax-based machine translation. In: Proceedings of the third workshop on syntax and structure in statistical translation (SSST-3) at NAACL HLT 2009, Boulder, CO, pp 19–27
Søgaard A, Kuhn J (2009b) Using a maximum entropy-based tagger to improve a very fast vine parser. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 206–209
Søgaard A, Wu D (2009) Empirical lower bounds on alignment error rates for the full class of inversion transduction grammars. In: 11th international conference on parsing technology (IWPT 2009), Paris, France, pp 33–36
Sudkamp T (2005) Languages and machines, 3rd edn. Pearson, Boston
Google Scholar
Vijay-Shanker K, Weir D (1994) The equivalence of four extensions of context-free grammars. Math Syst Theory 27: 511–546
Article MathSciNet MATH Google Scholar
Wedekind J (1997) Approaches to unification in grammar. In: Blackburn P, de Rijke M (eds) Specifying syntactic structure. CSLI Publications, Stanford, pp 245–280
Google Scholar
Wellington B, Waxmonsky S, Melamed D (2006) Empirical lower bounds on the complexity of translational equivalence. In: Proceedings of the 21st international conference on computational linguistics and 44th annual meeting of the association for computational linguistics, COLING · ACL 2006, Sydney, Australia, pp 977–984
Wong F, Dong M, Hu D (2006) Machine translation using constraint-based synchronous grammar. Tsinghua Sci Technol 11(3): 295–306
Article MathSciNet MATH Google Scholar
Wong YW, Mooney R (2007) Learning synchronous grammars for semantic parsing with lambda calculus. In: Proceedings of the 45th annual meeting of the association for computational linguistics, Prague, Czech Republic, pp 960–967
Wu D (1995) Trainable coarse bilingual grammars for parallel text bracketing. In: Third annual workshop on very large corpora, Cambridge, MA, pp 69–81
Wu D (1997) Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Comput Linguist 23(3): 377–403
Google Scholar
Zens R, Ney H (2003) A comparative study on reordering constraints in statistical machine translation. In: 41st annual meeting on association for computational linguistics, Sapporo, Japan, pp 144–151
Zhang H, Gildea D, Chiang D (2008) Extracting synchronous grammar rules from word-level alignments in linear time. In: Coling 2008, proceedings of the 22nd international conference on computational linguistics, Manchester, England, pp 1081–1088
Zhang H, Huang L, Gildea D, Knight K (2006) Synchronous binarization for machine translation. In: Proceedings of the joint human language technology conference and the North American chapter of the association for computational linguistics 2006, New York, NY, pp 256–263

Download references

Author information

Authors and Affiliations

Center for Language Technology, University of Copenhagen, Copenhagen, Denmark
Anders Søgaard

Authors

Anders Søgaard
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Anders Søgaard.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Søgaard, A. A ${\mathcal{O}(|G|n^6)}$ time extension of inversion transduction grammars. Machine Translation 25, 291–315 (2011). https://doi.org/10.1007/s10590-011-9107-8

Download citation

Received: 02 October 2007
Accepted: 12 August 2011
Published: 08 September 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s10590-011-9107-8

A \({\mathcal{O}(|G|n^6)}\) time extension of inversion transduction grammars

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Transduction of Automatic Sequences and Applications

Parallel Work Spaces in Syntax and the Inexistence of Internal Merge

Tree-to-Graph Transductions with Scope

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

A \({\mathcal{O}(|G|n^6)}\) time extension of inversion transduction grammars

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Transduction of Automatic Sequences and Applications

Parallel Work Spaces in Syntax and the Inexistence of Internal Merge

Tree-to-Graph Transductions with Scope

Explore related subjects

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now