DOI: 10.5555/1873781.1873793

Broad coverage multilingual deep sentence generation with a stochastic multi-level realizer

Published: 23 August 2010

Abstract

Most known stochastic sentence generators use syntactically annotated corpora and perform the projection to the surface in a single stage. However, in full-fledged text generation, sentence realization usually starts from semantic (predicate-argument) structures. To deal with semantic structures, stochastic generators require semantically annotated or, even better, multilevel annotated corpora. Only then can they address such crucial generation issues as sentence planning, linearization and morphologization. Multilevel annotated corpora are increasingly available for multiple languages. We take advantage of them and propose a multilingual deep stochastic sentence realizer that mirrors the state-of-the-art research in semantic parsing. The realizer uses an SVM learning algorithm. For each pair of adjacent levels of annotation, a separate decoder is defined. So far, we have evaluated the realizer for Chinese, English, German, and Spanish.
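
The idea of one decoder per pair of adjacent annotation levels can be illustrated with a minimal sketch. This is not the authors' implementation: the level representations, the function names (`semantics_to_syntax`, `linearize`, `inflect`, `realize`), and all rules and weights are hypothetical, and hand-written lookup tables stand in for the trained SVM classifiers mentioned in the abstract.

```python
from itertools import permutations

# Toy stand-ins for trained models; every rule and weight below is invented.

def semantics_to_syntax(sem):
    """Semantic level -> syntactic level: map roles to dependency relations."""
    role_to_rel = {"A0": "SBJ", "A1": "OBJ"}
    deps = [(lemma, role_to_rel[role]) for role, lemma in sem["args"].items()]
    return {"head": sem["pred"], "deps": deps}

def linearize(tree):
    """Syntactic level -> ordered lemmas: keep the best-scoring permutation."""
    # Hypothetical per-relation weights; a trained classifier would supply these.
    position_weight = {"SBJ": -1.0, "ROOT": 0.0, "OBJ": 1.0}
    nodes = [(tree["head"], "ROOT")] + tree["deps"]

    def score(order):
        # Orders that place high-weight relations late receive a higher score.
        return sum(i * position_weight[rel] for i, (_, rel) in enumerate(order))

    best = max(permutations(nodes), key=score)
    return [lemma for lemma, _ in best]

def inflect(lemmas):
    """Ordered lemmas -> surface string via a toy word-form lookup."""
    forms = {"chase": "chases"}
    return " ".join(forms.get(lemma, lemma) for lemma in lemmas)

def realize(sem, decoders):
    """Apply one decoder per pair of adjacent levels, in sequence."""
    current = sem
    for decode in decoders:
        current = decode(current)
    return current

PIPELINE = [semantics_to_syntax, linearize, inflect]
sentence = realize({"pred": "chase", "args": {"A0": "cat", "A1": "mouse"}}, PIPELINE)
print(sentence)  # cat chases mouse
```

Keeping each decoder as an independent function means any one stage could be retrained or swapped for another language without touching the others, which is the property the multi-level design in the abstract appears to aim for.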

        Published In

        COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics
        August 2010
        1408 pages

        Publisher

        Association for Computational Linguistics

        United States

        Qualifiers

        • Research-article

        Acceptance Rates

        Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)70
        • Downloads (Last 6 weeks)18
        Reflects downloads up to 21 Nov 2024

        Cited By

        • (2023) Surface Realization Architecture for Low-resourced African Languages. ACM Transactions on Asian and Low-Resource Language Information Processing 22:3 (1-26). DOI: 10.1145/3567594. Online publication date: 10-Mar-2023
        • (2012) The surface realisation task. Proceedings of the Seventh International Natural Language Generation Conference (136-140). DOI: 10.5555/2392712.2392743. Online publication date: 30-May-2012
        • (2012) Towards a surface realization-oriented corpus annotation. Proceedings of the Seventh International Natural Language Generation Conference (22-30). DOI: 10.5555/2392712.2392721. Online publication date: 30-May-2012
        • (2012) Generating non-projective word order in statistical linearization. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (928-939). DOI: 10.5555/2390948.2391049. Online publication date: 12-Jul-2012
        • (2012) Syntax-based word ordering incorporating a large-scale language model. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (736-746). DOI: 10.5555/2380816.2380906. Online publication date: 23-Apr-2012
        • (2011) Proceedings of the 13th European Workshop on Natural Language Generation (232-235). DOI: 10.5555/2187681.2187722. Online publication date: 28-Sep-2011
        • (2011) The first surface realisation shared task. Proceedings of the 13th European Workshop on Natural Language Generation (217-226). DOI: 10.5555/2187681.2187719. Online publication date: 28-Sep-2011
        • (2011) Towards generating text from discourse representation structures. Proceedings of the 13th European Workshop on Natural Language Generation (145-150). DOI: 10.5555/2187681.2187705. Online publication date: 28-Sep-2011
        • (2011) Underspecifying and predicting voice for surface realisation ranking. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1 (1007-1017). DOI: 10.5555/2002472.2002599. Online publication date: 19-Jun-2011
