
Turkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networks

Published: 27 December 2022

Abstract

End-to-end data-driven approaches have led to the rapid development of language generation and dialogue systems. Despite requiring large amounts of well-organized data, these approaches jointly learn multiple components of the traditional generation pipeline without costly human intervention. End-to-end approaches also enable the use of loosely aligned parallel datasets in system development by relaxing the degree of semantic correspondence between training data representations and text spans. However, their potential for Turkish language generation has not yet been fully exploited. In this work, we apply sequence-to-sequence (Seq2Seq) neural models to Turkish data-to-text generation, where input data given in the form of a meaning representation is verbalized. We explore encoder-decoder architectures with an attention mechanism in unidirectional, bidirectional, and stacked recurrent neural network (RNN) models. Our models generate one-sentence biographies and dining venue descriptions using a crowdsourced dataset in which all field-value pairs that appear in meaning representations are fully captured in the reference sentences. To complement this work, we also evaluate the performance of our models on a more challenging dataset, where the content of a meaning representation is too large to fit into a single sentence and hence content selection and surface realization must be learned jointly. This dataset was built by coupling the introductory sentences of person-related Turkish Wikipedia articles with their infobox tables. Our empirical experiments on both datasets demonstrate that Seq2Seq models are capable of generating coherent and fluent biographies and venue descriptions from field-value pairs. We argue that the wealth of knowledge residing in our datasets and the insights obtained from this study can foster the development of new end-to-end generation approaches for Turkish and other morphologically rich languages.
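To make the model family described in the abstract concrete, the following Python sketch shows a minimal attention-based encoder-decoder over a linearized meaning representation. It is not the authors' implementation: the GRU recurrence, the Luong-style bilinear attention, the layer sizes, and the assumed field-value token format (e.g., "name[Ali] occupation[yazar]") are illustrative choices only.

```python
# Minimal sketch of an attention-based Seq2Seq model for data-to-text
# generation (PyTorch). NOT the paper's implementation; all hyperparameters
# and the linearized meaning-representation format are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU over the linearized field-value pairs.
        self.rnn = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, src):                              # src: (batch, src_len)
        outputs, hidden = self.rnn(self.embed(src))      # outputs: (batch, src_len, 2*hid)
        # Concatenate final forward and backward states as the input summary.
        summary = torch.cat([hidden[-2], hidden[-1]], dim=-1)  # (batch, 2*hid)
        return outputs, summary


class AttnDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRUCell(emb_dim + hid_dim, hid_dim)
        self.score = nn.Linear(hid_dim, hid_dim)         # attention scoring
        self.out = nn.Linear(hid_dim * 2, vocab_size)

    def step(self, prev_tok, state, enc_outputs):
        # Attend over encoder positions given the current decoder state.
        scores = torch.bmm(enc_outputs, self.score(state).unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=-1)              # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        state = self.rnn(torch.cat([self.embed(prev_tok), context], dim=-1), state)
        logits = self.out(torch.cat([state, context], dim=-1))
        return logits, state                             # next-token scores, new state
```

At decoding time, the encoder summary would initialize the decoder state and tokens would be generated step by step (greedily or with beam search) until an end-of-sentence symbol is produced; the unidirectional and stacked RNN variants mentioned above would differ only in how the encoder is configured.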



Information & Contributors

Information

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 22, Issue 2
February 2023
624 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3572719

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2022
Online AM: 08 July 2022
Accepted: 29 May 2022
Revised: 04 March 2022
Received: 30 March 2021
Published in TALLIP Volume 22, Issue 2


Author Tags

  1. Data-to-text generation
  2. sequence-to-sequence model
  3. Turkish
  4. Wikipedia

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • TUBITAK-ARDEB


Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 70
  • Downloads (last 6 weeks): 1
Reflects downloads up to 17 Feb 2025

Other Metrics

Cited By
  • (2024) Human vs. Machine: A Comparative Study on the Detection of AI-Generated Content. ACM Transactions on Asian and Low-Resource Language Information Processing 24(2), 1–26. DOI: 10.1145/3708889. Online publication date: 19 December 2024.
  • (2024) DRL-based dependent task offloading with delay-energy tradeoff in medical image edge computing. Complex & Intelligent Systems 10(3), 3283–3304. DOI: 10.1007/s40747-023-01322-x. Online publication date: 29 January 2024.
