Abstract
This paper makes two contributions: an efficient experiment in “pseudo” transfer learning that uses a large monolingual (mono-script) dataset in a sequence-to-sequence LSTM model, and an application of the proposed methodology to improve Roman Urdu transliteration. The approach echoes the monolingual dataset: in the pre-training phase, the input and output sequences are identical, so that the model learns the target language. This gives the LSTM model weights initialized on the target language before the network is trained on the original parallel data. The method reduces the need for additional training data or computational resources, which are often unavailable to research groups. The experiment addresses character-based transliteration from Romanized Urdu script to standard (i.e., modified Perso-Arabic) Urdu script. First, a sequence-to-sequence encoder-decoder model is trained (echoed) for 100 epochs on 306.9K distinct words in standard Urdu script. Then, the trained model (with the weights tuned by echoing) is used for learning transliteration on a parallel corpus of 127K pairs of Roman Urdu and standard Urdu tokens. The proposed methodology achieves a BLEU score of 80.1% after 100 epochs of training on the parallel data (preceded by echoing the mono-script data for 100 epochs), whereas the baseline model trained solely on the parallel corpus yields ≈76% BLEU in 200 epochs.
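The echoing step described in the abstract can be illustrated with a minimal Python sketch of the data preparation and the two-phase schedule. It is not the authors' implementation: the function names, the start/end markers, the shared character vocabulary, and the `train_fn` callback (a stand-in for whatever routine trains the encoder-decoder LSTM) are all assumptions made for illustration. Only the 100-epoch defaults come from the abstract.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]

# Hypothetical start/end markers for the decoder sequence (an assumption).
START, END = "\t", "\n"


def make_echo_pairs(words: Iterable[str]) -> List[Pair]:
    """Echo each distinct monolingual word: source and target are identical."""
    seen = set()
    pairs: List[Pair] = []
    for word in words:
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            pairs.append((word, word))
    return pairs


def build_char_vocab(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Map every character to an integer id; id 0 is left free for padding.

    Assumption: one vocabulary covers both phases, so that weights learned
    while echoing remain usable when training on the parallel pairs.
    """
    chars = {START, END}
    for src, tgt in pairs:
        chars.update(src)
        chars.update(tgt)
    return {ch: i + 1 for i, ch in enumerate(sorted(chars))}


def encode(pair: Pair, vocab: Dict[str, int]):
    """Return encoder input, decoder input, and decoder target as id lists."""
    src, tgt = pair
    enc_in = [vocab[ch] for ch in src]
    dec_in = [vocab[ch] for ch in START + tgt]   # teacher-forcing input
    dec_out = [vocab[ch] for ch in tgt + END]    # same sequence shifted by one
    return enc_in, dec_in, dec_out


def two_phase_training(train_fn: Callable, echo_pairs: List[Pair],
                       parallel_pairs: List[Pair],
                       echo_epochs: int = 100, parallel_epochs: int = 100):
    """Pre-train on echoed pairs, then continue from that state on parallel pairs."""
    state = train_fn(None, echo_pairs, echo_epochs)
    return train_fn(state, parallel_pairs, parallel_epochs)


def log_only_trainer(state, pairs, epochs):
    """Stand-in trainer that only records (number of pairs, epochs) per phase."""
    history = list(state or [])
    history.append((len(pairs), epochs))
    return history


if __name__ == "__main__":
    echo = make_echo_pairs(["کتاب", "پانی", "کتاب"])
    parallel = [("kitab", "کتاب"), ("pani", "پانی")]
    vocab = build_char_vocab(echo + parallel)
    print(encode(echo[0], vocab))
    print(two_phase_training(log_only_trainer, echo, parallel))
```

In practice `train_fn` would wrap a real model's training loop and return the model with its updated weights, so that the second call starts from the echoed initialization rather than from random weights.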
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Khan, M.Y., Ahmed, T. (2020). Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration. In: Bajwa, I., Sibalija, T., Jawawi, D. (eds) Intelligent Technologies and Applications. INTAP 2019. Communications in Computer and Information Science, vol 1198. Springer, Singapore. https://doi.org/10.1007/978-981-15-5232-8_36
DOI: https://doi.org/10.1007/978-981-15-5232-8_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5231-1
Online ISBN: 978-981-15-5232-8
eBook Packages: Computer Science (R0)