Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration

Muhammad Yaseen Khan &
Tafseer Ahmed

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1198)

Included in the following conference series:

International Conference on Intelligent Technologies and Applications

Abstract

This paper contributes two things: an efficient experiment in “pseudo” transfer learning that uses a large monolingual (mono-script) dataset in a sequence-to-sequence LSTM model, and an application of the proposed methodology to improve Roman Urdu transliteration. The research involves echoing the monolingual dataset: in the pre-training phase, the input and output sequences are identical, so that the model learns the target language. This process gives the LSTM model target-language-based initial weights before the network is trained on the original parallel data. The method is beneficial because it reduces the need for more training data or more computational resources, which are usually not available to many research groups. The experiment is performed on character-based transliteration from Romanized Urdu script to standard (i.e., modified Perso-Arabic) Urdu script. Initially, a sequence-to-sequence encoder-decoder model is trained (echoed) for 100 epochs on 306.9K distinct words in standard Urdu script. Then, the trained model (with the weights tuned by echoing) is used for learning transliteration. At this stage, the parallel corpus comprises 127K pairs of Roman Urdu and standard Urdu tokens. The results are encouraging: the proposed methodology achieves a BLEU accuracy of 80.1% after 100 epochs of training on the parallel data (preceded by echoing the mono-script data for 100 epochs), whereas the baseline model trained solely on the parallel corpus yields ≈76% BLEU accuracy in 200 epochs.
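
The two-phase schedule described in the abstract can be illustrated with a minimal Python sketch of the data preparation only. It is not the authors' implementation: the function names (`make_echo_pairs`, `build_char_vocab`, `two_phase_schedule`) and the `fit` callback are hypothetical, and the choice to build one character vocabulary covering both scripts up front is an assumption made here so that the same model dimensions can serve both phases.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def make_echo_pairs(words: Iterable[str]) -> List[Pair]:
    """Build 'echo' pairs: each distinct word is both input and target."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [(w, w) for w in dict.fromkeys(words)]


def build_char_vocab(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Map every character on either side to an id; 0 is left free for padding."""
    chars = sorted({ch for src, tgt in pairs for ch in src + tgt})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def two_phase_schedule(
    mono_words: Iterable[str],
    parallel_pairs: Iterable[Pair],
    fit: Callable[[List[Pair], Dict[str, int], int], None],
    echo_epochs: int = 100,
    task_epochs: int = 100,
) -> Dict[str, int]:
    """Run echo pre-training, then training on the parallel pairs.

    `fit` is a hypothetical callback (pairs, vocab, epochs) that trains one
    shared model; it stands in for whatever seq2seq trainer is used.
    """
    echo_pairs = make_echo_pairs(mono_words)
    task_pairs = list(parallel_pairs)
    # Assumption: a single vocabulary over both phases, so the model's
    # input and output sizes do not change between them.
    vocab = build_char_vocab(echo_pairs + task_pairs)
    fit(echo_pairs, vocab, echo_epochs)  # phase 1: input == target
    fit(task_pairs, vocab, task_epochs)  # phase 2: Roman Urdu -> Urdu
    return vocab
```

In this sketch the second call to `fit` is expected to continue from the weights left by the first call, which is the point of the echoing phase; the default epoch counts mirror the 100 + 100 epochs reported in the abstract.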

Author information

Authors and Affiliations

Center for Language Computing, Department of Computer Science, Mohammad Ali Jinnah University, Karachi, Pakistan
Muhammad Yaseen Khan & Tafseer Ahmed

Corresponding author

Correspondence to Tafseer Ahmed.

Editor information

Editors and Affiliations

Islamia University of Bahawalpur, Bahawalpur, Pakistan
Imran Sarwar Bajwa
Metropolitan University, Belgrade, Serbia
Tatjana Sibalija
University of Technology Malaysia, Johor Bahru, Malaysia
Dayang Norhayati Abang Jawawi

Cite this paper

Khan, M.Y., Ahmed, T. (2020). Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration. In: Bajwa, I., Sibalija, T., Jawawi, D. (eds) Intelligent Technologies and Applications. INTAP 2019. Communications in Computer and Information Science, vol 1198. Springer, Singapore. https://doi.org/10.1007/978-981-15-5232-8_36

DOI: https://doi.org/10.1007/978-981-15-5232-8_36
Published: 09 May 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5231-1
Online ISBN: 978-981-15-5232-8
eBook Packages: Computer Science, Computer Science (R0)
