Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration

  • Conference paper
Intelligent Technologies and Applications (INTAP 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1198))

Abstract

This paper makes two contributions: an efficient experiment in “pseudo” transfer learning that uses a large monolingual (mono-script) dataset in a sequence-to-sequence LSTM model, and an application of the proposed methodology to improve Roman Urdu transliteration. The method echoes the monolingual dataset: in the pre-training phase, the input and output sequences are identical, so that the model learns the target language. This gives the LSTM model weights initialized on the target language before the network is trained on the original parallel data. The method reduces the need for more training data or more computational resources, which are usually not available to many research groups. The experiment is performed on character-based transliteration from Romanized Urdu script to standard (i.e., modified Perso-Arabic) Urdu script. Initially, a sequence-to-sequence encoder-decoder model is trained (echoed) for 100 epochs on 306.9K distinct words in standard Urdu script. Then, the trained model (with the weights tuned by echoing) is used for learning transliteration. At this stage, the parallel corpus comprises 127K pairs of Roman Urdu and standard Urdu tokens. The proposed methodology reaches a BLEU accuracy of 80.1% after 100 epochs of training on the parallel data (preceded by 100 epochs of echoing the mono-script data), whereas the baseline model trained solely on the parallel corpus yields ≈76% BLEU accuracy in 200 epochs.
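The two-phase schedule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `make_echo_pairs`, `build_char_vocab`, and `pseudo_transfer` are hypothetical, and `train_fn` stands in for any routine that trains a sequence-to-sequence model on (source, target) pairs and returns the updated weights. Building one character vocabulary over both the echoed and the parallel pairs is also an assumption, made here so that the model's layer shapes stay the same across the two phases; the abstract does not say how the authors handled this.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

Pair = Tuple[str, str]   # (source string, target string)
W = TypeVar("W")         # opaque model weights, whatever the trainer uses


def make_echo_pairs(words: List[str]) -> List[Pair]:
    """Echoing: every distinct monolingual word is both input and output."""
    distinct = list(dict.fromkeys(words))  # de-duplicate, keep first-seen order
    return [(w, w) for w in distinct]


def build_char_vocab(pairs: List[Pair]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Character-to-index maps for the source and target sides.

    Index 0 is left free for padding (an assumption of this sketch).
    """
    src_chars = sorted({c for s, _ in pairs for c in s})
    tgt_chars = sorted({c for _, t in pairs for c in t})
    src_vocab = {c: i + 1 for i, c in enumerate(src_chars)}
    tgt_vocab = {c: i + 1 for i, c in enumerate(tgt_chars)}
    return src_vocab, tgt_vocab


def pseudo_transfer(
    train_fn: Callable[[W, List[Pair], int], W],  # hypothetical trainer
    weights: W,
    mono_words: List[str],
    parallel_pairs: List[Pair],
    echo_epochs: int = 100,
    parallel_epochs: int = 100,
) -> W:
    """Phase 1: echo the monolingual words. Phase 2: train on parallel pairs."""
    weights = train_fn(weights, make_echo_pairs(mono_words), echo_epochs)
    return train_fn(weights, parallel_pairs, parallel_epochs)


if __name__ == "__main__":
    mono = ["کتاب", "پانی", "کتاب"]                  # standard Urdu script words
    parallel = [("kitab", "کتاب"), ("pani", "پانی")]  # Roman Urdu -> Urdu script

    # One vocabulary over both phases, so the same model can be reused.
    src_vocab, tgt_vocab = build_char_vocab(make_echo_pairs(mono) + parallel)
    print(len(src_vocab), "source characters,", len(tgt_vocab), "target characters")

    def dummy_train(weights: int, pairs: List[Pair], epochs: int) -> int:
        # Placeholder for a real trainer: only counts the phases it has run.
        print(f"training on {len(pairs)} pairs for {epochs} epochs")
        return weights + 1

    pseudo_transfer(dummy_train, 0, mono, parallel)
```

In a real experiment, `train_fn` would wrap an encoder-decoder LSTM (for example, one built with Keras), and the baseline would call it once on the parallel pairs only, for the full epoch budget.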

Notes

  1. https://github.com/keras-team/keras/blob/master/examples/lstm_Seq2Seq.py.

  2. http://ijunoon.com.

  3. https://www.crummy.com/software/BeautifulSoup/.

  4. https://github.com/dwyl/english-words.

  5. http://crcl.dsu.edu.pk/clt14/shared_call.html.

Author information

Corresponding author

Correspondence to Tafseer Ahmed.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Khan, M.Y., Ahmed, T. (2020). Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration. In: Bajwa, I., Sibalija, T., Jawawi, D. (eds) Intelligent Technologies and Applications. INTAP 2019. Communications in Computer and Information Science, vol 1198. Springer, Singapore. https://doi.org/10.1007/978-981-15-5232-8_36

  • DOI: https://doi.org/10.1007/978-981-15-5232-8_36

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-5231-1

  • Online ISBN: 978-981-15-5232-8

  • eBook Packages: Computer Science, Computer Science (R0)
