Handwriting Recognition Using Wasserstein Metric in Adversarial Learning

  • Original Research
SN Computer Science

Abstract

Deep learning offers an effective way to recognize complex handwriting. Handwriting recognition is challenging because character shapes are irregular and vary from one writer to another. Among recent advances in artificial intelligence, deep learning has shown strong potential to recognize handwritten characters and words more accurately than traditional algorithms. Neural network models such as the convolutional neural network (CNN), the recurrent neural network (RNN), and long short-term memory (LSTM) are among the most accurate approaches to handwriting recognition. However, much of the existing work does not cover the feature space in a scalable manner. A model consisting of a CNN, an RNN, and a transcription layer, called CRNN, together with the Adversarial Feature Deformation Module (AFDM), which applies affine transformations, is used to overcome this limitation. We propose an adversarial architecture composed of two separate networks: a seven-layer CNN with a spatial transformer network (STN), which acts as the generator, and a Bi-LSTM with a transcription layer, which acts as the discriminator. The Wasserstein function is applied, and the model's effectiveness is verified on the IAM word and IndBAN datasets. The proposed model was evaluated against several baseline models, and it reduced both the overall word error rate and the character error rate.
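The abstract reports results as word error rate and character error rate. As a minimal sketch of how these metrics are commonly computed, the Python below uses the standard Levenshtein (edit) distance. The function names and the whitespace tokenization are assumptions made for illustration; this preview does not show the article's own evaluation protocol, so the code should not be read as the authors' implementation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists).

    Insertions, deletions, and substitutions each cost 1.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total character edits divided by total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


def word_error_rate(references, hypotheses):
    """Total word-level edits divided by total reference words.

    Assumption: words are separated by whitespace. For single-word images
    this reduces to the fraction of words that are not recognized exactly.
    """
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    total = sum(len(r.split()) for r in references)
    return errors / total


if __name__ == "__main__":
    refs = ["hello", "world"]   # ground-truth transcriptions (made-up data)
    hyps = ["hallo", "world"]   # hypothetical recognizer output
    print(character_error_rate(refs, hyps))  # 0.1 (1 edit over 10 characters)
    print(word_error_rate(refs, hyps))       # 0.5 (1 wrong word out of 2)
```

Both rates are normalized by the length of the reference, so a recognizer that inserts many spurious characters can score above 1.0.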

Author information

Corresponding author

Correspondence to Sudhanshu Kumar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jangpangi, M., Kumar, S., Bhardwaj, D. et al. Handwriting Recognition Using Wasserstein Metric in Adversarial Learning. SN COMPUT. SCI. 4, 43 (2023). https://doi.org/10.1007/s42979-022-01445-x

  • DOI: https://doi.org/10.1007/s42979-022-01445-x
