Abstract
Deep learning offers an effective way to recognize complex handwriting. Handwriting recognition is challenging because character shapes are irregular and vary from one writer to another. With recent advances in artificial intelligence, deep learning can recognize handwritten characters and words more accurately than traditional algorithms. Neural-network models such as the convolutional neural network (CNN), the recurrent neural network (RNN), and long short-term memory (LSTM) are among the most accurate approaches to handwriting recognition. However, much of the existing work does not cover the feature space in a scalable manner. To overcome this limitation, a model consisting of a CNN, an RNN, and a transcription layer, called CRNN, is combined with the Adversarial Feature Deformation Module (AFDM), which applies affine transformations. We propose an adversarial architecture composed of two separate networks: a seven-layer CNN with a spatial transformer network (STN), which acts as the generator, and a Bi-LSTM with a transcription layer, which acts as the discriminator. The Wasserstein function is applied, and the effectiveness of the model is verified on the IAM word and IndBAN datasets. The proposed model was evaluated against several baseline models and reduced both the overall word error rate and the character error rate.
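The two metrics named in the abstract, character error rate and word error rate, can be illustrated with a short sketch. It assumes the common convention for isolated-word recognition: the character error rate is the total Levenshtein edit distance divided by the total number of reference characters, and the word error rate is the fraction of word images whose prediction is not an exact match. The function names and the sample strings are hypothetical, and the exact definitions used in the article may differ.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total character edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def word_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Fraction of word images recognized incorrectly (assumed definition)."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


# Hypothetical ground truth and predictions for three word images.
references = ["hello", "world", "text"]
predictions = ["hallo", "world", "test"]
cer = character_error_rate(references, predictions)
wer = word_error_rate(references, predictions)
print(f"CER = {cer:.3f}, WER = {wer:.3f}")
```

In this toy example two of the three words contain one wrong character each, so two edits are counted against fourteen reference characters, while two of three words count as word errors.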
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest related to this work.
About this article
Cite this article
Jangpangi, M., Kumar, S., Bhardwaj, D. et al. Handwriting Recognition Using Wasserstein Metric in Adversarial Learning. SN COMPUT. SCI. 4, 43 (2023). https://doi.org/10.1007/s42979-022-01445-x
DOI: https://doi.org/10.1007/s42979-022-01445-x