
Persian handwritten digit, character and word recognition using deep learning

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Despite the many applications of digit, letter and word recognition, only a few studies have dealt with Persian script. In this paper, deep neural networks based on several DenseNet and Xception architectures are used, further boosted by data augmentation and test-time augmentation. The datasets are divided into training, validation and test sets, and k-fold cross-validation is used to compare the proposed method with various state-of-the-art alternatives. Three datasets are used: HODA, Sadri and Iranshahr, which offer the most comprehensive collections of samples in terms of handwriting styles and the forms each letter may take depending on its position within a word. On the HODA dataset, we achieve recognition rates of 99.49% and 98.10% for digits and characters, respectively; on the Sadri dataset, 99.72%, 89.99% and 98.82% for digits, characters and words, respectively; and on the Iranshahr dataset, 98.99% for words. Each of these results outperforms those achieved by the most advanced alternative networks, namely ResNet50 and VGG16. An additional contribution of the paper is that it treats word recognition as holistic image classification. This significantly improves speed and versatility, as it does not require explicit character models, unlike earlier alternatives such as hidden Markov models and convolutional recurrent neural networks. In addition, computation times have been compared with those of alternative state-of-the-art models, and better performance was observed.
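As a rough illustration of the test-time augmentation mentioned in the abstract, the sketch below averages class probabilities over several slightly shifted copies of one input image. It is a minimal example under stated assumptions, not the authors' implementation: the abstract does not say which augmentations were used, so the one-pixel shifts, the helper names `shift` and `predict_with_tta`, and the stand-in classifier `toy_predict_proba` are all hypothetical.

```python
import numpy as np


def shift(image, dx, dy):
    """Translate a 2-D image by (dx, dy) pixels, zero-filling vacated pixels."""
    h, w = image.shape
    out = np.zeros_like(image)
    # Region of the source image that remains visible after the shift.
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    top, left = max(0, dy), max(0, dx)
    out[top:top + src.shape[0], left:left + src.shape[1]] = src
    return out


def predict_with_tta(predict_proba, image,
                     shifts=((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))):
    """Average class probabilities over shifted copies of one image.

    `predict_proba` is any callable mapping a 2-D image to a probability
    vector. The choice of shifts is an assumption made for illustration.
    """
    probs = np.stack([predict_proba(shift(image, dx, dy)) for dx, dy in shifts])
    mean_probs = probs.mean(axis=0)
    return int(np.argmax(mean_probs)), mean_probs


def toy_predict_proba(image):
    """Hypothetical two-class stand-in for a trained network.

    Scores are the summed intensities of the left and right image halves,
    passed through a softmax.
    """
    mid = image.shape[1] // 2
    scores = np.array([image[:, :mid].sum(), image[:, mid:].sum()], dtype=float)
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()


if __name__ == "__main__":
    img = np.zeros((8, 8))
    img[2:6, 0:3] = 1.0  # a blob on the left side of the image
    label, probs = predict_with_tta(toy_predict_proba, img)
    print(label, probs)
```

In practice `toy_predict_proba` would be replaced by the forward pass of a trained model, and the set of transformations would normally mirror those used for data augmentation during training.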



Acknowledgements

This work was supported by the European Social Fund via the IT Academy program and the Estonian Research Council [grant number COVSG24].

Author information

Corresponding author

Correspondence to Simindokht Jahangard.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bonyani, M., Jahangard, S. & Daneshmand, M. Persian handwritten digit, character and word recognition using deep learning. IJDAR 24, 133–143 (2021). https://doi.org/10.1007/s10032-021-00368-2

  • DOI: https://doi.org/10.1007/s10032-021-00368-2
