Abstract
Despite the many applications of digit, letter and word recognition, only a few studies have dealt with Persian scripts. In this paper, deep neural networks based on different DenseNet and Xception architectures are used, further boosted by data augmentation and test-time augmentation. The proposed method is compared with various state-of-the-art alternatives by dividing the datasets into training, validation and test sets and by using k-fold cross-validation. Three datasets are used, HODA, Sadri and Iranshahr, which offer the most comprehensive collections of samples in terms of handwriting styles and the forms each letter may take depending on its position within a word. On the HODA dataset, we achieve recognition rates of 99.49% and 98.10% for digits and characters, respectively; on the Sadri dataset, 99.72%, 89.99% and 98.82% for digits, characters and words, respectively; and on the Iranshahr dataset, 98.99% for words. Each of these results outperforms those achieved by the most advanced alternative networks, namely ResNet50 and VGG16. An additional contribution of the paper is that it treats word recognition as holistic image classification. This significantly improves speed and versatility, as it does not require explicit character models, unlike earlier alternatives such as hidden Markov models and convolutional recurrent neural networks. In addition, computation times are compared with those of alternative state-of-the-art models, and better performance is observed for the proposed method.
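As an illustration of the test-time augmentation mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes that the augmentations are small pixel shifts and that class probabilities are averaged over the augmented copies; the abstract does not specify either choice. The names `shift`, `predict_with_tta` and `toy_predict` are hypothetical, and `toy_predict` is only a stand-in for a trained DenseNet or Xception model.

```python
import numpy as np


def shift(images, dy, dx):
    """Translate a batch of images (N, H, W) by (dy, dx) pixels, zero-filling the border."""
    out = np.zeros_like(images)
    h, w = images.shape[1:3]
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[:, dst_y, dst_x] = images[:, src_y, src_x]
    return out


def predict_with_tta(predict_fn, images,
                     offsets=((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))):
    """Average class probabilities over shifted copies of each image.

    predict_fn maps a batch (N, H, W) to probabilities (N, C).
    The shift offsets are an assumption made for this sketch.
    """
    probs = [predict_fn(shift(images, dy, dx)) for dy, dx in offsets]
    mean_probs = np.mean(probs, axis=0)
    return mean_probs.argmax(axis=1), mean_probs


def toy_predict(images):
    """Hypothetical stand-in for a trained network: two classes scored by left/right ink mass."""
    w = images.shape[2]
    left = images[:, :, : w // 2].sum(axis=(1, 2))
    right = images[:, :, w // 2:].sum(axis=(1, 2))
    logits = np.stack([left, right], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Two synthetic 8x8 "images": ink on the left half, then ink on the right half.
images = np.zeros((2, 8, 8))
images[0, 2:6, 1:3] = 1.0
images[1, 2:6, 5:7] = 1.0

labels, mean_probs = predict_with_tta(toy_predict, images)
```

Flips are deliberately left out of the offsets, since mirrored handwriting is generally not a valid sample of the same class. In practice `toy_predict` would be replaced by the prediction function of the trained classifier.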
Acknowledgements
This work was supported by the European Social Fund via IT Academy program and the Estonian Research Council [grant number COVSG24].
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Bonyani, M., Jahangard, S. & Daneshmand, M. Persian handwritten digit, character and word recognition using deep learning. IJDAR 24, 133–143 (2021). https://doi.org/10.1007/s10032-021-00368-2
DOI: https://doi.org/10.1007/s10032-021-00368-2