Evaluation of Pooling Layers in Convolutional Neural Network for Script Recognition

  • Conference paper
Soft Computing in Data Science (SCDS 2019)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1100))

Abstract

This paper investigates the suitable position and number of pooling layers in a Convolutional Neural Network (CNN) for script recognition from scene images. A common practice in CNNs for object recognition is to alternate each convolutional layer with a pooling layer and to follow them with a few fully connected layers. We re-evaluate this basic principle by examining the placement of a pooling layer after every convolutional layer and by reducing and increasing the number of pooling layers. Experimental results on the MLe2e dataset for script recognition show that a CNN with fewer pooling layers and a non-overlapping pooling stride can reach excellent accuracy compared with a CNN that alternates convolutional and pooling layers.
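To make the design choices in the abstract concrete, the following minimal sketch traces how the feature-map size shrinks under two layer arrangements: convolution alternating with pooling, and several convolutions followed by a single pooling layer. It uses the standard output-size formula for convolution and pooling. The input size, kernel sizes, strides, and layer counts are illustrative assumptions, not the configurations evaluated in the paper.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Side length of a square feature map after one conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(size, layers):
    """Return the side length at the input and after every layer in turn."""
    sizes = [size]
    for _name, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# Hypothetical layer settings (assumptions, not taken from the paper).
CONV = ("conv", 3, 1, 1)          # 3x3 convolution, stride 1, padding 1
POOL = ("pool", 2, 2, 0)          # 2x2 pooling, stride 2: non-overlapping
POOL_OVERLAP = ("pool", 3, 2, 0)  # 3x3 pooling, stride 2: windows overlap

alternating = [CONV, POOL, CONV, POOL, CONV, POOL]
fewer_pools = [CONV, CONV, CONV, POOL]

print(trace(32, alternating))   # [32, 32, 16, 16, 8, 8, 4]
print(trace(32, fewer_pools))   # [32, 32, 32, 32, 16]
print(out_size(32, *POOL_OVERLAP[1:]))  # 15
```

With a 32 x 32 input, the alternating arrangement reduces the map to 4 x 4 before the fully connected layers, whereas the arrangement with a single pooling layer keeps a 16 x 16 map. A pooling layer is non-overlapping when its stride equals its kernel size, so each input position falls in exactly one pooling window.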

Author information

Corresponding author

Correspondence to Zaidah Ibrahim.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ibrahim, Z., Isa, D., Idrus, Z., Kasiran, Z., Roslan, R. (2019). Evaluation of Pooling Layers in Convolutional Neural Network for Script Recognition. In: Berry, M., Yap, B., Mohamed, A., Köppen, M. (eds) Soft Computing in Data Science. SCDS 2019. Communications in Computer and Information Science, vol 1100. Springer, Singapore. https://doi.org/10.1007/978-981-15-0399-3_10

  • DOI: https://doi.org/10.1007/978-981-15-0399-3_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0398-6

  • Online ISBN: 978-981-15-0399-3

  • eBook Packages: Computer Science (R0)
