Text detection and localization in scene images: a broad review

Artificial Intelligence Review

Abstract

Text detection and localization have gained considerable attention in text analysis systems because they pave the way for a number of real-time applications, such as mobile transliteration technologies and assistive methods for visually impaired persons. Text detection and localization techniques are used to find the position of text regions in an image. This paper presents a broad review of the field in five parts: (1) a comparison of document images with scene images and the applications of natural scene images, (2) significant and up-to-date traditional machine learning and deep learning-based approaches to text detection and localization for different languages, (3) various publicly available benchmark datasets, (4) a comparative analysis on other benchmark datasets, and (5) related challenges and the future scope of the field. The paper summarises some of the potential directions in this field, which can serve as a useful reference for researchers exploring the area further.





Author information

Corresponding author

Correspondence to Shilpa Mahajan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Mahajan, S., Rani, R. Text detection and localization in scene images: a broad review. Artif Intell Rev 54, 4317–4377 (2021). https://doi.org/10.1007/s10462-021-10000-8


  • DOI: https://doi.org/10.1007/s10462-021-10000-8
