Abstract
Text detection and localization have become prominent topics in text analysis systems because they enable a number of real-time applications, such as mobile transliteration technologies and assistive methods for visually impaired persons. Text detection and localization techniques find the position of text regions in an image. This paper presents a broad review of the field in five parts: (1) a comparison of document images with scene images, along with applications of natural scene images; (2) significant and up-to-date traditional machine learning and deep learning-based approaches to text detection and localization for different languages; (3) various publicly available benchmark datasets; (4) a comparative analysis across benchmark datasets; and (5) related challenges and the future scope of the field. The paper summarises some promising directions in this field, which can serve as a useful reference for researchers exploring the area further.
Cite this article
Mahajan, S., Rani, R. Text detection and localization in scene images: a broad review. Artif Intell Rev 54, 4317–4377 (2021). https://doi.org/10.1007/s10462-021-10000-8
DOI: https://doi.org/10.1007/s10462-021-10000-8