Abstract.
Real-world text on street signs, nameplates, etc. often lies in an oblique plane and hence cannot be recognized by traditional OCR systems due to perspective distortion. Furthermore, such text often comprises only one or two lines, preventing the use of existing perspective rectification methods that were primarily designed for images of document pages. We propose an approach that reliably rectifies and subsequently recognizes individual lines of text. Our system, which includes novel algorithms for extraction of text from real-world scenery, perspective rectification, and binarization, has been rigorously tested on still imagery as well as on MPEG-2 video clips in real time.
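To make the notion of perspective rectification concrete, the sketch below maps the four corners of an obliquely viewed text line onto an upright rectangle using a planar homography. It is an illustration only and should not be read as the rectification algorithm proposed in the article, which the abstract does not describe in detail. The corner coordinates, the 200 x 40 target rectangle, and the helper names `solve_linear`, `homography`, and `apply_h` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without modifying the inputs.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 homography (with h33 = 1) mapping four src points to dst."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), rearranged to be linear in h.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h4 x + h5 y + h6) / (h7 x + h8 y + 1), rearranged likewise.
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h: List[List[float]], p: Point) -> Point:
    """Apply homography h to point p, including the projective division."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical corners of a distorted text line (clockwise from top-left)
# and the upright rectangle they should map to.
quad = [(10.0, 20.0), (210.0, 60.0), (205.0, 110.0), (12.0, 58.0)]
rect = [(0.0, 0.0), (200.0, 0.0), (200.0, 40.0), (0.0, 40.0)]
H = homography(quad, rect)
```

Calling `apply_h(H, p)` for each corner in `quad` returns the matching corner of `rect` up to floating-point error. In a complete system the inverse mapping would be used to resample every pixel of the rectified text line, and the corners themselves would have to be estimated from the image rather than supplied by hand.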
Author information
Corresponding author: Gregory K. Myers
Additional information
Received: 15 December 2003, Published online: 14 December 2004
About this article
Cite this article
Myers, G.K., Bolles, R.C., Luong, QT. et al. Rectification and recognition of text in 3-D scenes. IJDAR 7, 147–158 (2005). https://doi.org/10.1007/s10032-004-0133-4
DOI: https://doi.org/10.1007/s10032-004-0133-4