Rectification and recognition of text in 3-D scenes

Gregory K. Myers¹,
Robert C. Bolles¹,
Quang-Tuan Luong¹,
James A. Herson¹ &
Hrishikesh B. Aradhye¹

Abstract.

Real-world text on street signs, nameplates, etc. often lies in an oblique plane and hence cannot be recognized by traditional OCR systems due to perspective distortion. Furthermore, such text often comprises only one or two lines, preventing the use of existing perspective rectification methods that were primarily designed for images of document pages. We propose an approach that reliably rectifies and subsequently recognizes individual lines of text. Our system, which includes novel algorithms for extraction of text from real-world scenery, perspective rectification, and binarization, has been rigorously tested on still imagery as well as on MPEG-2 video clips in real time.
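To make the notion of perspective rectification concrete, here is a minimal sketch in pure Python. It is not the authors' algorithm, which this abstract does not specify. It assumes the four corners of a text line's bounding quadrilateral are already known, and it solves for the 3x3 homography that maps them onto an upright rectangle. The function names (`solve_linear`, `homography_from_quad`, `apply_homography`) and the corner coordinates in the usage note are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_quad(src: List[Point], dst: List[Point]) -> List[List[float]]:
    """Return the 3x3 homography (with h33 fixed to 1) mapping src[i] to dst[i].

    Each correspondence (x, y) -> (u, v) contributes two linear equations:
        h11*x + h12*y + h13 - u*(h31*x + h32*y) = u
        h21*x + h22*y + h23 - v*(h31*x + h32*y) = v
    Four correspondences give the 8 equations needed for the 8 unknowns.
    """
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map point p through h, dividing by the projective coordinate w."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)
```

For example, calling `homography_from_quad` with the hypothetical trapezoid corners `[(0, 0), (100, 10), (100, 40), (0, 50)]` as `src` and the rectangle corners `[(0, 0), (200, 0), (200, 50), (0, 50)]` as `dst` yields a matrix that `apply_homography` can use to send any pixel of the skewed text line to its fronto-parallel position. In practice the hard part, which this sketch leaves out, is estimating the quadrilateral from the image in the first place.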

Author information

Authors and Affiliations

SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA
Gregory K. Myers, Robert C. Bolles, Quang-Tuan Luong, James A. Herson & Hrishikesh B. Aradhye

Corresponding author

Correspondence to Gregory K. Myers.

Additional information

Received: 15 December 2003, Published online: 14 December 2004

About this article

Cite this article

Myers, G.K., Bolles, R.C., Luong, QT. et al. Rectification and recognition of text in 3-D scenes. IJDAR 7, 147–158 (2005). https://doi.org/10.1007/s10032-004-0133-4

Issue Date: July 2005
DOI: https://doi.org/10.1007/s10032-004-0133-4
