A Complete System for Detection and Identification of Tabular Structures from Document Images

  • Conference paper
Image Analysis and Recognition (ICIAR 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3212)

Abstract

Detecting and identifying tabular structures in document images is crucial to any document image analysis (DIA) and digital library system. In this paper we report a generic approach that detects, without any OCR, tabular structures in whatever form they appear in a document: tables, tabular displayed math, table of contents pages, and index pages.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mandal, S., Chowdhury, S.P., Das, A.K., Chanda, B. (2004). A Complete System for Detection and Identification of Tabular Structures from Document Images. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_27

  • DOI: https://doi.org/10.1007/978-3-540-30126-4_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23240-7

  • Online ISBN: 978-3-540-30126-4

  • eBook Packages: Springer Book Archive
