A Complete System for Detection and Identification of Tabular Structures from Document Images

S. Mandal, S. P. Chowdhury, A. K. Das & Bhabatosh Chanda

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3212)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

Detecting and identifying tabular structures in a document image is crucial to any document image analysis (DIA) and digital library system. In this paper we report a generic approach that detects, without any OCR, tabular structures in whatever form they appear in a document: tables, tabular displayed math, table-of-contents pages, and index pages.

Author information

Authors and Affiliations

CST Department, B. E. College, Howrah, 711 103, India
S. Mandal, S. P. Chowdhury & A. K. Das
ECSU, Indian Statistical Institute, Kolkata, 700 035, India
Bhabatosh Chanda

Editor information

Editors and Affiliations

FEUP - Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
Aurélio Campilho
Electrical and Computer Engineering Department, University of Waterloo,
Mohamed Kamel

Cite this paper

Mandal, S., Chowdhury, S.P., Das, A.K., Chanda, B. (2004). A Complete System for Detection and Identification of Tabular Structures from Document Images. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_27

DOI: https://doi.org/10.1007/978-3-540-30126-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23240-7
Online ISBN: 978-3-540-30126-4
eBook Packages: Springer Book Archive
