Abstract
Detecting and identifying tabular structures in document images is crucial to any document image analysis (DIA) and digital library system. In this paper we report a generic approach that detects tabular structures in any of the forms in which they may appear in a document (tables, tabular displayed math, table of contents pages, and index pages) without using OCR.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mandal, S., Chowdhury, S.P., Das, A.K., Chanda, B. (2004). A Complete System for Detection and Identification of Tabular Structures from Document Images. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_27
DOI: https://doi.org/10.1007/978-3-540-30126-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23240-7
Online ISBN: 978-3-540-30126-4
eBook Packages: Springer Book Archive