Abstract
Automatic layout analysis is an essential step in digitizing large document collections. In this paper we present a mixed approach to layout analysis that combines an SVM-aided layout segmentation process with a classification process based on local and geometrical features. The final output of the analysis is a complete, structured annotation in JSON format that contains the digitized text together with references to all the illustrations on the input page, and that can be consumed by both visualization and annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.
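To make the shape of such an output concrete, the following is a minimal sketch of how a per-page JSON annotation could be assembled. The abstract does not specify the schema, so every name here (`make_annotation`, the `page`, `blocks`, `type`, `bbox`, and `text` fields, and the sample regions) is a hypothetical assumption for illustration, not the format used by the authors.

```python
import json


def make_annotation(page_id, width, height, regions):
    """Build a JSON-serializable page annotation (hypothetical schema)."""
    blocks = []
    for i, region in enumerate(regions):
        block = {
            "id": i,
            "type": region["type"],        # assumed labels: "text" or "illustration"
            "bbox": list(region["bbox"]),  # assumed layout: [x, y, width, height] in pixels
        }
        # Only text blocks carry recognized text; illustrations are
        # referenced by their position on the page.
        if region["type"] == "text":
            block["text"] = region.get("text", "")
        blocks.append(block)
    return {"page": page_id, "width": width, "height": height, "blocks": blocks}


# Made-up regions standing in for the output of segmentation and classification.
regions = [
    {"type": "text", "bbox": (120, 80, 900, 400), "text": "Sample paragraph."},
    {"type": "illustration", "bbox": (120, 520, 900, 600)},
]
annotation = make_annotation("page_0001", 1200, 1800, regions)
encoded = json.dumps(annotation, ensure_ascii=False, indent=2)
```

Keeping text and illustration blocks in a single list with a `type` field is one possible design; it lets a viewer or annotation tool iterate over the page once and dispatch on the block type.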
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Corbelli, A., Baraldi, L., Balducci, F., Grana, C., Cucchiara, R. (2017). Layout Analysis and Content Classification in Digitized Books. In: Agosti, M., Bertini, M., Ferilli, S., Marinai, S., Orio, N. (eds) Digital Libraries and Multimedia Archives. IRCDL 2016. Communications in Computer and Information Science, vol 701. Springer, Cham. https://doi.org/10.1007/978-3-319-56300-8_14
DOI: https://doi.org/10.1007/978-3-319-56300-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56299-5
Online ISBN: 978-3-319-56300-8
eBook Packages: Computer Science (R0)