Abstract
We propose a Transformer-based approach for information extraction from digitized handwritten documents. Our approach combines, in a single model, the steps that have so far been performed by separate models: feature extraction, handwriting recognition, and named entity recognition. We compare this integrated approach with traditional two-stage methods that perform handwriting recognition before named entity recognition, and present results at three levels: line, paragraph, and page. Our experiments show that attention-based models are especially attractive when applied to full pages, as they do not require any prior segmentation step. Finally, we show that they are able to learn from key-value annotations: a list of important words with their corresponding named entities. We compare our models to state-of-the-art methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform previously reported results on all three datasets.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tarride, S., Boillet, M., Kermorvant, C. (2023). Key-Value Information Extraction from Full Handwritten Pages. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14188. Springer, Cham. https://doi.org/10.1007/978-3-031-41679-8_11
DOI: https://doi.org/10.1007/978-3-031-41679-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41678-1
Online ISBN: 978-3-031-41679-8
eBook Packages: Computer Science, Computer Science (R0)