DOI: 10.1145/3595916.3626365
short-paper
Open access

Few-Shot Learning for Word Recognition in Handwritten Seventeenth-Century Spanish American Notary Records

Published: 01 January 2024

Abstract

Historical records are invaluable sources of information that provide insights into multiple aspects of past events and societies. The analysis of historical records using deep learning poses critical challenges such as the lack of sufficient labeled data and at times the poor quality of scanned images. In this paper, we propose SpanishFSL, a few-shot learning (FSL) approach for word recognition in 17th-century handwritten Spanish American notary records. SpanishFSL draws inspiration from a zero-shot learning approach developed for image classification. It leverages an autoencoder to construct class-attribute signatures to effectively bridge the gap between seen and unseen classes. This enables SpanishFSL to generalize and accurately recognize words not present in the training set. Our labeled dataset was prepared by paleography experts using a subset of the notary records drafted by two notaries. Through experimental evaluation, we observed that SpanishFSL can outperform other FSL classifiers in terms of word recognition accuracy.
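The abstract does not name the zero-shot approach that SpanishFSL builds on, nor how the autoencoder produces the class-attribute signatures. As a rough illustration only, the sketch below assumes the closed-form bilinear formulation of Romera-Paredes and Torr (2015), in which a matrix V maps image features to scores against class signatures. The function names, the regularization parameters `gamma` and `lam`, and the choice to pass signatures in as a plain matrix are all assumptions made for this example; it is not the authors' implementation.

```python
import numpy as np


def fit_bilinear_map(X, Y, S, gamma=1.0, lam=1.0):
    """Closed-form fit of a bilinear compatibility matrix V.

    X: (d, m) feature vectors of m training word images (assumed precomputed).
    Y: (m, z) labels, +1 for the true class and -1 elsewhere.
    S: (a, z) one signature column per seen class (assumed given; how
       SpanishFSL derives them with an autoencoder is not reproduced here).
    Returns V with shape (d, a).
    """
    d, a = X.shape[0], S.shape[0]
    left = np.linalg.inv(X @ X.T + gamma * np.eye(d))
    right = np.linalg.inv(S @ S.T + lam * np.eye(a))
    return left @ X @ Y @ S.T @ right


def predict(V, S_candidates, x):
    """Return the index of the candidate class whose signature scores highest."""
    scores = x @ V @ S_candidates  # one score per candidate class
    return int(np.argmax(scores))
```

Under these assumptions, words absent from the training set would be handled by passing their signature columns as `S_candidates` while reusing the fitted V, which is the sense in which signatures bridge seen and unseen classes.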


    Published In

    MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
    December 2023
    745 pages
    ISBN:9798400702051
    DOI:10.1145/3595916
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Few-shot learning
    2. historical documents
    3. word recognition

    Qualifiers

    • Short-paper
    • Research
    • Refereed limited

    Conference

    MMAsia '23
    Sponsor:
    MMAsia '23: ACM Multimedia Asia
    December 6 - 8, 2023
    Tainan, Taiwan

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 180
    • Downloads (last 12 months): 180
    • Downloads (last 6 weeks): 31
    Reflects downloads up to 10 Nov 2024
