DOI: 10.1145/1891903.1891962
poster

Active learning strategies for handwritten text transcription

Published: 08 November 2010

Abstract

Active learning strategies are increasingly used in a variety of real-world tasks, though their application to handwritten text transcription in old manuscripts remains nearly unexplored. The basic idea is to follow a sequential, line-by-line transcription of the whole manuscript in which a continuously retrained system interacts with the user to efficiently transcribe each new line. This approach has recently been explored using a conventional strategy in which the user is asked to supervise only those words that are not recognized with high confidence. In this paper, the conventional strategy is improved by also letting the system recompute the most probable hypotheses under the constraints imposed by user supervisions. In particular, two strategies are studied, which differ in how often hypotheses are recomputed on the current line: after each user correction (iterative) or after all user corrections (delayed). Empirical results on two real tasks show that these strategies outperform the conventional approach.
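
The three supervision strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy recognizer, the confidence values, the 0.5 threshold, and all function names are assumptions made for illustration. The toy recognizer simply pretends that a user-confirmed word lets it correct the word immediately to its right, standing in for constrained re-decoding. The numbers it produces are illustrative only and are not results from the paper.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = List[Tuple[str, float]]                  # (word, confidence) per position
Recognizer = Callable[[Dict[int, str]], Hypothesis]   # user constraints -> hypothesis


def make_toy_recognizer(truth: List[str], base: Hypothesis) -> Recognizer:
    """Hypothetical recognizer: a confirmed word also fixes its right neighbour."""
    def recognize(fixed: Dict[int, str]) -> Hypothesis:
        hyp = []
        for i, (word, conf) in enumerate(base):
            if i in fixed:
                hyp.append((fixed[i], 1.0))       # word supplied by the user
            elif i - 1 in fixed:
                hyp.append((truth[i], 0.9))       # context repairs this word
            else:
                hyp.append((word, conf))          # unconstrained guess
        return hyp
    return recognize


def conventional(recognize: Recognizer, truth: List[str], threshold: float):
    """Supervise low-confidence words; never recompute the hypothesis."""
    hyp = recognize({})
    low = {i for i, (_, c) in enumerate(hyp) if c < threshold}
    words = [truth[i] if i in low else w for i, (w, _) in enumerate(hyp)]
    return words, len(low)


def delayed(recognize: Recognizer, truth: List[str], threshold: float):
    """Collect all corrections on the line, then recompute once."""
    hyp = recognize({})
    fixed = {i: truth[i] for i, (_, c) in enumerate(hyp) if c < threshold}
    return [w for w, _ in recognize(fixed)], len(fixed)


def iterative(recognize: Recognizer, truth: List[str], threshold: float):
    """Recompute the hypothesis after every single correction."""
    fixed: Dict[int, str] = {}
    hyp = recognize(fixed)
    while True:
        pending = [i for i, (_, c) in enumerate(hyp)
                   if i not in fixed and c < threshold]
        if not pending:
            break
        fixed[pending[0]] = truth[pending[0]]     # user corrects leftmost doubtful word
        hyp = recognize(fixed)
    return [w for w, _ in hyp], len(fixed)


# Made-up line: three low-confidence errors and one high-confidence error.
truth = ["the", "old", "manuscript", "was", "found", "here"]
base = [("the", 0.95), ("odd", 0.40), ("manuscripts", 0.45),
        ("was", 0.90), ("fund", 0.30), ("hear", 0.70)]
rec = make_toy_recognizer(truth, base)

for strategy in (conventional, delayed, iterative):
    words, supervised = strategy(rec, truth, 0.5)
    errors = sum(w != t for w, t in zip(words, truth))
    print(f"{strategy.__name__}: supervised={supervised}, errors={errors}")
```

In this toy setting the conventional strategy leaves the confidently wrong word uncorrected, while both recomputation strategies repair it, and the iterative one asks for fewer corrections because each recomputation can raise the confidence of a neighbouring word. How large these effects are on real manuscripts depends entirely on the recognizer and the data.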

Published In

ICMI-MLMI '10: International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction
November 2010
311 pages
ISBN: 9781450304146
DOI: 10.1145/1891903

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computer-assisted text transcription
  2. confidence measures
  3. document analysis
  4. handwriting recognition

Qualifiers

  • Poster

Conference

ICMI-MLMI '10

Acceptance Rates

ICMI-MLMI '10 paper acceptance rate: 41 of 100 submissions (41%)
Overall acceptance rate: 453 of 1,080 submissions (42%)

Cited By

  • (2019) Making Large Collections of Handwritten Material Easily Accessible and Searchable. Digital Libraries: Supporting Open Science, pages 18-28. DOI: 10.1007/978-3-030-11226-4_2. Online publication date: 15-Jan-2019
  • (2018) From HMMs to RNNs: Computer-Assisted Transcription of a Handwritten Notarial Records Collection. 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), pages 116-121. DOI: 10.1109/ICFHR-2018.2018.00029. Online publication date: Aug-2018
  • (2017) TexT - Text Extractor Tool for Handwritten Document Transcription and Annotation. Digital Libraries and Multimedia Archives, pages 81-92. DOI: 10.1007/978-3-319-73165-0_8. Online publication date: 21-Dec-2017
  • (2014) Effective balancing error and user effort in interactive handwriting recognition. Pattern Recognition Letters 37, pages 135-142. DOI: 10.1016/j.patrec.2013.03.010. Online publication date: 1-Feb-2014
  • (2014) Interactive handwriting recognition with limited user effort. International Journal on Document Analysis and Recognition 17:1, pages 47-59. DOI: 10.1007/s10032-013-0204-5. Online publication date: 1-Mar-2014
  • (2012) Transcribing handwritten text images with a word soup game. CHI '12 Extended Abstracts on Human Factors in Computing Systems, pages 2273-2278. DOI: 10.1145/2212776.2223788. Online publication date: 5-May-2012
  • (2012) Character-Based Handwritten Text Recognition of Multilingual Documents. Advances in Speech and Language Technologies for Iberian Languages, pages 187-196. DOI: 10.1007/978-3-642-35292-8_20. Online publication date: 2012
  • (2011) Language identification for interactive handwriting transcription of multilingual documents. Proceedings of the 5th Iberian conference on Pattern recognition and image analysis, pages 596-603. DOI: 10.5555/2021341.2021423. Online publication date: 8-Jun-2011
  • (2011) Language Identification for Interactive Handwriting Transcription of Multilingual Documents. Pattern Recognition and Image Analysis, pages 596-603. DOI: 10.1007/978-3-642-21257-4_74. Online publication date: 2011
