DOI: 10.1145/3476887
HIP '21: Proceedings of the 6th International Workshop on Historical Document Imaging and Processing
ACM 2021 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
HIP '21: The 6th International Workshop on Historical Document Imaging and Processing, Lausanne, Switzerland, September 5-6, 2021
ISBN:
978-1-4503-8690-6
Published:
31 October 2021

Abstract

No abstract available.

SESSION: Session 1: Optical Character Recognition
research-article
Optical Character Recognition of 19th Century Classical Commentaries: the Current State of Affairs

Together with critical editions and translations, commentaries are one of the main genres of publication in literary and textual scholarship, and have a century-long tradition. Yet, the exploitation of thousands of digitized historical commentaries was ...

research-article
Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning

In order to apply Optical Character Recognition (OCR) to historical printings of Latin script fully automatically, we report on our efforts to construct a widely-applicable polyfont recognition model yielding text with a Character Error Rate (CER) ...

research-article
A survey of OCR evaluation tools and metrics

The millions of pages of historical documents that are digitized in libraries are increasingly used in contexts that have more specific requirements for OCR quality than keyword search. How to comprehensively, efficiently and reliably assess the ...

SESSION: Session 2: Segmentation
research-article
Segmentation of historical maps without annotated data

This paper presents the method which we submitted to the competition of Historical Map Segmentation, in ICDAR’21. The goal is to segment document images of Paris maps from the beginning of the 20th century: delineate the content of the map and locate ...

research-article
Text Detection and Recognition by using CNNs in the Austro-Hungarian Historical Military Mapping Survey

Historical maps include precious data about historical, geographical and economic perspectives of a period. However, several unique challenges and opportunities accompany historical maps compared to modern maps, such as low-quality images, degraded ...

research-article
Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers

The segmentation of complex images into semantic regions has seen a growing interest these last years with the advent of Deep Learning. Until recently, most existing methods for Historical Document Analysis focused on the visual appearance of documents, ...

SESSION: Session 3: Datasets & Databases
research-article
The BIR database – Identifying typographic emphasis in list-like historical documents

Layout analysis and optical character recognition have become traditional tasks for processing historical prints, but are now insufficient. Additional information is found in typographic emphasis, such as bold and italic letters. They carry semantic ...

research-article
Digital Peter: New Dataset, Competition and Handwriting Recognition Methods

This paper presents a new dataset of Peter the Great’s manuscripts and describes a segmentation procedure that converts initial images of documents into lines. This new dataset may be useful for researchers to train handwriting text recognition models ...

research-article
GloSAT Historical Measurement Table Dataset: Enhanced Table Structure Recognition Annotation for Downstream Historical Data Rescue

Understanding and extracting tables from documents is a research problem that has been studied for decades. Table structure recognition is the labelling of components within a detected table, which can be detected automatically or manually provided. ...

SESSION: Session 4: Methods & Models
research-article
Generalized Template Matching for Semi-structured Text

Conventional template matching for named entity recognition on book-length text strings is generalized by allowing search phrases to capture distant tokens. Combined with word-type tagging and format variants (alternative name/date formats), a few ...

research-article
BiblIA - a General Model for Medieval Hebrew Manuscripts and an Open Annotated Dataset

The paper presents Open Source generalized models for recognition and page segmentation, intended for use on the eScriptorium platform or kraken OCR engine, of Medieval Hebrew manuscripts in square script that arrive at a character accuracy of more ...

research-article
Visual Analysis of Chapbooks Printed in Scotland

Chapbooks were short, cheap printed booklets produced in large quantities in Scotland, England, Ireland, North America and much of Europe between roughly the seventeenth and nineteenth centuries. A form of popular literature containing songs, stories, ...

Acceptance Rates

Overall Acceptance Rate 52 of 90 submissions, 58%
Year      Submitted  Accepted  Rate
HIP '19   26         15        58%
HIP '17   33         19        58%
HIP '13   31         18        58%
Overall   90         52        58%