Proceedings of the 6th International Workshop on Historical Document Imaging and Processing

September 2021

2021 Proceeding

Publisher:

Association for Computing Machinery, New York, NY, United States

Conference:

HIP '21: The 6th International Workshop on Historical Document Imaging and Processing, Lausanne, Switzerland, September 5-6, 2021

ISBN:

978-1-4503-8690-6

Published:

31 October 2021

Bibliometrics (reflects downloads up to 09 Feb 2025)

Citation Count

Downloads (6 weeks)

Downloads (12 months): 552

Downloads (cumulative): 2,102

Proceeding Downloads

Front matter (Welcome to HIP'21!, People, Sponsored by, Acknowledgements)

Session 1: Optical Character Recognition

research-article

Optical Character Recognition of 19th Century Classical Commentaries: the Current State of Affairs

Matteo Romanello, Sven Najem-Meyer, Bruce Robertson

Pages 1–6, https://doi.org/10.1145/3476887.3476911

Together with critical editions and translations, commentaries are one of the main genres of publication in literary and textual scholarship, and have a century-long tradition. Yet, the exploitation of thousands of digitized historical commentaries was ...

Metrics
Total Citations: 7
Total Downloads: 193
Last 12 Months: 44
Last 6 weeks: 5

research-article

Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning

Christian Reul, Christoph Wick, Maximilian Noeth, Andreas Buettner, Maximilian Wehner, Uwe Springmann

Pages 7–12, https://doi.org/10.1145/3476887.3476910

In order to apply Optical Character Recognition (OCR) to historical printings of Latin script fully automatically, we report on our efforts to construct a widely-applicable polyfont recognition model yielding text with a Character Error Rate (CER) ...

Metrics
Total Citations: 4
Total Downloads: 135
Last 12 Months: 20
Last 6 weeks: 3

research-article

A survey of OCR evaluation tools and metrics

Clemens Neudecker, Konstantin Baierer, Mike Gerber, Christian Clausner, Apostolos Antonacopoulos, Stefan Pletschacher

Pages 13–18, https://doi.org/10.1145/3476887.3476888

The millions of pages of historical documents that are digitized in libraries are increasingly used in contexts that have more specific requirements for OCR quality than keyword search. How to comprehensively, efficiently and reliably assess the ...

Metrics
Total Citations: 27
Total Downloads: 867
Last 12 Months: 332
Last 6 weeks: 35

Session 2: Segmentation

research-article

Segmentation of historical maps without annotated data

Aurelie Lemaitre, Jean Camillerapp

Pages 19–24, https://doi.org/10.1145/3476887.3476909

This paper presents the method which we submitted to the competition of Historical Map Segmentation, in ICDAR’21. The goal is to segment document images of Paris maps from the beginning of the 20th century: delineate the content of the map and locate ...

Metrics
Total Citations: 3
Total Downloads: 79
Last 12 Months: 7
Last 6 weeks: 0

research-article

Text Detection and Recognition by using CNNs in the Austro-Hungarian Historical Military Mapping Survey

Yekta Said Can, Mustafa Erdem Kabadayi

Pages 25–30, https://doi.org/10.1145/3476887.3476904

Historical maps include precious data about historical, geographical and economic perspectives of a period. However, several unique challenges and opportunities accompany historical maps compared to modern maps, such as low-quality images, degraded ...

Metrics
Total Citations: 3
Total Downloads: 100
Last 12 Months: 12
Last 6 weeks: 0

research-article

Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers

Melodie Boillet, Martin Maarand, Thierry Paquet, Christopher Kermorvant

Pages 31–36, https://doi.org/10.1145/3476887.3476905

The segmentation of complex images into semantic regions has seen a growing interest these last years with the advent of Deep Learning. Until recently, most existing methods for Historical Document Analysis focused on the visual appearance of documents, ...

Metrics
Total Citations: 2
Total Downloads: 45
Last 12 Months: 4
Last 6 weeks: 0

Session 3: Datasets & Databases

research-article

The BIR database – Identifying typographic emphasis in list-like historical documents

Anna Scius-Bertrand, Simon Gabay, Juliette Janes, Ljudmila Petkovic, Caroline Corbieres, Thibault Clerice

Pages 37–42, https://doi.org/10.1145/3476887.3476913

Layout analysis and optical character recognition have become traditional tasks for processing historical prints, but are now insufficient. Additional information is found in typographic emphasis, such as bold and italic letters. They carry semantic ...

Metrics
Total Citations: 1
Total Downloads: 73
Last 12 Months: 7
Last 6 weeks: 2

research-article

Digital Peter: New Dataset, Competition and Handwriting Recognition Methods

Mark Potanin, Denis Dimitrov, Alex Shonenkov, Vladimir Bataev, Denis Karachev, Maxim Novopoltsev, Andrey Chertok

Pages 43–48, https://doi.org/10.1145/3476887.3476892

This paper presents a new dataset of Peter the Great’s manuscripts and describes a segmentation procedure that converts initial images of documents into lines. This new dataset may be useful for researchers to train handwriting text recognition models ...

Metrics
Total Citations: 7
Total Downloads: 142
Last 12 Months: 22
Last 6 weeks: 6

research-article

GloSAT Historical Measurement Table Dataset: Enhanced Table Structure Recognition Annotation for Downstream Historical Data Rescue

Juliusz Ziomek, Stuart E. Middleton

Pages 49–54, https://doi.org/10.1145/3476887.3476890

Understanding and extracting tables from documents is a research problem that has been studied for decades. Table structure recognition is the labelling of components within a detected table, which can be detected automatically or manually provided. ...

Metrics
Total Citations: 1
Total Downloads: 116
Last 12 Months: 18
Last 6 weeks: 3

Session 4: Methods & Models

research-article

Generalized Template Matching for Semi-structured Text

George Nagy

Pages 55–60, https://doi.org/10.1145/3476887.3476895

Conventional template matching for named entity recognition on book-length text strings is generalized by allowing search phrases to capture distant tokens. Combined with word-type tagging and format variants (alternative name/date formats), a few ...

Metrics
Total Citations: 1
Total Downloads: 51
Last 12 Months: 6
Last 6 weeks: 1

research-article

BiblIA - a General Model for Medieval Hebrew Manuscripts and an Open Annotated Dataset

Daniel Stoekl Ben Ezra, Bronson Brown-DeVost, Pawel Jablonski, Hayim Lapin, Benjamin Kiessling, Elena Lolli

Pages 61–66, https://doi.org/10.1145/3476887.3476896

The paper presents Open Source generalized models for recognition and page segmentation, intended for use on the eScriptorium platform or kraken OCR engine, of Medieval Hebrew manuscripts in square script that arrive at a character accuracy of more ...

Metrics
Total Citations: 4
Total Downloads: 170
Last 12 Months: 50
Last 6 weeks: 17

research-article

Visual Analysis of Chapbooks Printed in Scotland

Abhishek Dutta, Giles Bergel, Andrew Zisserman

Pages 67–72, https://doi.org/10.1145/3476887.3476893

Chapbooks were short, cheap printed booklets produced in large quantities in Scotland, England, Ireland, North America and much of Europe between roughly the seventeenth and nineteenth centuries. A form of popular literature containing songs, stories, ...

Metrics
Total Citations: 7
Total Downloads: 131
Last 12 Months: 30
Last 6 weeks: 0

Acceptance Rates

Overall Acceptance Rate 52 of 90 submissions, 58%

Year	Submitted	Accepted	Rate
HIP '19	26	15	58%
HIP '17	33	19	58%
HIP '13	31	18	58%
Overall	90	52	58%
