HTR for Greek Historical Handwritten Documents
Figure 1. Schematic diagram of the proposed architecture, consisting of (a) the CNN stage and (b) the recurrent stage.
Figure 2. An example text line image (a) before and (b) after the preprocessing stage.
Figure 3. Feature maps produced by each layer of the Octave-CNN model.
Figure 4. An illustration of the gated recurrent unit (GRU) [8].
Figure 5. Example document image from the collection χϕ53.
Figure 6. Example document image from the collection χϕ79.
Figure 7. Example document image from the collection χϕ114.
Figure 8. Floating characters appearing at word endings. The floating portion of the word is marked by a rectangle, while the rest of the word is underlined.
Figure 9. ‘Minuscule’ writing example. Key locations in the text line that correspond to this particularity are underlined.
Figure 10. Polytonic orthography example.
Figure 11. An example of a correctly predicted text line image along with the corresponding (a) ground-truth and (b) predicted texts.
Figure 12. An example of a problematic text line image along with the corresponding (a) ground-truth and (b) predicted texts. The errors concern diacritics (circle), spacing (red line) and abbreviations (square).
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. Preprocessing
3.2. Octave-CNN Architecture
3.3. Recurrent BGRU Stage
4. Experimental Evaluation
4.1. Datasets
4.1.1. Stavronikita Monastery Collection No. 53 (χϕ53)
4.1.2. Stavronikita Monastery Collection No. 79 (χϕ79)
4.1.3. Stavronikita Monastery Collection No. 114 (χϕ114)
4.1.4. Historical Greek ‘EPARCHOS’ Dataset
4.1.5. IAM
4.1.6. RIMES
4.2. Particularities
- Floating characters: A common characteristic of word endings, where the last characters of a word are written in an abbreviated manner. Floating characters can appear in both uninflected and inflected words. Two examples are shown in Figure 8.
- Minuscule writing: A notable distinction is the use of a lowercase rather than an uppercase letter following a full stop, as shown in the example of Figure 9. This is because no capital letters were in use at the time; this style of writing is known as ‘Minuscule’.
- Polytonic orthography: The polytonic system is common to all Byzantine manuscripts, as illustrated in the example document images in Figures 5, 6 and 7. A particularity of this system is the Greek characters ϊ and ϋ, which were written in this form to distinguish them from the letters of a diphthong, as shown in Figure 10. The difficulty with these characters lies in their transcription, which is not unique but depends on the context: the character is transcribed either as shown or as the same letter without the diaeresis.
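The transcription ambiguity described for ϊ and ϋ can be illustrated with a small sketch. It is not part of the authors' pipeline: the helper names `strip_diaeresis` and `transcription_variants` are hypothetical, and the sketch assumes transcriptions are stored as Unicode strings. It uses Python's standard `unicodedata` module to decompose each character, drop only the combining diaeresis (U+0308) and recompose, so that other polytonic marks are left in place.

```python
import unicodedata

# Combining diaeresis; the only mark this sketch removes.
DIAERESIS = "\u0308"


def strip_diaeresis(text: str) -> str:
    """Return text without diaeresis marks, keeping all other diacritics.

    Hypothetical helper: decomposes to NFD, filters U+0308, recomposes to NFC.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch != DIAERESIS)
    return unicodedata.normalize("NFC", kept)


def transcription_variants(text: str) -> set:
    """Return the two candidate transcriptions: as written and without diaeresis."""
    as_written = unicodedata.normalize("NFC", text)
    return {as_written, strip_diaeresis(text)}


if __name__ == "__main__":
    for variant in sorted(transcription_variants("\u03ca\u03cb")):
        print(variant)
```

One possible use, under the same assumptions, is to accept either variant as correct when comparing a predicted line against its ground truth, so that context-dependent diaeresis choices are not counted as recognition errors.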
4.3. Experimental Setup
4.4. Results
4.5. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- De Sousa Neto, A.F.; Bezerra, B.L.D.; Toselli, A.H.; Lima, E.B. HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models. In Proceedings of the DocEng ’20: ACM Symposium on Document Engineering 2020, Virtual Event, 29 September–1 October 2020; pp. 17:1–17:4.
- Puigcerver, J. Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition? In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, Kyoto, Japan, 9–15 November 2017; pp. 67–72.
- Chen, Y.; Fan, H.; Xu, B.; Yan, Z.; Kalantidis, Y.; Rohrbach, M.; Yan, S.; Feng, J. Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 3434–3443.
- Markou, K.; Tsochatzidis, L.T.; Zagoris, K.; Papazoglou, A.; Karagiannis, X.; Symeonidis, S.; Pratikakis, I. A Convolutional Recurrent Neural Network for the Handwritten Text Recognition of Historical Greek Manuscripts. In Proceedings of the Pattern Recognition, ICPR International Workshops and Challenges, Virtual Event, 10–15 January 2021; Volume 12667, pp. 249–262.
- Marti, U.; Bunke, H. The IAM-database: An English sentence database for offline handwriting recognition. Int. J. Doc. Anal. Recognit. 2002, 5, 39–46.
- Grosicki, E.; Carré, M.; Brodin, J.; Geoffrois, E. Results of the RIMES Evaluation Campaign for Handwritten Mail Processing. In Proceedings of the 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 26–29 July 2009; pp. 941–945.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
- Voigtlaender, P.; Doetsch, P.; Ney, H. Handwriting Recognition with Large Multidimensional Long Short-Term Memory Recurrent Neural Networks. In Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition, Shenzhen, China, 23–26 October 2016; pp. 228–233.
- Graves, A.; Schmidhuber, J. Offline handwriting recognition with multidimensional recurrent neural networks. Adv. Neural Inf. Process. Syst. 2008, 21, 545–552.
- Pham, V.; Bluche, T.; Kermorvant, C.; Louradour, J. Dropout Improves Recurrent Neural Networks for Handwriting Recognition. In Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition, Crete, Greece, 1–4 September 2014; pp. 285–290.
- Bluche, T.; Messina, R.O. Gated Convolutional Recurrent Neural Networks for Multilingual Handwriting Recognition. In Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, Kyoto, Japan, 9–15 November 2017; pp. 646–651.
- Dauphin, Y.N.; Fan, A.; Auli, M.; Grangier, D. Language modeling with gated convolutional networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 933–941.
- De Sousa Neto, A.F.; Bezerra, B.L.D.; Toselli, A.H.; Lima, E.B. HTR-Flor: A Deep Learning System for Offline Handwritten Text Recognition. In Proceedings of the 33rd SIBGRAPI Conference on Graphics, Patterns and Images, Recife/Porto de Galinhas, Brazil, 7–10 November 2020; pp. 54–61.
- Retsinas, G.; Sfikas, G.; Nikou, C. Iterative Weighted Transductive Learning for Handwriting Recognition. In Proceedings of the 16th International Conference on Document Analysis and Recognition, Lausanne, Switzerland, 5–10 September 2021; Volume 12824, pp. 587–601.
- Kang, L.; Riba, P.; Rusiñol, M.; Fornés, A.; Villegas, M. Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition. arXiv 2020, arXiv:2005.13044.
- Wick, C.; Zöllner, J.; Grüning, T. Transformer for Handwritten Text Recognition Using Bidirectional Post-decoding. In Proceedings of the 16th International Conference on Document Analysis and Recognition, Lausanne, Switzerland, 5–10 September 2021; Volume 12823, pp. 112–126.
- Wick, C.; Zöllner, J.; Grüning, T. Rescoring Sequence-to-Sequence Models for Text Line Recognition with CTC-Prefixes. arXiv 2021, arXiv:2110.05909.
- Chen, K.; Chen, C.; Chang, C. Efficient illumination compensation techniques for text images. Digit. Signal Process. 2012, 22, 726–733.
- Vinciarelli, A.; Luettin, J. A new normalization technique for cursive handwritten words. Pattern Recognit. Lett. 2001, 22, 1043–1050.
- Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 448–456.
- Maas, A.L.; Hannun, A.Y.; Ng, A.Y. Rectifier Nonlinearities Improve Neural Network Acoustic Models. 2013. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.693.1422&rep=rep1&type=pdf (accessed on 25 October 2021).
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Papazoglou, A.; Pratikakis, I.; Markou, K.; Tsochatzidis, L. EPARCHOS - Historical Greek Handwritten Document Dataset. (1.0) [Data Set]. Zenodo. 2020. Available online: https://zenodo.org/record/4095301#.YaneeroxVPY (accessed on 25 October 2021).
- Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; Tsochatzidis, L. Stavronikita Monastery Greek Handwritten Document Collection No. 53. (1.0) [Data Set]. Zenodo. 2021. Available online: https://zenodo.org/record/5595669#.YaneoLoxVPY (accessed on 25 October 2021).
- Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; Tsochatzidis, L. Stavronikita Monastery Greek Handwritten Document Collection No. 79. (1.0) [Data Set]. Zenodo. 2021. Available online: https://zenodo.org/record/5578136#.YaneuboxVPY (accessed on 25 October 2021).
- Pratikakis, I.; Papazoglou, A.; Symeonidis, S.; Tsochatzidis, L. Stavronikita Monastery Greek Handwritten Document Collection No. 114. (1.0) [Data Set]. Zenodo. 2021. Available online: https://zenodo.org/record/5578251#.YaneyboxVPY (accessed on 25 October 2021).
- Tieleman, T.; Hinton, G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. Coursera Neural Netw. Mach. Learn. 2012, 4, 26–31.
Dataset | Total Pages | Total Lines | Total Words | Training Lines | Validation Lines | Test Lines |
---|---|---|---|---|---|---|
53 | 54 | 1038 | 5592 | 622 | 104 | 312 |
79 | 40 | 803 | 4389 | 481 | 80 | 242 |
114 | 44 | 1051 | 5467 | 603 | 100 | 302 |
Eparchos | 120 | 2272 | 18,809 | 1363 | 227 | 682 |
IAM | 1539 | 8922 | 10,841 | 6161 | 900 | 1861 |
RIMES | 1500 | 12,104 | 6358 | 10,193 | 1133 | 778 |
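The partition columns of the table above can be cross-checked against the line totals with a short sketch. It assumes that the Training, Validation and Test columns count text lines (the table does not state the unit explicitly); the name `split_report` is hypothetical, and the figures are copied from the table as printed.

```python
# (total_lines, training, validation, test), copied from the table as printed.
SPLITS = {
    "53": (1038, 622, 104, 312),
    "79": (803, 481, 80, 242),
    "114": (1051, 603, 100, 302),
    "Eparchos": (2272, 1363, 227, 682),
    "IAM": (8922, 6161, 900, 1861),
    "RIMES": (12104, 10193, 1133, 778),
}


def split_report(splits):
    """Sum the three partitions per dataset and compare with the listed total."""
    report = {}
    for name, (total, train, val, test) in splits.items():
        partition_sum = train + val + test
        report[name] = {
            "sum": partition_sum,
            "matches_total": partition_sum == total,
            # Share of the partitioned lines used for training.
            "train_fraction": train / partition_sum,
        }
    return report


if __name__ == "__main__":
    for name, row in split_report(SPLITS).items():
        print(name, row["sum"], row["matches_total"], round(row["train_fraction"], 3))
```

With the figures as printed, the partitions of five datasets add up to the listed line totals, which supports the line-count reading. The row for collection 114 adds up to 1005 rather than the listed 1051, so one of the printed values in that row may be inexact. The four Greek datasets follow a split of roughly 60/10/30.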
Dataset | μDoc (Deslanting) | μDoc (no Deslanting) |
---|---|---|
53 | 6.77/30.09 | 7.19/31.22 |
79 | 6.51/28.51 | 6.73/28.21 |
114 | 7.71/34.30 | 8.32/36.44 |
Eparchos | 4.53/20.03 | 4.16/19.67 |
HTR System | Parameters | Time per Iteration (ms) |
---|---|---|
Puigcerver | 9,982,713 | 12.8 |
Flor | 994,057 | 13.7 |
μDoc | 2,246,137 | 8.0 |
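To make the comparison in the table above easier to read, the following sketch expresses each system's parameter count and time per iteration relative to a chosen reference system. The function name `relative_to` is hypothetical and the values are copied from the table; nothing here is measured independently.

```python
# (parameters, time per iteration in ms), copied from the table above.
SYSTEMS = {
    "Puigcerver": (9_982_713, 12.8),
    "Flor": (994_057, 13.7),
    "μDoc": (2_246_137, 8.0),
}


def relative_to(reference, systems):
    """Return (parameter ratio, time ratio) of every system versus the reference."""
    ref_params, ref_ms = systems[reference]
    return {
        name: (params / ref_params, ms / ref_ms)
        for name, (params, ms) in systems.items()
    }


if __name__ == "__main__":
    for name, (param_ratio, time_ratio) in relative_to("μDoc", SYSTEMS).items():
        print(f"{name}: {param_ratio:.2f}x parameters, {time_ratio:.2f}x time per iteration")
```

Read this way, the table lists μDoc with the lowest time per iteration of the three systems, while Flor has the fewest parameters.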
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tsochatzidis, L.; Symeonidis, S.; Papazoglou, A.; Pratikakis, I. HTR for Greek Historical Handwritten Documents. J. Imaging 2021, 7, 260. https://doi.org/10.3390/jimaging7120260