DOI: 10.1145/3469096.3474938
short-paper

Evaluating deep neural networks for image document enhancement

Published: 16 August 2021

Abstract

This work evaluates six state-of-the-art deep neural network (DNN) architectures applied to the problem of enhancing camera-captured document images. The results from each network were evaluated both qualitatively (i.e., by manual visual inspection) and quantitatively (i.e., using Image Quality Assessment (IQA) metrics), and were also compared with an existing approach based on traditional computer vision techniques. The best-performing architectures generally produced good enhancement compared to the existing algorithm, showing that it is possible to use DNNs for document image enhancement. Furthermore, the best-performing architectures could serve as a baseline for future investigations on document enhancement using deep learning techniques. The main contributions of this paper are a baseline of deep learning techniques that can be further improved to provide better results, and an evaluation methodology that uses IQA metrics to quantitatively compare the images produced by the neural networks against a ground truth.
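
As a rough illustration of what a quantitative, full-reference comparison against a ground truth can look like, the sketch below scores candidate images with PSNR and ranks them. PSNR is an assumption chosen only because it is a familiar full-reference metric; the abstract does not say which IQA metrics the paper uses. The function names (`mse`, `psnr`, `rank_models`) and the toy 2x2 images are hypothetical.

```python
import math


def mse(reference, candidate):
    """Mean squared error between two grayscale images given as lists of rows."""
    if len(reference) != len(candidate) or any(
        len(r) != len(c) for r, c in zip(reference, candidate)
    ):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for ref_row, cand_row in zip(reference, candidate):
        for r, c in zip(ref_row, cand_row):
            total += (r - c) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(reference, candidate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def rank_models(ground_truth, outputs):
    """Score each model output against the ground truth, best score first.

    `outputs` maps a (hypothetical) model name to its enhanced image.
    """
    scores = {name: psnr(ground_truth, img) for name, img in outputs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy 2x2 images, purely illustrative; not data from the paper.
    ground_truth = [[0, 255], [255, 0]]
    outputs = {
        "model_a": [[10, 255], [255, 0]],
        "model_b": [[40, 200], [255, 0]],
    }
    for name, score in rank_models(ground_truth, outputs):
        print(f"{name}: {score:.2f} dB")
```

In practice a perceptual or structural metric would likely complement a pixel-wise score such as PSNR, since pixel error alone does not always track how readable an enhanced document is.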

Supplementary Material

PDF File (a24-kirsten-supp.pdf)
Supplemental material.

Cited By

  • (2023) Handling complex representations in visual modeling tools for MDSD/DSM by means of code generator languages. Journal of Computer Languages 75 (101208). DOI: 10.1016/j.cola.2023.101208. Online publication date: Jun-2023
  • (2022) An Ensembled Spatial Enhancement Method for Image Enhancement in Healthcare. Journal of Healthcare Engineering 2022 (1-12). DOI: 10.1155/2022/9660820. Online publication date: 4-Jan-2022

Published In

DocEng '21: Proceedings of the 21st ACM Symposium on Document Engineering
August 2021
178 pages
ISBN: 9781450385961
DOI: 10.1145/3469096

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep learning
  2. document analysis
  3. image enhancement

Qualifiers

  • Short-paper

Conference

DocEng '21
DocEng '21: ACM Symposium on Document Engineering 2021
August 24 - 27, 2021
Limerick, Ireland

Acceptance Rates

Overall Acceptance Rate 194 of 564 submissions, 34%
