OntoHuman: Ontology-Based Information Extraction Tools with Human-in-the-Loop Interaction

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13492))

Included in the following conference series:

International Conference on Cooperative Design, Visualization and Engineering

Abstract

This paper presents OntoHuman, a toolchain for involving humans in the process of automatic information extraction and ontology enhancement. The Document Semantic Annotation Tool (DSAT) [13], a user interface of OntoHuman, automatically extracts information in the form of key-value-unit tuples from PDF documents based on ontologies. It also allows users to provide feedback that improves the ontologies used. Although the ontology improves information extraction, our use cases were previously limited to the domain of space engineering. OntoHuman now addresses this shortcoming by allowing users to upload their own customized ontologies. This extends its use to various domains and enables the shared knowledge to be used cooperatively. We also display the ontologies in a node-link representation so that they are easier to understand. Another major improvement in OntoHuman is the extraction of graph data points, which is still missing from existing information extraction tools. OntoHuman can be applied to documents from any engineering domain and makes working with ontologies intuitive and collaborative for users.
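
To make the idea of ontology-driven key-value-unit extraction with user feedback more concrete, the following is a minimal sketch and not the DSAT implementation. It assumes the text has already been pulled out of the PDF, and it models the ontology as a plain dictionary of labels and units per concept. All names in it (`ONTOLOGY`, `Extraction`, `extract_tuples`, `add_label`) are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple


class Extraction(NamedTuple):
    key: str      # ontology concept the match was mapped to
    value: float  # numeric value found in the text
    unit: str     # unit string as it appears in the text


# Hypothetical mini-ontology: concept -> known labels and accepted units.
ONTOLOGY = {
    "mass": {"labels": ["mass", "weight"], "units": ["kg", "g"]},
    "supply voltage": {"labels": ["supply voltage", "input voltage"],
                       "units": ["V", "mV"]},
}


def _alternation(terms):
    # Longest terms first so that "mV" is tried before "V".
    ordered = sorted(terms, key=len, reverse=True)
    return "|".join(re.escape(term) for term in ordered)


def extract_tuples(text, ontology=ONTOLOGY):
    """Find '<label> [:=] <number> <unit>' patterns for each concept."""
    results = []
    for concept, entry in ontology.items():
        pattern = re.compile(
            rf"\b({_alternation(entry['labels'])})\b\s*[:=]?\s*"
            rf"(-?\d+(?:\.\d+)?)\s*({_alternation(entry['units'])})\b",
            re.IGNORECASE,
        )
        for match in pattern.finditer(text):
            results.append(
                Extraction(concept, float(match.group(2)), match.group(3))
            )
    return results


def add_label(ontology, concept, new_label):
    """Feedback step: return a copy of the ontology with one more label."""
    updated = {
        name: {"labels": list(entry["labels"]), "units": list(entry["units"])}
        for name, entry in ontology.items()
    }
    updated[concept]["labels"].append(new_label)
    return updated


if __name__ == "__main__":
    sheet = "Weight: 1.5 kg\nSupply voltage 28 V\nNet wt 2 kg"
    print(extract_tuples(sheet))
    # A user notices that "Net wt" was missed and teaches it to the ontology.
    print(extract_tuples(sheet, add_label(ONTOLOGY, "mass", "net wt")))
```

In this sketch the feedback step only adds a synonym to an existing concept. A real toolchain would work on an ontology file rather than a dictionary, and would also need to handle tables, layout, and graph data points, none of which are modeled here.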

References

1. Adnan, K., Akbar, R.: Limitations of information extraction methods and techniques for heterogeneous unstructured big data. Int. J. Eng. Bus. Manag. 11 (2019). https://doi.org/10.1177/1847979019890771
2. Anikin, A., Litovkin, D., Kultsova, M., Sarkisova, E., Petrova, T.: Ontology visualization: approaches and software tools for visual representation of large ontologies in learning. In: Kravets, A., Shcherbakov, M., Kultsova, M., Groumpos, P. (eds.) CIT&DS 2017. CCIS, vol. 754, pp. 133–149. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65551-2_10
3. Classifying text with AWS Textract. https://www.bakertilly.com/insights/classifying-text-with-aws-textract. Accessed 8 Apr 2022
4. Buey, M.G., Garrido, A.L., Bobed, C., Ilarri, S.: The AIS project: boosting information extraction from legal documents by using ontologies. In: ICAART (2016)
5. Camelot: PDF Table Extraction for Humans. https://camelot-py.readthedocs.io/en/master/. Accessed 8 Apr 2022
6. ConTrOn. Contron - spacecraft parts ontology 1.2, May 2020
7. Decatur, D., Krishnan, S.: Vizextract: automatic relation extraction from data visualizations. CoRR abs/2112.03485 (2021)
8. Dudáš, M., Lohmann, S., Svátek, V., Pavlov, D.: Ontology visualization methods and tools: a survey of the state of the art. Knowl. Eng. Rev. 33, e10 (2018)
9. Jusoh, S., Awajan, A., Obeid, N.: The use of ontology in clinical information extraction. J. Phys. Conf. Ser. 1529(5), 052083 (2020)
10. Kaló, A.Z., Sipos, M.L.: Key-value pair searching system via tesseract OCR and post processing. In: 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI), pp. 000461–000464 (2021)
11. Konys, A.: Towards knowledge handling in ontology-based information extraction systems. Procedia Comput. Sci. 126, 2208–2218 (2018). Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 22nd International Conference, KES-2018, Belgrade, Serbia
12. Luo, J., Li, Z., Wang, J., Lin, C.-Y.: Chartocr: data extraction from charts images via a deep hybrid framework. In: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1916–1924 (2021)
13. Opasjumruskit, K., Peters, D., Schindler, S.: DSAT: ontology-based information extraction on technical data sheets. In: SEMWEB (2020)
14. How to extract data out of a PDF, February 2021. https://academy.datawrapper.de/article/135-how-to-extract-data-out-of-pdfs
15. PDFMiner - a python package for extracting information from PDF documents. https://pdfminersix.readthedocs.io/en/latest/. Accessed 8 Apr 2022
16. Peters, D., Fischer, P.M., Schäfer, P.M., Opasjumruskit, K., Gerndt, A.: Digital availability of product information for collaborative engineering of spacecraft. In: Luo, Y. (ed.) CDVE 2019. LNCS, vol. 11792, pp. 74–83. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30949-7_9
17. Rizvi, S.T.R., Mercier, D., Agne, S., Erkel, S., Dengel, A., Ahmed, S.: Ontology-based information extraction from technical documents. In: Proceedings of the 10th International Conference on Agents and Artificial Intelligence. SCITEPRESS - Science and Technology Publications (2018)
18. Tesseract Open Source OCR Engine. https://tesseract-ocr.github.io/. Accessed 13 Apr 2022
19. Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Commun. ACM 57(10), 78–85 (2014)
20. Wang, Z., Zhan, M., Liu, X., Liang, D.: Docstruct: a multimodal method to extract hierarchy structure in document for general form understanding. arXiv:abs/2010.11685 (2020)

Author information

Authors and Affiliations

German Aerospace Center (DLR), Institute of Data Science, Jena, Germany
Kobkaew Opasjumruskit, Sarah Böning, Sirko Schindler & Diana Peters

Corresponding author

Correspondence to Kobkaew Opasjumruskit.

Editor information

Editors and Affiliations

University of Balearic Islands, Palma de Mallorca, Spain
Yuhua Luo

Cite this paper

Opasjumruskit, K., Böning, S., Schindler, S., Peters, D. (2022). OntoHuman: Ontology-Based Information Extraction Tools with Human-in-the-Loop Interaction. In: Luo, Y. (eds) Cooperative Design, Visualization, and Engineering. CDVE 2022. Lecture Notes in Computer Science, vol 13492. Springer, Cham. https://doi.org/10.1007/978-3-031-16538-2_7

DOI: https://doi.org/10.1007/978-3-031-16538-2_7
Published: 20 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16537-5
Online ISBN: 978-3-031-16538-2
eBook Packages: Computer Science, Computer Science (R0)
